  1. 30 Jul, 2012 8 commits
    • nfs: skip commit in releasepage if we're freeing memory for fs-related reasons · 5cf02d09
      Jeff Layton authored
      We've had some reports of a deadlock where rpciod ends up with a stack
      trace like this:
      
          PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
           #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
           #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
           #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
           #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
           #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
           #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
           #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
           #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
           #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
           #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
          #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
          #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
          #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
          #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
          #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
          #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
          #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
          #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
          #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
          #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
          #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
          #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
          #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
          #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
          #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
          #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca
      
      rpciod is trying to allocate memory for a new socket to talk to the
      server. The VM ends up calling ->releasepage to get more memory, and it
      tries to do a blocking commit. That commit can't succeed, however, without
      a connected socket, so we deadlock.
      
      Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
      socket allocation, and having nfs_release_page check for that flag when
      deciding whether to do a commit call. Also, set PF_FSTRANS
      unconditionally in rpc_async_schedule since that function can also do
      allocations sometimes.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
    • sunrpc: clarify comments on rpc_make_runnable · 506026c3
      Jeff Layton authored
      rpc_make_runnable is not generally called with the queue lock held, unless
      it's waking up a task that has been sitting on a waitqueue. This is safe
      when the task has not entered the FSM yet, but the comments don't really
      spell this out.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • pnfsblock: bail out partial page IO · 159e0561
      Peng Tao authored
      The current block layout driver read/write code assumes page-aligned
      IO in many places. Add a check to validate that assumption.
      Otherwise data corruption can occur, for example when an application
      does open(O_WRONLY) followed by a page-unaligned write.
      Signed-off-by: Peng Tao <tao.peng@emc.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • nfs: fix fl_type tests in NFSv4 code · f44106e2
      Jeff Layton authored
      fl_type is not a bitmap.
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
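
      To illustrate why "not a bitmap" matters: a lock type holds exactly one
      of F_RDLCK, F_WRLCK or F_UNLCK, so it has to be compared rather than
      masked. On many Linux ABIs F_RDLCK is 0, in which case a masked test
      such as (type & F_RDLCK) can never be true. The helper names below are
      hypothetical and the snippet uses the userspace <fcntl.h> constants,
      not the kernel's struct file_lock.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* The lock type is an enumerated value: compare it, do not mask it. */
static bool sketch_is_read_lock(short type)
{
    return type == F_RDLCK;
}

static bool sketch_is_write_lock(short type)
{
    return type == F_WRLCK;
}

static bool sketch_is_unlock(short type)
{
    return type == F_UNLCK;
}
```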
    • NFS: fix pnfs regression with directio writes · c95908e4
      Fred Isaman authored
      Commit 57208fa7 "NFS: Create an write_pageio_init() function"
      did not modify the calls in direct.c, preventing direct io from
      using pnfs.  This reintroduces that capability.
      Signed-off-by: Fred Isaman <iisaman@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: fix pnfs regression with directio reads · 59948db3
      Fred Isaman authored
      Commit 1abb5088 "NFS: Create an read_pageio_init() function"
      did not modify the call in direct.c, preventing direct io from
      using pnfs.  This reintroduces that capability.
      Signed-off-by: Fred Isaman <iisaman@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • sunrpc: clnt: Add missing braces · cac5d07e
      Joe Perches authored
      Add a missing set of braces that commit 4e0038b6
      ("SUNRPC: Move clnt->cl_server into struct rpc_xprt")
      forgot.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org [>= 3.4]
    • nfs: fix stub return type warnings · 0add3e85
      Randy Dunlap authored
      Fix numerous repeated warnings by making the stub function
      void instead of non-void:
      
      fs/nfs/nfs4_fs.h: In function 'nfs4_unregister_sysctl':
      fs/nfs/nfs4_fs.h:385:1: warning: no return statement in function returning non-void
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  2. 17 Jul, 2012 14 commits
  3. 16 Jul, 2012 15 commits
    • NFS: Clean up nfs4_proc_setclientid() and friends · 6bbb4ae8
      Chuck Lever authored
      Add documenting comments and appropriate debugging messages.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: Treat NFS4ERR_CLID_INUSE as a fatal error · de734831
      Chuck Lever authored
      For NFSv4 minor version 0, currently the cl_id_uniquifier allows the
      Linux client to generate a unique nfs_client_id4 string whenever a
      server replies with NFS4ERR_CLID_INUSE.
      
      This implementation seems to be based on a flawed reading of RFC
      3530.  NFS4ERR_CLID_INUSE actually means that the client has presented
      this nfs_client_id4 string with a different principal at some time in
      the past, and that lease is still in use on the server.
      
      For a Linux client this might be rather difficult to achieve: the
      authentication flavor is named right in the nfs_client_id4.id
      string.  If we change flavors, we change strings automatically.
      
      So, practically speaking, NFS4ERR_CLID_INUSE means there is some other
      client using our string.  There is not much that can be done to
      recover automatically.  Let's make it a permanent error.
      
      Remove the recovery logic in nfs4_proc_setclientid(), and remove the
      cl_id_uniquifier field from the nfs_client data structure.  And,
      remove the authentication flavor from the nfs_client_id4 string.
      
      Keeping the authentication flavor in the nfs_client_id4.id string
      means that we could have a separate lease for each authentication
      flavor used by mounts on the client.  But we want just one lease for
      all the mounts on this client.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: When state recovery fails, waiting tasks should exit · 46a87b8a
      Chuck Lever authored
      NFSv4 state recovery is not always successful.  Failure is signalled
      by setting the nfs_client.cl_cons_state to a negative (errno) value,
      then waking waiters.
      
      Currently this can happen only during mount processing.  I'm about to
      add an explicit case where state recovery failure during normal
      operation should force all NFS requests waiting on that state recovery
      to exit.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: Add rpcauth_list_flavors() · 6a1a1e34
      Chuck Lever authored
      The gss_mech_list_pseudoflavors() function provides a list of
      currently registered GSS pseudoflavors.  This list does not include
      any non-GSS flavors that have been registered with the RPC client.
      nfs4_find_root_sec() currently adds these extra flavors by hand.
      
      Instead, nfs4_find_root_sec() should be looking at the set of flavors
      that have been explicitly registered via rpcauth_register().  And,
      other areas of code will soon need the same kind of list that
      contains all flavors the kernel currently knows about (see below).
      
      Rather than cloning the open-coded logic in nfs4_find_root_sec() to
      those new places, introduce a generic RPC function that generates a
      full list of registered auth flavors and pseudoflavors.
      
      A new rpc_authops method is added that lists a flavor's
      pseudoflavors, if it has any.  I encountered an interesting module
      loader loop when I tried to get the RPC client to invoke
      gss_mech_list_pseudoflavors() by name.
      
      This patch is a pre-requisite for server trunking discovery, and a
      pre-requisite for fixing up the in-kernel mount client to do better
      automatic security flavor selection.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: nfs_getaclargs.acl_len is a size_t · 56d08fef
      Chuck Lever authored
      Squelch compiler warnings:
      
      fs/nfs/nfs4proc.c: In function ‘__nfs4_get_acl_uncached’:
      fs/nfs/nfs4proc.c:3811:14: warning: comparison between signed and
      	unsigned integer expressions [-Wsign-compare]
      fs/nfs/nfs4proc.c:3818:15: warning: comparison between signed and
      	unsigned integer expressions [-Wsign-compare]
      
      Introduced by commit bf118a34 "NFSv4: include bitmap in nfsv4 get
      acl data", Dec 7, 2011.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
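
      The class of warning being squelched here can be shown in a few lines.
      When one side of a comparison is a size_t and the other is a signed
      int, the signed value is converted to unsigned first, so a negative
      length looks enormous. The helper names below are hypothetical and are
      not taken from fs/nfs/nfs4proc.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Both operands are size_t, so there is nothing for -Wsign-compare to flag. */
static bool sketch_fits(size_t buf_len, size_t needed)
{
    return needed <= buf_len;
}

/*
 * For contrast: a signed length compared against a size_t. The cast only
 * spells out the conversion the compiler would perform implicitly; a
 * negative value becomes a very large unsigned one.
 */
static bool sketch_fits_signed(size_t buf_len, int needed)
{
    return (size_t)needed <= buf_len;
}
```

      Keeping the length field and the values it is compared against in the
      same unsigned type avoids both the warning and the surprising result
      for negative inputs.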
    • NFS: Clean up TEST_STATEID and FREE_STATEID error reporting · 38527b15
      Chuck Lever authored
      As a finishing touch, add appropriate documenting comments and some
      debugging printk's.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: Clean up nfs41_check_expired_stateid() · 3e60ffdd
      Chuck Lever authored
      Clean up: Instead of open-coded flag manipulation, use test_bit() and
      clear_bit() just like all other accessors of the state->flag field.
      This also eliminates several unnecessary implicit integer type
      conversions.
      
      To make it absolutely clear what is going on, a number of comments
      are introduced.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
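
      As a rough illustration of the accessor style this cleanup moves to,
      here are userspace stand-ins for bit helpers that take a bit number
      rather than a mask. They are a sketch only: the kernel's set_bit() and
      clear_bit() are atomic and architecture-specific, and the names and
      bit numbers below are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical bit numbers, not the kernel's values. */
enum {
    SKETCH_DELEGATED_STATE = 0,
    SKETCH_O_RDWR_STATE    = 3
};

/* Non-atomic stand-in: report whether bit `nr` is set in the bitmap. */
static bool sketch_test_bit(unsigned int nr, const unsigned long *addr)
{
    assert(addr != NULL);
    return (addr[nr / SKETCH_BITS_PER_LONG] >> (nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Non-atomic stand-in: set bit `nr` in the bitmap. */
static void sketch_set_bit(unsigned int nr, unsigned long *addr)
{
    assert(addr != NULL);
    addr[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Non-atomic stand-in: clear bit `nr` in the bitmap. */
static void sketch_clear_bit(unsigned int nr, unsigned long *addr)
{
    assert(addr != NULL);
    addr[nr / SKETCH_BITS_PER_LONG] &= ~(1UL << (nr % SKETCH_BITS_PER_LONG));
}
```

      Callers pass a bit number and a pointer to the flags word, which keeps
      the mask arithmetic and the integer types in one place.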
    • NFS: State reclaim clears OPEN and LOCK state · eb64cf96
      Chuck Lever authored
      The "state->flags & flags" test in nfs41_check_expired_stateid()
      allows the state manager to squelch a TEST_STATEID operation when
      it is known for sure that a state ID is no longer valid.  If the
      lease was purged, for example, the client already knows that state
      ID is now defunct.
      
      But open recovery is still needed for that inode.
      
      To force a call to nfs4_open_expired(), change the default return
      value for nfs41_check_expired_stateid() to force open recovery, and
      the default return value for nfs41_check_locks() to force lock
      recovery, if the requested flags are clear.  Fix suggested by Bryan
      Schumaker.
      
      Also, the presence of a delegation state ID must not prevent normal
      open recovery.  The delegation state ID must be cleared if it was
      revoked, but once cleared I don't think its presence or absence has
      any bearing on whether open recovery is still needed.  So the logic
      is adjusted to ignore the TEST_STATEID result for the delegation
      state ID.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: Don't free a state ID the server does not recognize · 89af2739
      Chuck Lever authored
      The result of a TEST_STATEID operation can indicate a few different
      things:
      
        o If NFS_OK is returned, then the client can continue using the
          state ID under test, and skip recovery.
      
        o RFC 5661 says that if the state ID was revoked, then the client
          must perform an explicit FREE_STATEID before trying to re-open.
      
        o If the server doesn't recognize the state ID at all, then no
          FREE_STATEID is needed, and the client can immediately continue
          with open recovery.
      
      Let's err on the side of caution: if the server clearly tells us the
      state ID is unknown, we skip the FREE_STATEID.  For any other error,
      we issue a FREE_STATEID.  Sometimes that FREE_STATEID will be
      unnecessary, but leaving unused state IDs on the server needlessly
      ties up resources.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
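
      The three outcomes listed in this commit message can be written down
      as a small decision function. This is a sketch under one stated
      assumption: that "the server doesn't recognize the state ID" is
      reported as NFS4ERR_BAD_STATEID, seen by the caller as a negative
      value. The enum and function names are hypothetical.

```c
#include <assert.h>

#define SKETCH_NFS_OK               0
#define SKETCH_NFS4ERR_EXPIRED      10011  /* example of "any other error" */
#define SKETCH_NFS4ERR_BAD_STATEID  10025  /* assumed "unknown state ID" code */

enum sketch_action {
    SKETCH_KEEP_USING,        /* state ID still valid, skip recovery */
    SKETCH_RECOVER,           /* go straight to open recovery */
    SKETCH_FREE_THEN_RECOVER  /* send FREE_STATEID first, then recover */
};

/* `status` follows the negative-error convention: 0 or -NFS4ERR_xxx. */
static enum sketch_action sketch_after_test_stateid(int status)
{
    assert(status <= 0);

    if (status == SKETCH_NFS_OK)
        return SKETCH_KEEP_USING;
    if (status == -SKETCH_NFS4ERR_BAD_STATEID)
        return SKETCH_RECOVER;        /* server has nothing to free */
    return SKETCH_FREE_THEN_RECOVER;  /* err on the side of caution */
}
```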
    • NFS: Fix up TEST_STATEID and FREE_STATEID return code handling · 377e507d
      Chuck Lever authored
      The TEST_STATEID and FREE_STATEID operations can return
      -NFS4ERR_BAD_STATEID, -NFS4ERR_OLD_STATEID, or -NFS4ERR_DEADSESSION.
      
      nfs41_{test,free}_stateid() should not pass these errors to
      nfs4_handle_exception() during state recovery, since that will
      recursively kick off state recovery again, resulting in a deadlock.
      
      In particular, when the TEST_STATEID operation returns NFS4_OK,
      res.status can contain one of these errors.  _nfs41_test_stateid()
      replaces NFS4_OK with the value in res.status, which is then returned
      to callers.
      
      But res.status is not passed through nfs4_stat_to_errno(), and thus is
      a positive NFS4ERR value.  Currently callers are only interested in
      !NFS4_OK, and nfs4_handle_exception() ignores positive values.
      
      Thus the res.status values are currently ignored by
      nfs4_handle_exception() and won't cause the deadlock above.  Thanks to
      this missing negative, it is only when these operations fail (which
      is very rare) that a deadlock can occur.
      
      Bryan agrees the original intent was to return res.status as a
      negative NFS4ERR value to callers of nfs41_test_stateid().
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
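
      The sign convention at issue here can be sketched as follows: the
      operation itself may succeed while the per-state-ID result carries a
      positive NFS4ERR code, and the intent described in this commit is for
      callers to see that code as a negative value. The struct and function
      below are hypothetical simplifications, not the kernel's
      _nfs41_test_stateid().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NFS4_OK              0
#define SKETCH_NFS4ERR_BAD_STATEID  10025

struct sketch_test_stateid_res {
    int status;  /* per-state-ID result as decoded: 0 or a positive NFS4ERR code */
};

/*
 * `op_status` is the result of the operation as a whole (0 or negative).
 * If the operation succeeded, report the per-state-ID result instead,
 * negated so it follows the same negative-error convention.
 */
static int sketch_test_stateid(int op_status,
                               const struct sketch_test_stateid_res *res)
{
    assert(res != NULL);

    if (op_status != SKETCH_NFS4_OK)
        return op_status;
    return -res->status;
}
```

      Once the value is negative, the caller must take care not to hand
      these particular errors to a generic exception handler during state
      recovery, which is the recursion the commit message warns about.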
    • NFSv4.1 do not send LAYOUTRETURN on emtpy plh_segs list · 293b3b06
      Andy Adamson authored
      mark_matching_lsegs_invalid() resets the mds_threshold counters and can
      dereference the layout hdr on an initially empty plh_segs list. It returns 0 both
      in the case of an initially empty list and in the case of a non-empty list that
      was cleared by calls to mark_lseg_invalid().
      
      Don't send a LAYOUTRETURN if the list was initially empty.
      Signed-off-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4.1 mark layout when already returned · 366d5052
      Andy Adamson authored
      When the file layout driver is fencing a DS, _pnfs_return_layout can be
      called multiple times per inode due to in-flight I/O referencing lsegs on its
      plh_segs list.
      
      Remember that LAYOUTRETURN has been called, and do not call it again.
      Allow LAYOUTRETURNs after a subsequent LAYOUTGET.
      Signed-off-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
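
      The "remember that it was returned" behaviour can be sketched with a
      single flag on the layout header. This is an illustrative model only:
      the struct, field and function names are hypothetical, and sending
      the LAYOUTRETURN is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_layout_hdr {
    bool returned;            /* a LAYOUTRETURN was already sent for this layout */
    int  layoutreturns_sent;  /* counts sends, for illustration only */
};

/* Send a LAYOUTRETURN at most once until a new layout is obtained. */
static bool sketch_return_layout(struct sketch_layout_hdr *lo)
{
    assert(lo != NULL);

    if (lo->returned)
        return false;          /* already returned: do not send again */
    lo->returned = true;
    lo->layoutreturns_sent++;  /* stands in for the actual LAYOUTRETURN */
    return true;
}

/* A later successful LAYOUTGET makes the layout returnable again. */
static void sketch_layoutget_done(struct sketch_layout_hdr *lo)
{
    assert(lo != NULL);
    lo->returned = false;
}
```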
    • NFSv4.1 return the LAYOUT for each file with failed DS connection I/O · 82c7c7a5
      Andy Adamson authored
      First mark the deviceid invalid to prevent any future use. Then fence all
      files involved in I/O to a DS with a connection error by sending a
      LAYOUTRETURN.
      Signed-off-by: Andy Adamson <andros@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • Merge commit '9249e17f' into nfs-for-3.6 · 8626e4a4
      Trond Myklebust authored
      Resolve conflicts with the VFS atomic open and sget changes.
      
      Conflicts:
      	fs/nfs/nfs4proc.c
  4. 14 Jul, 2012 3 commits