1. 04 Apr, 2009 29 commits
  2. 30 Mar, 2009 1 commit
  3. 29 Mar, 2009 1 commit
  4. 27 Mar, 2009 3 commits
  5. 18 Mar, 2009 6 commits
    • Chuck Lever's avatar
      SUNRPC: Clean up static inline functions in svc_xprt.h · 2795e53b
      Chuck Lever authored
      Clean up:  Enable the use of const arguments in higher level svc_ APIs
      by adding const to the arguments of the helper functions in svc_xprt.h
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      2795e53b
    • Sachin S. Prabhu's avatar
      Inconsistent setattr behaviour · 0953e620
      Sachin S. Prabhu authored
      There is an inconsistency seen in the behaviour of nfs compared to other local
      filesystems on linux when changing owner or group of a directory. If the
      directory has SUID/SGID flags set, on changing owner or group on the directory,
      the flags are stripped off on nfs. These flags are maintained on other
      filesystems such as ext3.
      
      To reproduce on a nfs share or local filesystem, run the following commands
      mkdir test; chmod +s+g test; chown user1 test; ls -ld test
      
      On the nfs share, the flags are stripped and the output seen is
      drwxr-xr-x 2 user1 root 4096 Feb 23  2009 test
      
      On other local filesystems(ex: ext3), the flags are not stripped and the output
      seen is
      drwsr-sr-x 2 user1 root 4096 Feb 23 13:57 test
      
      chown_common() called from sys_chown() will only strip the flags if the inode is
      not a directory.
      static int chown_common(struct dentry * dentry, uid_t user, gid_t group)
      {
      ..
              if (!S_ISDIR(inode->i_mode))
                      newattrs.ia_valid |=
                              ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
      ..
      }
      
      See: http://www.opengroup.org/onlinepubs/7990989775/xsh/chown.html
      
      "If the path argument refers to a regular file, the set-user-ID (S_ISUID) and
      set-group-ID (S_ISGID) bits of the file mode are cleared upon successful return
      from chown(), unless the call is made by a process with appropriate privileges,
      in which case it is implementation-dependent whether these bits are altered. If
      chown() is successfully invoked on a file that is not a regular file, these
      bits may be cleared. These bits are defined in <sys/stat.h>."
      
      The behaviour as it stands does not appear to violate POSIX.  However the
      actions performed are inconsistent when comparing ext3 and nfs.
      Signed-off-by: default avatarSachin Prabhu <sprabhu@redhat.com>
      Acked-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      0953e620
    • Olga Kornievskaia's avatar
      svcrpc: take advantage of tcp autotuning · 47a14ef1
      Olga Kornievskaia authored
      Allow the NFSv4 server to make use of TCP autotuning behaviour, which
      was previously disabled by setting the sk_userlocks variable.
      
      Set the receive buffers to be big enough to receive the whole RPC
      request, and set this for the listening socket, not the accept socket.
      
      Remove the code that readjusts the receive/send buffer sizes for the
      accepted socket. Previously this code was used to influence the TCP
      window management behaviour, which is no longer needed when autotuning
      is enabled.
      
      This can improve IO bandwidth on networks with high bandwidth-delay
      products, where a large tcp window is required.  It also simplifies
      performance tuning, since getting adequate tcp buffers previously
      required increasing the number of nfsd threads.
      Signed-off-by: default avatarOlga Kornievskaia <aglo@citi.umich.edu>
      Cc: Jim Rees <rees@umich.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      47a14ef1
    • J. Bruce Fields's avatar
      nfsd4: don't check ip address in setclientid · 026722c2
      J. Bruce Fields authored
      The spec allows clients to change ip address, so we shouldn't be
      requiring that setclientid always come from the same address.  For
      example, a client could reboot and get a new dhcpd address, but still
      present the same clientid to the server.  In that case the server should
      revoke the client's previous state and allow it to continue, instead of
      (as it currently does) returning a CLID_INUSE error.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      026722c2
    • Greg Banks's avatar
      knfsd: add file to export stats about nfsd pools · 03cf6c9f
      Greg Banks authored
      Add /proc/fs/nfsd/pool_stats to export to userspace various
      statistics about the operation of rpc server thread pools.
      
      This patch is based on a forward-ported version of
      knfsd-add-pool-thread-stats which has been shipping in the SGI
      "Enhanced NFS" product since 2006 and which was previously
      posted:
      
      http://article.gmane.org/gmane.linux.nfs/10375
      
      It has also been updated thus:
      
       * moved EXPORT_SYMBOL() to near the function it exports
       * made the new struct struct seq_operations const
       * used SEQ_START_TOKEN instead of ((void *)1)
       * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
         printk in svc_pool_stats_*()" by Harshula Jayasuriya.
       * merged fix from SGI PV 964001 "Crash reading pool_stats before
         nfsds are started".
      Signed-off-by: default avatarGreg Banks <gnb@sgi.com>
      Signed-off-by: default avatarHarshula Jayasuriya <harshula@sgi.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      03cf6c9f
    • Greg Banks's avatar
      knfsd: avoid overloading the CPU scheduler with enormous load averages · 59a252ff
      Greg Banks authored
      Avoid overloading the CPU scheduler with enormous load averages
      when handling high call-rate NFS loads.  When the knfsd bottom half
      is made aware of an incoming call by the socket layer, it tries to
      choose an nfsd thread and wake it up.  As long as there are idle
      threads, one will be woken up.
      
      If there are lot of nfsd threads (a sensible configuration when
      the server is disk-bound or is running an HSM), there will be many
      more nfsd threads than CPUs to run them.  Under a high call-rate
      low service-time workload, the result is that almost every nfsd is
      runnable, but only a handful are actually able to run.  This situation
      causes two significant problems:
      
      1. The CPU scheduler takes over 10% of each CPU, which is robbing
         the nfsd threads of valuable CPU time.
      
      2. At a high enough load, the nfsd threads starve userspace threads
         of CPU time, to the point where daemons like portmap and rpc.mountd
         do not schedule for tens of seconds at a time.  Clients attempting
         to mount an NFS filesystem timeout at the very first step (opening
         a TCP connection to portmap) because portmap cannot wake up from
         select() and call accept() in time.
      
      Disclaimer: these effects were observed on a SLES9 kernel, modern
      kernels' schedulers may behave more gracefully.
      
      The solution is simple: keep in each svc_pool a counter of the number
      of threads which have been woken but have not yet run, and do not wake
      any more if that count reaches an arbitrary small threshold.
      
      Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
      synthetic client threads simulating an rsync (i.e. recursive directory
      listing) workload reading from an i386 RH9 install image (161480
      regular files in 10841 directories) on the server.  That tree is small
      enough to fill in the server's RAM so no disk traffic was involved.
      This setup gives a sustained call rate in excess of 60000 calls/sec
      before being CPU-bound on the server.  The server was running 128 nfsds.
      
      Profiling showed schedule() taking 6.7% of every CPU, and __wake_up()
      taking 5.2%.  This patch drops those contributions to 3.0% and 2.2%.
      Load average was over 120 before the patch, and 20.9 after.
      
      This patch is a forward-ported version of knfsd-avoid-nfsd-overload
      which has been shipping in the SGI "Enhanced NFS" product since 2006.
      It has been posted before:
      
      http://article.gmane.org/gmane.linux.nfs/10374Signed-off-by: default avatarGreg Banks <gnb@sgi.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      59a252ff