1. 25 Nov, 2019 40 commits
    • cifs: Always update signing key of first channel · ff6b6f3f
      Paulo Alcantara (SUSE) authored
      Update signing key of first channel whenever generating the master
      signing/encryption/decryption keys rather than only in cifs_mount().
      
      This also fixes reconnect when re-establishing smb sessions to other
      servers.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: Fix retrieval of DFS referrals in cifs_mount() · 5bb30a4d
      Paulo Alcantara (SUSE) authored
      Make sure that DFS referrals are sent to newly resolved root targets,
      as in a multi-tier DFS setup.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Link: https://lkml.kernel.org/r/05aa2995-e85e-0ff4-d003-5bb08bd17a22@canonical.com
      Cc: stable@vger.kernel.org
      Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: Fix potential softlockups while refreshing DFS cache · 84a1f5b1
      Paulo Alcantara (SUSE) authored
      We used to skip reconnects on all SMB2_IOCTL commands due to SMB3+
      FSCTL_VALIDATE_NEGOTIATE_INFO - which made sense since we're still
      establishing an SMB session.
      
      However, when refresh_cache_worker() calls smb2_get_dfs_refer() and
      we're under reconnect, SMB2_ioctl() will not get a proper error
      status (e.g. -EHOSTDOWN in case we failed to reconnect) but an
      -EAGAIN from cifs_send_recv(), so refresh_cache_worker() loops
      forever.
      
      Fixes: e99c63e4 ("SMB3: Fix deadlock in validate negotiate hits reconnect")
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Suggested-by: Aurelien Aptel <aaptel@suse.com>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
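The looping behaviour described in the softlockup commit above can be illustrated outside the kernel. This is only a user-space sketch under stated assumptions, not the cifs code: `fake_server`, `fake_ioctl()` and `refresh_with_retry()` are hypothetical stand-ins, and the `max_tries` bound exists only so the example terminates. It shows why a caller that retries on -EAGAIN can only stop once the callee reports a definitive status such as -EHOSTDOWN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a server whose connection is down. */
struct fake_server {
	bool need_reconnect;   /* connection is down */
	bool skip_reconnect;   /* old behaviour: never attempt a reconnect */
};

/*
 * Hypothetical stand-in for an ioctl-style request.
 * If reconnect is skipped, the send path can only answer "try again";
 * if reconnect is attempted and fails, the caller gets a final error.
 */
static int fake_ioctl(const struct fake_server *srv)
{
	if (!srv->need_reconnect)
		return 0;
	if (srv->skip_reconnect)
		return -EAGAIN;     /* transient status, no definitive answer */
	return -EHOSTDOWN;          /* reconnect attempted and failed */
}

/*
 * Retry while the request reports -EAGAIN. Stores the last status in
 * *rc_out and returns the number of attempts made.
 */
static int refresh_with_retry(const struct fake_server *srv, int max_tries,
			      int *rc_out)
{
	int tries = 0;
	int rc;

	assert(max_tries > 0);
	do {
		rc = fake_ioctl(srv);
		tries++;
	} while (rc == -EAGAIN && tries < max_tries);

	*rc_out = rc;
	return tries;
}
```

With `skip_reconnect` set, the loop runs until the artificial bound is hit; without it, the first attempt already returns -EHOSTDOWN and the caller can give up.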
    • cifs: Fix lookup of root ses in DFS referral cache · df3df923
      Paulo Alcantara (SUSE) authored
      We don't care about module aliasing validation in
      cifs_compose_mount_options(..., is_smb3) when finding the root SMB
      session of a DFS namespace in order to refresh the DFS referral cache.
      
      The following issue has been observed when mounting with '-t smb3' and
      then specifying 'vers=2.0':
      
      ...
      Nov 08 15:27:08 tw kernel: address conversion returned 0 for FS0.WIN.LOCAL
      Nov 08 15:27:08 tw kernel: [kworke] ==> dns_query((null),FS0.WIN.LOCAL,13,(null))
      Nov 08 15:27:08 tw kernel: [kworke] call request_key(,FS0.WIN.LOCAL,)
      Nov 08 15:27:08 tw kernel: [kworke] ==> dns_resolver_cmp(FS0.WIN.LOCAL,FS0.WIN.LOCAL)
      Nov 08 15:27:08 tw kernel: [kworke] <== dns_resolver_cmp() = 1
      Nov 08 15:27:08 tw kernel: [kworke] <== dns_query() = 13
      Nov 08 15:27:08 tw kernel: fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: resolved: FS0.WIN.LOCAL to 192.168.30.26
      ===> Nov 08 15:27:08 tw kernel: CIFS VFS: vers=2.0 not permitted when mounting with smb3
      Nov 08 15:27:08 tw kernel: fs/cifs/dfs_cache.c: CIFS VFS: leaving refresh_tcon (xid = 26) rc = -22
      ...
      
      Fixes: 5072010c ("cifs: Fix DFS cache refresher for DFS links")
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: Fix use-after-free bug in cifs_reconnect() · 8354d88e
      Paulo Alcantara (SUSE) authored
      Ensure we grab an active reference in cifs superblock while doing
      failover to prevent automounts (DFS links) from expiring and then
      destroying the superblock pointer.
      
      This patch fixes the following KASAN report:
      
      [  464.301462] BUG: KASAN: use-after-free in cifs_reconnect+0x6ab/0x1350
      [  464.303052] Read of size 8 at addr ffff888155e580d0 by task cifsd/1107
      
      [  464.304682] CPU: 3 PID: 1107 Comm: cifsd Not tainted 5.4.0-rc4+ #13
      [  464.305552] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58-rebuilt.opensuse.org 04/01/2014
      [  464.307146] Call Trace:
      [  464.307875]  dump_stack+0x5b/0x90
      [  464.308631]  print_address_description.constprop.0+0x16/0x200
      [  464.309478]  ? cifs_reconnect+0x6ab/0x1350
      [  464.310253]  ? cifs_reconnect+0x6ab/0x1350
      [  464.311040]  __kasan_report.cold+0x1a/0x41
      [  464.311811]  ? cifs_reconnect+0x6ab/0x1350
      [  464.312563]  kasan_report+0xe/0x20
      [  464.313300]  cifs_reconnect+0x6ab/0x1350
      [  464.314062]  ? extract_hostname.part.0+0x90/0x90
      [  464.314829]  ? printk+0xad/0xde
      [  464.315525]  ? _raw_spin_lock+0x7c/0xd0
      [  464.316252]  ? _raw_read_lock_irq+0x40/0x40
      [  464.316961]  ? ___ratelimit+0xed/0x182
      [  464.317655]  cifs_readv_from_socket+0x289/0x3b0
      [  464.318386]  cifs_read_from_socket+0x98/0xd0
      [  464.319078]  ? cifs_readv_from_socket+0x3b0/0x3b0
      [  464.319782]  ? try_to_wake_up+0x43c/0xa90
      [  464.320463]  ? cifs_small_buf_get+0x4b/0x60
      [  464.321173]  ? allocate_buffers+0x98/0x1a0
      [  464.321856]  cifs_demultiplex_thread+0x218/0x14a0
      [  464.322558]  ? cifs_handle_standard+0x270/0x270
      [  464.323237]  ? __switch_to_asm+0x40/0x70
      [  464.323893]  ? __switch_to_asm+0x34/0x70
      [  464.324554]  ? __switch_to_asm+0x40/0x70
      [  464.325226]  ? __switch_to_asm+0x40/0x70
      [  464.325863]  ? __switch_to_asm+0x34/0x70
      [  464.326505]  ? __switch_to_asm+0x40/0x70
      [  464.327161]  ? __switch_to_asm+0x34/0x70
      [  464.327784]  ? finish_task_switch+0xa1/0x330
      [  464.328414]  ? __switch_to+0x363/0x640
      [  464.329044]  ? __schedule+0x575/0xaf0
      [  464.329655]  ? _raw_spin_lock_irqsave+0x82/0xe0
      [  464.330301]  kthread+0x1a3/0x1f0
      [  464.330884]  ? cifs_handle_standard+0x270/0x270
      [  464.331624]  ? kthread_create_on_node+0xd0/0xd0
      [  464.332347]  ret_from_fork+0x35/0x40
      
      [  464.333577] Allocated by task 1110:
      [  464.334381]  save_stack+0x1b/0x80
      [  464.335123]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [  464.335848]  cifs_smb3_do_mount+0xd4/0xb00
      [  464.336619]  legacy_get_tree+0x6b/0xa0
      [  464.337235]  vfs_get_tree+0x41/0x110
      [  464.337975]  fc_mount+0xa/0x40
      [  464.338557]  vfs_kern_mount.part.0+0x6c/0x80
      [  464.339227]  cifs_dfs_d_automount+0x336/0xd29
      [  464.339846]  follow_managed+0x1b1/0x450
      [  464.340449]  lookup_fast+0x231/0x4a0
      [  464.341039]  path_openat+0x240/0x1fd0
      [  464.341634]  do_filp_open+0x126/0x1c0
      [  464.342277]  do_sys_open+0x1eb/0x2c0
      [  464.342957]  do_syscall_64+0x5e/0x190
      [  464.343555]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      [  464.344772] Freed by task 0:
      [  464.345347]  save_stack+0x1b/0x80
      [  464.345966]  __kasan_slab_free+0x12c/0x170
      [  464.346576]  kfree+0xa6/0x270
      [  464.347211]  rcu_core+0x39c/0xc80
      [  464.347800]  __do_softirq+0x10d/0x3da
      
      [  464.348919] The buggy address belongs to the object at ffff888155e58000
                      which belongs to the cache kmalloc-256 of size 256
      [  464.350222] The buggy address is located 208 bytes inside of
                      256-byte region [ffff888155e58000, ffff888155e58100)
      [  464.351575] The buggy address belongs to the page:
      [  464.352333] page:ffffea0005579600 refcount:1 mapcount:0 mapping:ffff88815a803400 index:0x0 compound_mapcount: 0
      [  464.353583] flags: 0x200000000010200(slab|head)
      [  464.354209] raw: 0200000000010200 ffffea0005576200 0000000400000004 ffff88815a803400
      [  464.355353] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
      [  464.356458] page dumped because: kasan: bad access detected
      
      [  464.367005] Memory state around the buggy address:
      [  464.367787]  ffff888155e57f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  464.368877]  ffff888155e58000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  464.369967] >ffff888155e58080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  464.371111]                                                  ^
      [  464.371775]  ffff888155e58100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  464.372893]  ffff888155e58180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  464.373983] ==================================================================
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
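The idea behind the use-after-free fix above, pinning an object with an extra reference while it is in use, can be sketched in user space. This is a single-threaded illustration under stated assumptions, not the kernel implementation: `fake_sb` and its helpers are hypothetical, and the kernel uses atomic counters and its own superblock helpers instead of a plain `int`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a superblock-like object with an active count. */
struct fake_sb {
	int active;     /* number of active references */
	bool *freed;    /* set to true when the object is destroyed */
};

static struct fake_sb *fake_sb_new(bool *freed)
{
	struct fake_sb *sb = malloc(sizeof(*sb));

	assert(sb != NULL);
	sb->active = 1;          /* reference owned by the mount itself */
	sb->freed = freed;
	*freed = false;
	return sb;
}

/* Take an extra reference, e.g. for the duration of a failover. */
static void fake_sb_get(struct fake_sb *sb)
{
	assert(sb->active > 0);  /* only a live object may be pinned */
	sb->active++;
}

/* Drop one reference; the last put destroys the object. */
static void fake_sb_put(struct fake_sb *sb)
{
	assert(sb->active > 0);
	if (--sb->active == 0) {
		*sb->freed = true;
		free(sb);
	}
}
```

If the failover path calls `fake_sb_get()` first, an automount expiry that drops the mount's own reference in the meantime leaves the object alive until the failover path calls `fake_sb_put()`.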
    • cifs: dump channel info in DebugData · 85150929
      Aurelien Aptel authored
      * show server&TCP states for extra channels
      * mention if an interface has a channel connected to it
      
      Version three of the patch fixes a minor printk format
      issue pointed out by the kbuild robot.
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • smb3: dump in_send and num_waiters stats counters by default · 1ae9a5a5
      Steve French authored
      The number of requests in_send and the number of waiters on sendRecv
      are useful counters in various cases. Move them out of
      CONFIG_CIFS_STATS2 so they are on by default, especially with multichannel.
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
    • cifs: try harder to open new channels · 65a37a34
      Aurelien Aptel authored
      Previously we would only loop over the iface list once.
      This patch loops over it multiple times until all channels are
      opened. It will also try to reuse RSS ifaces.
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
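The "loop several times, reuse RSS interfaces" strategy from the commit above might look roughly like the following. This is a sketch under stated assumptions: `chan_iface`, `try_connect()` and `open_channels()` are hypothetical names, the simulated connection failures are invented for the example, and the `max_idle_passes` stop condition is an assumption, since the commit message does not say how the real loop terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical interface record; the fields are assumptions for this sketch. */
struct chan_iface {
	bool rss_capable;     /* may carry more than one channel */
	int fail_first_n;     /* simulate N failed connection attempts */
	int channels;         /* channels currently bound to this iface */
};

/* Simulated connect: fails until fail_first_n reaches zero. */
static bool try_connect(struct chan_iface *iface)
{
	if (iface->fail_first_n > 0) {
		iface->fail_first_n--;
		return false;
	}
	iface->channels++;
	return true;
}

/*
 * Walk the list repeatedly until `wanted` channels are open, or until
 * `max_idle_passes` consecutive passes open nothing. An interface that
 * already has a channel is reused only if it is RSS-capable.
 */
static int open_channels(struct chan_iface *ifaces, size_t n, int wanted,
			 int max_idle_passes)
{
	int opened = 0;
	int idle = 0;

	assert(ifaces != NULL || n == 0);
	while (opened < wanted && idle < max_idle_passes) {
		int before = opened;

		for (size_t i = 0; i < n && opened < wanted; i++) {
			if (ifaces[i].channels > 0 && !ifaces[i].rss_capable)
				continue;
			if (try_connect(&ifaces[i]))
				opened++;
		}
		idle = (opened == before) ? idle + 1 : 0;
	}
	return opened;
}
```

A single pass would stop at whatever the first walk managed to open; repeating the walk lets a temporarily failing interface succeed later and lets RSS-capable interfaces take additional channels.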
    • CIFS: Properly process SMB3 lease breaks · 9bd45408
      Pavel Shilovsky authored
      Currently we don't assume that a server may break a lease
      from RWH to RW, which causes us to set a wrong lease state
      on a file and thus to mistakenly flush data and byte-range
      locks and purge cached data on the client. This leads to
      performance degradation because subsequent IOs go directly
      to the server.
      
      Fix this by propagating new lease state and epoch values
      to the oplock break handler through the cifsFileInfo structure
      and removing the use of cifsInodeInfo flags for that. This
      avoids some races between several lease/oplock breaks
      using those flags in parallel.
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
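Why a break from RWH to RW should not trigger a flush or a cache purge can be seen by looking only at the bits that are lost. The sketch below is illustrative and not the cifs handler: the bit values follow the MS-SMB2 lease state flags, while `break_actions` and `lease_break_actions()` are hypothetical names, and the precondition that a break only removes bits is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Lease state bits as defined by MS-SMB2. */
#define LEASE_READ   0x01u
#define LEASE_HANDLE 0x02u
#define LEASE_WRITE  0x04u

/* What the client has to do after a lease break (illustrative names). */
struct break_actions {
	bool flush_dirty_data;     /* write caching was lost */
	bool purge_cached_data;    /* read caching was lost */
	bool drop_handle_caching;  /* handle caching was lost */
};

static struct break_actions lease_break_actions(unsigned int old_state,
						unsigned int new_state)
{
	/* Sketch assumption: a break only removes bits, never adds them. */
	assert((new_state & ~old_state) == 0);

	/* Bits that were granted before and are no longer granted. */
	unsigned int lost = old_state & ~new_state;
	struct break_actions a = {
		.flush_dirty_data    = (lost & LEASE_WRITE) != 0,
		.purge_cached_data   = (lost & LEASE_READ) != 0,
		.drop_handle_caching = (lost & LEASE_HANDLE) != 0,
	};

	return a;
}
```

For RWH to RW only the handle bit is lost, so neither a flush nor a purge is called for; treating that break as a loss of R or W is what costs performance.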
    • cifs: move cifsFileInfo_put logic into a work-queue · 32546a95
      Ronnie Sahlberg authored
      This patch moves the final part of the cifsFileInfo_put() logic where we
      need a write lock on lock_sem to be processed in a separate thread that
      holds no other locks.
      This is to prevent deadlocks like the one below:
      
      > there are 6 processes looping while trying to down_write
      > cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
      > cifs_new_fileinfo
      >
      > and there are 5 other processes which are blocked, several of them
      > waiting on either PG_writeback or PG_locked (which are both set), all
      > for the same page of the file
      >
      > 2 inode_lock() (inode->i_rwsem) for the file
      > 1 wait_on_page_writeback() for the page
      > 1 down_read(inode->i_rwsem) for the inode of the directory
      > 1 inode_lock()(inode->i_rwsem) for the inode of the directory
      > 1 __lock_page
      >
      >
      > so processes are blocked waiting on:
      >   page flags PG_locked and PG_writeback for one specific page
      >   inode->i_rwsem for the directory
      >   inode->i_rwsem for the file
      >   cifsInodeInfo->lock_sem
      >
      >
      >
      > here are the more gory details (let me know if I need to provide
      > anything more/better):
      >
      > [0 00:48:22.765] [UN]  PID: 8863   TASK: ffff8c691547c5c0  CPU: 3
      > COMMAND: "reopen_file"
      >  #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
      >  #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
      >  #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
      >  #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
      >  #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
      >  #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
      >  #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
      > * (I think legitimize_path is bogus)
      >
      > in path_openat
      >         } else {
      >                 const char *s = path_init(nd, flags);
      >                 while (!(error = link_path_walk(s, nd)) &&
      >                         (error = do_last(nd, file, op)) > 0) {  <<<<
      >
      > do_last:
      >         if (open_flag & O_CREAT)
      >                 inode_lock(dir->d_inode);  <<<<
      >         else
      > so it's trying to take inode->i_rwsem for the directory
      >
      >      DENTRY           INODE           SUPERBLK     TYPE PATH
      > ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR  /mnt/vm1_smb/
      > inode.i_rwsem is ffff8c691158efc0
      >
      > <struct rw_semaphore 0xffff8c691158efc0>:
      >         owner: <struct task_struct 0xffff8c6914275d00> (UN -   8856 -
      > reopen_file), counter: 0x0000000000000003
      >         waitlist: 2
      >         0xffff9965007e3c90     8863   reopen_file      UN 0  1:29:22.926
      >   RWSEM_WAITING_FOR_WRITE
      >         0xffff996500393e00     9802   ls               UN 0  1:17:26.700
      >   RWSEM_WAITING_FOR_READ
      >
      >
      > the owner of the inode.i_rwsem of the directory is:
      >
      > [0 00:00:00.109] [UN]  PID: 8856   TASK: ffff8c6914275d00  CPU: 3
      > COMMAND: "reopen_file"
      >  #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
      >  #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
      >  #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
      >  #3 [ffff99650065b940] msleep at ffffffff9af573a9
      >  #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
      >  #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
      >  #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
      >  #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
      >  #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
      >  #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
      > #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
      > #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
      > #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
      > #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
      > #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
      >
      > cifs_launder_page is for page 0xffffd1e2c07d2480
      >
      > crash> page.index,mapping,flags 0xffffd1e2c07d2480
      >       index = 0x8
      >       mapping = 0xffff8c68f3cd0db0
      >   flags = 0xfffffc0008095
      >
      >   PAGE-FLAG       BIT  VALUE
      >   PG_locked         0  0000001
      >   PG_uptodate       2  0000004
      >   PG_lru            4  0000010
      >   PG_waiters        7  0000080
      >   PG_writeback     15  0008000
      >
      >
      > inode is ffff8c68f3cd0c40
      > inode.i_rwsem is ffff8c68f3cd0ce0
      >      DENTRY           INODE           SUPERBLK     TYPE PATH
      > ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
      > /mnt/vm1_smb/testfile.8853
      >
      >
      > this process holds the inode->i_rwsem for the parent directory, is
      > laundering a page attached to the inode of the file it's opening, and in
      > _cifsFileInfo_put is trying to down_write the cifsInodeInfo->lock_sem
      > for the file itself.
      >
      >
      > <struct rw_semaphore 0xffff8c68f3cd0ce0>:
      >         owner: <struct task_struct 0xffff8c6914272e80> (UN -   8854 -
      > reopen_file), counter: 0x0000000000000003
      >         waitlist: 1
      >         0xffff9965005dfd80     8855   reopen_file      UN 0  1:29:22.912
      >   RWSEM_WAITING_FOR_WRITE
      >
      > this is the inode.i_rwsem for the file
      >
      > the owner:
      >
      > [0 00:48:22.739] [UN]  PID: 8854   TASK: ffff8c6914272e80  CPU: 2
      > COMMAND: "reopen_file"
      >  #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
      >  #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
      >  #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
      >  #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
      >  #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
      >  #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
      >  #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
      >  #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
      >  #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
      >  #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
      > #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
      > #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
      > #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
      > #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
      >
      > the process holds the inode->i_rwsem for the file to which it's writing,
      > and is trying to __lock_page for the same page as in the other processes
      >
      >
      > the other tasks:
      > [0 00:00:00.028] [UN]  PID: 8859   TASK: ffff8c6915479740  CPU: 2
      > COMMAND: "reopen_file"
      >  #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
      >  #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
      >  #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
      >  #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
      >  #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
      >  #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
      >  #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
      >  #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
      >  #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
      > #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
      >
      > this is opening the file, and is trying to down_write cinode->lock_sem
      >
      >
      > [0 00:00:00.041] [UN]  PID: 8860   TASK: ffff8c691547ae80  CPU: 2
      > COMMAND: "reopen_file"
      > [0 00:00:00.057] [UN]  PID: 8861   TASK: ffff8c6915478000  CPU: 3
      > COMMAND: "reopen_file"
      > [0 00:00:00.059] [UN]  PID: 8858   TASK: ffff8c6914271740  CPU: 2
      > COMMAND: "reopen_file"
      > [0 00:00:00.109] [UN]  PID: 8862   TASK: ffff8c691547dd00  CPU: 6
      > COMMAND: "reopen_file"
      >  #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
      >  #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
      >  #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
      >  #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
      >  #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
      >  #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
      >  #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
      >  #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
      >  #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
      >
      > closing the file, and trying to down_write cifsi->lock_sem
      >
      >
      > [0 00:48:22.839] [UN]  PID: 8857   TASK: ffff8c6914270000  CPU: 7
      > COMMAND: "reopen_file"
      >  #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
      >  #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
      >  #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
      >  #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
      >  #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
      >  #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
      >  #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
      >  #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
      >  #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
      >
      > in __filemap_fdatawait_range
      >                         wait_on_page_writeback(page);
      > for the same page of the file
      >
      >
      >
      > [0 00:48:22.718] [UN]  PID: 8855   TASK: ffff8c69142745c0  CPU: 7
      > COMMAND: "reopen_file"
      >  #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
      >  #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
      >  #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
      >  #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
      >  #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
      >  #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
      >  #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
      >  #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
      >
      >         inode_lock(inode);
      >
      >
      > and one 'ls' later on, to see whether the rest of the mount is available
      > (the test file is in the root, so we get blocked up on the directory
      > ->i_rwsem), so the entire mount is unavailable
      >
      > [0 00:36:26.473] [UN]  PID: 9802   TASK: ffff8c691436ae80  CPU: 4
      > COMMAND: "ls"
      >  #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
      >  #1 [ffff996500393db8] schedule at ffffffff9b6e64df
      >  #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
      >  #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
      >  #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
      >  #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
      >  #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
      >  #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
      >
      > in iterate_dir:
      >         if (shared)
      >                 res = down_read_killable(&inode->i_rwsem);  <<<<
      >         else
      >                 res = down_write_killable(&inode->i_rwsem);
      >
      Reported-by: Frank Sorenson <sorenson@redhat.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: try opening channels after mounting · d70e9fa5
      Aurelien Aptel authored
      After doing mount() successfully we call cifs_try_adding_channels()
      which will open as many channels as it can.
      
      Channels are closed when the master session is closed.
      
      The master connection becomes the first channel.
      
      ,-------------> global cifs_tcp_ses_list <-------------------------.
      |                                                                  |
      '- TCP_Server_Info  <-->  TCP_Server_Info  <-->  TCP_Server_Info <-'
            (master con)           (chan#1 con)         (chan#2 con)
            |      ^                    ^                    ^
            v      '--------------------|--------------------'
         cifs_ses                       |
         - chan_count = 3               |
         - chans[] ---------------------'
         - smb3signingkey[]
            (master signing key)
      
      Note how channel connections don't have sessions. That's because
      cifs_ses can only be part of one linked list (list_head are internal
      to the elements).
      
      For signing keys, each channel has its own signing key which must be
      used only after the channel has been bound. While it's binding it must
      use the master session signing key.
      
      For encryption keys, since channel connections do not have sessions
      attached we must now find matching session by looping over all sessions
      in smb2_get_enc_key().
      
      Each channel is opened like a regular server connection but at the
      session setup request step it must set the
      SMB2_SESSION_REQ_FLAG_BINDING flag and use the session id to bind to.
      
      Finally, while sending in compound_send_recv() for requests that
      aren't negprot, ses-setup or binding related, use a channel by cycling
      through the available ones (round-robin).
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
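The round-robin channel selection mentioned at the end of the commit above can be sketched as a rotating cursor over the channel array. This is a simplified, single-threaded illustration and not the compound_send_recv() code: `rr_ses`, `pick_channel()` and `MAX_CHANNELS` are hypothetical, the integer channel IDs stand in for per-channel server pointers, and a real implementation would need an atomic counter or locking.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHANNELS 16   /* arbitrary bound for this sketch */

/* Hypothetical, simplified session: only what round-robin needs. */
struct rr_ses {
	size_t chan_count;            /* number of usable channels */
	int chans[MAX_CHANNELS];      /* stand-ins for per-channel server ptrs */
	size_t next;                  /* rotating cursor */
};

/* Return the channel to use for the next request and advance the cursor. */
static int pick_channel(struct rr_ses *ses)
{
	assert(ses->chan_count > 0 && ses->chan_count <= MAX_CHANNELS);

	size_t idx = ses->next % ses->chan_count;

	ses->next = (idx + 1) % ses->chan_count;
	return ses->chans[idx];
}
```

With three channels, successive calls cycle through the first, second and third channel and then wrap around to the first again.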
    • CIFS: refactor cifs_get_inode_info() · b8f7442b
      Aurelien Aptel authored
      Make logic of cifs_get_inode_info() much clearer by moving code to sub
      functions and adding comments.
      
      Document the steps this function does.
      
      cifs_get_inode_info() gets and updates a file's inode metadata from its
      file path.
      
      * If caller already has raw info data from server they can pass it.
      * If inode already exists (just need to update) caller can pass it.
      
      Step 1: get raw data from server if none was passed
      Step 2: parse raw data into intermediate internal cifs_fattr struct
      Step 3: set fattr uniqueid which is later used for inode number. This
              can sometimes be done from raw data
      Step 4: tweak fattr according to mount options (file_mode, acl to mode
              bits, uid, gid, etc)
      Step 5: update or create inode from final fattr struct
      
      * add is_smb1_server() helper
      * add is_inode_cache_good() helper
      * move SMB1-backupcreds-getinfo-retry to separate func
        cifs_backup_query_path_info().
      * move set-uniqueid code to separate func cifs_set_fattr_ino()
      * don't clobber uniqueid from backup cred retry
      * fix some probable corner-case memleaks
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: switch servers depending on binding state · f6a6bf7c
      Aurelien Aptel authored
      Currently a lot of the code to initialize a connection & session uses
      the cifs_ses as input. But depending on whether we are opening a new
      session or a new channel we need to use different server pointers.
      
      Add a "binding" flag in cifs_ses and a helper function that returns
      the server ptr a session should use (only in the sess establishment
      code path).
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: add server param · f780bd3f
      Aurelien Aptel authored
      As we get down to the transport layer, plenty of functions are passed
      the session pointer and assume the transport to use is ses->server.
      
      Instead we modify those functions to pass (ses, server) so that we
      can decouple the session from the server.
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: add multichannel mount options and data structs · bcc88801
      Aurelien Aptel authored
      adds:
      - [no]multichannel to enable/disable multichannel
      - max_channels=N to control how many channels to create
      
      these options are then stored in the volume struct.
      
      - store channels and max_channels in cifs_ses
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: sort interface list by speed · 35adffed
      Aurelien Aptel authored
      New channels are going to be opened by walking the list sequentially,
      so by sorting it we will connect to the fastest interfaces first.
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
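Sorting the interface list so that a sequential walk reaches the fastest interfaces first can be sketched with the standard `qsort()`. This is a user-space illustration, not the kernel code: `speed_iface`, its fields and both helper functions are hypothetical, and the kernel sorts its own interface structures with its own helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical interface entry; only the speed matters here. */
struct speed_iface {
	uint64_t speed;   /* link speed in bits per second */
	int id;           /* illustrative identifier */
};

/* qsort comparator: higher speed sorts first. */
static int cmp_speed_desc(const void *a, const void *b)
{
	const struct speed_iface *x = a;
	const struct speed_iface *y = b;

	/* Compare instead of subtracting to avoid unsigned wrap-around. */
	if (x->speed > y->speed)
		return -1;
	if (x->speed < y->speed)
		return 1;
	return 0;
}

static void sort_ifaces_by_speed(struct speed_iface *ifaces, size_t n)
{
	if (n == 0)
		return;
	assert(ifaces != NULL);
	qsort(ifaces, n, sizeof(*ifaces), cmp_speed_desc);
}
```

After sorting, opening channels in list order connects to the 10 Gbit/s interface before the 1 Gbit/s one without any further bookkeeping.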
    • CIFS: Fix SMB2 oplock break processing · fa9c2362
      Pavel Shilovsky authored
      Even when mounting with a modern protocol version, the server may be
      configured without support for SMB2.1 leases, in which case the client
      uses SMB2 oplocks to optimize IO performance through local caching.
      
      However, there is a problem in oplock break handling that leads
      to missing a break notification on the client that has a file
      opened. It later causes big latencies for other clients that
      are trying to open the same file.
      
      The problem reproduces when there are multiple shares from the
      same server mounted on the client. The processing code tries to
      match persistent and volatile file ids from the break notification
      with an open file but it skips all shares besides the first one.
      Fix this by looking up in all shares belonging to the server that
      issued the oplock break.
      
      Cc: Stable <stable@vger.kernel.org>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
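The lookup described in the oplock break fix above, matching the file ids against opens on every share of the server rather than only the first, can be sketched with two nested loops. This is a flattened model under stated assumptions: `fake_open`, `fake_share`, `fake_srv` and `find_open()` are hypothetical, and the kernel walks its own session and tree-connection lists under the appropriate locks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened model: a server has shares, a share has opens. */
struct fake_open {
	uint64_t persistent_fid;
	uint64_t volatile_fid;
};

struct fake_share {
	const struct fake_open *opens;
	size_t nr_opens;
};

struct fake_srv {
	const struct fake_share *shares;
	size_t nr_shares;
};

/*
 * Find the open file an oplock break refers to by checking every share of
 * the server, not just the first one. Returns NULL if nothing matches.
 */
static const struct fake_open *find_open(const struct fake_srv *srv,
					 uint64_t persistent_fid,
					 uint64_t volatile_fid)
{
	assert(srv != NULL);
	for (size_t s = 0; s < srv->nr_shares; s++) {
		const struct fake_share *share = &srv->shares[s];

		for (size_t i = 0; i < share->nr_opens; i++) {
			const struct fake_open *f = &share->opens[i];

			if (f->persistent_fid == persistent_fid &&
			    f->volatile_fid == volatile_fid)
				return f;
		}
	}
	return NULL;
}
```

Stopping after the first share would return NULL for a file opened on the second share, which corresponds to the missed break notification the commit describes.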
    • cifs: don't use 'pre:' for MODULE_SOFTDEP · 3591bb83
      Ronnie Sahlberg authored
      It can cause
      to fail with
      modprobe: FATAL: Module <module> is builtin.
      
      RHBZ: 1767094
      Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: smbd: Return -EAGAIN when transport is reconnecting · 4357d45f
      Long Li authored
      During reconnecting, the transport may have already been destroyed and is in
      the process of being reconnected. In this case, return -EAGAIN to not fail and
      to retry this I/O.
      Signed-off-by: Long Li <longli@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: smbd: Only queue work for error recovery on memory registration · c21ce58e
      Long Li authored
      It's not necessary to queue an invalidated memory registration to the work
      queue, as all we need to do is unmap the SG and make it usable again. This
      can save CPU cycles in normal data paths, as memory registration errors are
      rare and normally only happen during reconnection.
      Signed-off-by: Long Li <longli@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve French <stfrench@microsoft.com>
      c21ce58e
    • Ronnie Sahlberg's avatar
      smb3: add debug messages for closing unmatched open · 87bc2376
      Ronnie Sahlberg authored
      Helps distinguish between an interrupted close and a truly
      unmatched open.
      Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      87bc2376
    • Pavel Shilovsky's avatar
      CIFS: Do not miss cancelled OPEN responses · 7b71843f
      Pavel Shilovsky authored
      When an OPEN command is cancelled we mark a mid as
      cancelled and let the demultiplex thread process it
      by closing an open handle. The problem is there is
      a race between a system call thread and the demultiplex
      thread and there may be a situation when the mid has
      been already processed before it is set as cancelled.
      
      Fix this by processing cancelled requests when mids
      are being destroyed, which means that there is only
      one thread referencing a particular mid. Also set
      mids as cancelled regardless of their state.
      
      Cc: Stable <stable@vger.kernel.org>
      Tested-by: Frank Sorenson <sorenson@redhat.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      7b71843f
    • Pavel Shilovsky's avatar
      CIFS: Fix NULL pointer dereference in mid callback · 86a7964b
      Pavel Shilovsky authored
      There is a race between a system call processing thread
      and the demultiplex thread when mid->resp_buf becomes NULL
      and later is being accessed to get credits. It happens when
      the 1st thread wakes up before a mid callback is called in
      the 2nd one but the mid state has already been set to
      MID_RESPONSE_RECEIVED. This causes a NULL pointer dereference
      in the mid callback.
      
      Fix this by saving credits from the response before we
      update the mid state and then using this value in the mid
      callback rather than accessing the response buffer.
      
      Cc: Stable <stable@vger.kernel.org>
      Fixes: ee258d79 ("CIFS: Move credit processing to mid callbacks for SMB3")
      Tested-by: Frank Sorenson <sorenson@redhat.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      86a7964b
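The ordering described in the NULL pointer dereference fix above (save the credits first, then publish the new state) can be sketched as follows. This is a single-threaded illustration only: the structures, the `credits_received` field name, and both functions are assumptions made for this example, and real concurrent code would additionally need proper locking or memory ordering.

```c
#include <assert.h>
#include <stddef.h>

/* MID_REQUEST_SUBMITTED is an assumed name for the initial state. */
enum mid_state { MID_REQUEST_SUBMITTED, MID_RESPONSE_RECEIVED };

/* Hypothetical response carrying the credits granted by the server. */
struct response {
    unsigned int credits;
};

struct mid {
    enum mid_state state;
    struct response *resp_buf;     /* may be cleared by another thread */
    unsigned int credits_received; /* copy that stays valid for the callback */
};

/* Receive path: copy the credits out before the state becomes visible. */
void mid_receive(struct mid *m, struct response *rsp)
{
    assert(m != NULL && rsp != NULL);

    m->resp_buf = rsp;
    m->credits_received = rsp->credits; /* save first */
    m->state = MID_RESPONSE_RECEIVED;   /* then update the state */
}

/* Callback path: use the saved value and never touch resp_buf. */
unsigned int mid_callback_credits(const struct mid *m)
{
    assert(m != NULL);
    return m->credits_received;
}
```

Because the callback reads only the saved copy, it keeps working even if `resp_buf` has already been set to NULL by the time it runs.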
    • Pavel Shilovsky's avatar
      CIFS: Close open handle after interrupted close · 9150c3ad
      Pavel Shilovsky authored
      If a Close command is interrupted before the request is sent
      to the server, the client ends up leaking an open file
      handle. This wastes server resources and can potentially
      block applications that try to remove the file or any
      directory containing this file.
      
      Fix this by putting the close command into a worker queue,
      so another thread retries it later.
      
      Cc: Stable <stable@vger.kernel.org>
      Tested-by: Frank Sorenson <sorenson@redhat.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      9150c3ad
    • Pavel Shilovsky's avatar
      CIFS: Respect O_SYNC and O_DIRECT flags during reconnect · 44805b0e
      Pavel Shilovsky authored
      Currently the client translates O_SYNC and O_DIRECT flags
      into corresponding SMB create options when opening a file.
      The problem is that on reconnect, when the file is being
      re-opened, the client doesn't set those flags, which causes
      the server to reject re-open requests because create options
      don't match. As a result, any subsequent system call against
      that open file fails until the share is re-mounted.
      
      Fix this by properly setting SMB create options when
      re-opening files after reconnects.
      
      Fixes: 1013e760 ("SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags")
      Cc: Stable <stable@vger.kernel.org>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      44805b0e
    • Steve French's avatar
      smb3: remove confusing dmesg when mounting with encryption ("seal") · 037d0507
      Steve French authored
      The smb2/smb3 message checking code was logging to dmesg when mounting
      with encryption ("seal") for compounded SMB3 requests.  When encrypted,
      the whole frame (including potentially multiple compounds) is read,
      so the length field is longer than in the non-encrypted case
      (where the length field matches the calculated length for
      the particular SMB3 request in the compound being validated).
      
      Avoids the warning on mount (with "seal"):
      
         "srv rsp padded more than expected. Length 384 not ..."
      Signed-off-by: Steve French <stfrench@microsoft.com>
      037d0507
    • Markus Elfring's avatar
      CIFS: Return directly after a failed build_path_from_dentry() in cifs_do_create() · 598b6c57
      Markus Elfring authored
      Return directly after a call of the function "build_path_from_dentry"
      failed at the beginning.
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      598b6c57
    • Markus Elfring's avatar
      CIFS: Use common error handling code in smb2_ioctl_query_info() · 2b1116bb
      Markus Elfring authored
      Move the same error code assignments so that such exception handling
      can be better reused at the end of this function.
      
      This issue was detected by using the Coccinelle software.
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      2b1116bb
    • Markus Elfring's avatar
      CIFS: Use memdup_user() rather than duplicating its implementation · cfaa1181
      Markus Elfring authored
      Reuse existing functionality from memdup_user() instead of keeping
      duplicate source code.
      
      Generated by: scripts/coccinelle/api/memdup_user.cocci
      
      Fixes: f5b05d62 ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      cfaa1181
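The idea behind the memdup_user() cleanup above is that an "allocate, then copy" sequence is folded into one helper so call sites do not repeat it. The kernel helper copies from user space and reports failure through an error pointer; the sketch below is only a userspace analogy, and `dup_bytes()` is a hypothetical name made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of a "duplicate this buffer" helper:
 * allocate len bytes, copy src into them, and return the copy.
 * Returns NULL if the allocation fails. The caller frees the result.
 */
void *dup_bytes(const void *src, size_t len)
{
    void *copy;

    assert(src != NULL && len > 0);

    copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len);
    return copy;
}
```

A call site then shrinks to a single call plus one error check, instead of carrying its own allocation, copy, and cleanup code.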
    • Long Li's avatar
      cifs: smbd: Return -ECONNABORTED when transport is not in connected state · acd4680e
      Long Li authored
      The transport should return this error so the upper layer will reconnect.
      Signed-off-by: Long Li <longli@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve French <stfrench@microsoft.com>
      acd4680e
    • Long Li's avatar
      cifs: smbd: Add messages on RDMA session destroy and reconnection · d63cdbae
      Long Li authored
      Log these activities to help production support.
      Signed-off-by: Long Li <longli@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve French <stfrench@microsoft.com>
      d63cdbae
    • Long Li's avatar
      cifs: smbd: Return -EINVAL when the number of iovs exceeds SMBDIRECT_MAX_SGE · 37941ea1
      Long Li authored
      While it's not friendly to fail user processes that issue more iovs
      than we support, we should at least return the correct error code so the
      user process gets a chance to retry with a smaller number of iovs.
      Signed-off-by: Long Li <longli@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve French <stfrench@microsoft.com>
      37941ea1
    • Long Li's avatar
      cifs: smbd: Invalidate and deregister memory registration on re-send for direct I/O · b7a55bbd
      Long Li authored
      On re-send, there might be a reconnect, and all previous memory registrations
      need to be invalidated and deregistered.
      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      b7a55bbd
    • Long Li's avatar
      cifs: Don't display RDMA transport on reconnect · 14cc639c
      Long Li authored
      On reconnect, the transport data structure is NULL and its information is not
      available.
      Signed-off-by: Long Li <longli@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve French <stfrench@microsoft.com>
      14cc639c
    • YueHaibing's avatar
      CIFS: remove set but not used variables 'cinode' and 'netfid' · f28a2e5e
      YueHaibing authored
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      fs/cifs/file.c: In function 'cifs_flock':
      fs/cifs/file.c:1704:8: warning:
       variable 'netfid' set but not used [-Wunused-but-set-variable]
      
      fs/cifs/file.c:1702:24: warning:
       variable 'cinode' set but not used [-Wunused-but-set-variable]
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      f28a2e5e
    • Steve French's avatar
      cifs: add support for flock · d0677992
      Steve French authored
      The flock system call locks the whole file rather than a byte
      range, and so is currently emulated by various other file systems
      by simply sending a byte range lock for the whole file.
      Add flock handling for cifs.ko in a similar way.
      
      xfstest generic/504 passes with this as well.
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      d0677992
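The flock commit above describes emulating a whole-file lock with a byte-range lock that spans the entire file. A minimal userspace sketch of that idea, using the POSIX `fcntl()` record-locking interface, is shown below. The `lock_whole_file()` function name is hypothetical, and this is not the cifs.ko implementation; it only shows what "a byte range lock for the whole file" looks like from an application.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Apply a byte-range lock that covers the entire file.
 * type is F_RDLCK, F_WRLCK, or F_UNLCK.
 * Returns 0 on success, -1 on failure (with errno set by fcntl).
 */
int lock_whole_file(int fd, short type)
{
    struct flock fl = {0};

    assert(fd >= 0);

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0; /* from the first byte */
    fl.l_len = 0;   /* 0 means "through end of file", however large it grows */

    return fcntl(fd, F_SETLK, &fl);
}
```

Passing `F_WRLCK` requests an exclusive lock over the full range and `F_UNLCK` releases it; because `F_SETLK` is non-blocking, a conflicting lock makes the call fail instead of waiting.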
    • YueHaibing's avatar
      cifs: remove unused variable 'sid_user' · be1bf978
      YueHaibing authored
      fs/cifs/cifsacl.c:43:30: warning:
       sid_user defined but not used [-Wunused-const-variable=]
      
      It is never used, so remove it.
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      be1bf978
    • Dan Carpenter's avatar
      cifs: rename a variable in SendReceive() · 8bd3754c
      Dan Carpenter authored
      Smatch gets confused because we sometimes refer to "server->srv_mutex" and
      sometimes to "sess->server->srv_mutex".  They refer to the same lock so
      let's just make this consistent.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      8bd3754c
    • Linus Torvalds's avatar
      Linux 5.4 · 219d5433
      Linus Torvalds authored
      219d5433