1. 03 Feb, 2020 15 commits
    • netdevsim: remove unused sdev code · 24531163
      Taehee Yoo authored
      The sdev.c code has been merged into dev.c and is no longer used,
      so remove it.
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netdevsim: use __GFP_NOWARN to avoid memalloc warning · 83cf4213
      Taehee Yoo authored
      The vfnum and binary_len buffer sizes are supplied by user space,
      so they can be very large. If a size is too large, kmalloc() prints a
      warning message internally.
      This warning message is not useful for the netdevsim module,
      so this patch adds __GFP_NOWARN.
      
      Test commands:
          modprobe netdevsim
          echo 1 > /sys/bus/netdevsim/new_device
          echo 1000000000 > /sys/devices/netdevsim1/sriov_numvfs
      
      Splat looks like:
      [  357.847266][ T1000] WARNING: CPU: 0 PID: 1000 at mm/page_alloc.c:4738 __alloc_pages_nodemask+0x2f3/0x740
      [  357.850273][ T1000] Modules linked in: netdevsim veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrx
      [  357.852989][ T1000] CPU: 0 PID: 1000 Comm: bash Tainted: G    B             5.5.0-rc5+ #270
      [  357.854334][ T1000] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  357.855703][ T1000] RIP: 0010:__alloc_pages_nodemask+0x2f3/0x740
      [  357.856669][ T1000] Code: 64 fe ff ff 65 48 8b 04 25 c0 0f 02 00 48 05 f0 12 00 00 41 be 01 00 00 00 49 89 47 0
      [  357.860272][ T1000] RSP: 0018:ffff8880b7f47bd8 EFLAGS: 00010246
      [  357.861009][ T1000] RAX: ffffed1016fe8f80 RBX: 1ffff11016fe8fae RCX: 0000000000000000
      [  357.861843][ T1000] RDX: 0000000000000000 RSI: 0000000000000017 RDI: 0000000000000000
      [  357.862661][ T1000] RBP: 0000000000040dc0 R08: 1ffff11016fe8f67 R09: dffffc0000000000
      [  357.863509][ T1000] R10: ffff8880b7f47d68 R11: fffffbfff2798180 R12: 1ffff11016fe8f80
      [  357.864355][ T1000] R13: 0000000000000017 R14: 0000000000000017 R15: ffff8880c2038d68
      [  357.865178][ T1000] FS:  00007fd9a5b8c740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
      [  357.866248][ T1000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  357.867531][ T1000] CR2: 000055ce01ba8100 CR3: 00000000b7dbe005 CR4: 00000000000606f0
      [  357.868972][ T1000] Call Trace:
      [  357.869423][ T1000]  ? lock_contended+0xcd0/0xcd0
      [  357.870001][ T1000]  ? __alloc_pages_slowpath+0x21d0/0x21d0
      [  357.870673][ T1000]  ? _kstrtoull+0x76/0x160
      [  357.871148][ T1000]  ? alloc_pages_current+0xc1/0x1a0
      [  357.871704][ T1000]  kmalloc_order+0x22/0x80
      [  357.872184][ T1000]  kmalloc_order_trace+0x1d/0x140
      [  357.872733][ T1000]  __kmalloc+0x302/0x3a0
      [  357.873204][ T1000]  nsim_bus_dev_numvfs_store+0x1ab/0x260 [netdevsim]
      [  357.873919][ T1000]  ? kernfs_get_active+0x12c/0x180
      [  357.874459][ T1000]  ? new_device_store+0x450/0x450 [netdevsim]
      [  357.875111][ T1000]  ? kernfs_get_parent+0x70/0x70
      [  357.875632][ T1000]  ? sysfs_file_ops+0x160/0x160
      [  357.876152][ T1000]  kernfs_fop_write+0x276/0x410
      [  357.876680][ T1000]  ? __sb_start_write+0x1ba/0x2e0
      [  357.877225][ T1000]  vfs_write+0x197/0x4a0
      [  357.877671][ T1000]  ksys_write+0x141/0x1d0
      [ ... ]
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Fixes: 79579220 ("netdevsim: add SR-IOV functionality")
      Fixes: 82c93a87 ("netdevsim: implement couple of testing devlink health reporters")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
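
      The effect described in the __GFP_NOWARN commit can be modelled in user space. The sketch below is only an analogy and is not kernel code: ALLOC_NOWARN, ALLOC_LIMIT, model_alloc(), set_numvfs() and the 64-byte per-VF size are all hypothetical stand-ins. It shows that the flag changes only whether a warning is printed; the oversized request still fails and the caller still has to handle the error.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define ALLOC_NOWARN 0x1u             /* hypothetical stand-in for __GFP_NOWARN */
#define ALLOC_LIMIT  (4u * 1024 * 1024) /* arbitrary cap chosen for this model */

int warnings_printed;

/* Toy allocator: refuses oversized requests and warns unless told not to. */
void *model_alloc(size_t size, unsigned int flags)
{
	if (size > ALLOC_LIMIT) {
		if (!(flags & ALLOC_NOWARN)) {
			fprintf(stderr, "WARNING: %zu bytes is too large\n", size);
			warnings_printed++;
		}
		return NULL;
	}
	return malloc(size);
}

/* Models a store handler whose allocation size comes from user input. */
int set_numvfs(unsigned int num_vfs, unsigned int flags)
{
	void *cfg = model_alloc((size_t)num_vfs * 64, flags);

	if (!cfg)
		return -ENOMEM; /* the failure is still reported to the caller */
	free(cfg);
	return 0;
}
```

      With the flag set, writing a huge value still returns an error to the writer, but no warning is logged.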
    • netdevsim: use IS_ERR instead of IS_ERR_OR_NULL for debugfs · 6556ff32
      Taehee Yoo authored
      Debugfs APIs return either a valid pointer or an error pointer. They do
      not return NULL, so IS_ERR is sufficient and IS_ERR_OR_NULL is not needed.
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Reported-by: kbuild test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
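
      The difference between the two checks can be illustrated with a user-space re-creation of the kernel's error-pointer helpers. This is a minimal sketch: err_ptr(), is_err() and is_err_or_null() are local stand-ins modelled on ERR_PTR(), IS_ERR() and IS_ERR_OR_NULL(), and fake_debugfs_create() is a hypothetical function, not a debugfs API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes are encoded in the topmost 4095 addresses. */
#define MAX_ERRNO 4095

void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

bool is_err_or_null(const void *ptr)
{
	return ptr == NULL || is_err(ptr);
}

/*
 * Hypothetical creator with the contract the commit describes:
 * it returns a valid pointer or an error pointer, never NULL.
 */
void *fake_debugfs_create(bool fail)
{
	static int dentry;

	return fail ? err_ptr(-ENOMEM) : (void *)&dentry;
}
```

      Because such a function never returns NULL, the NULL half of is_err_or_null() can never trigger for its result, which is why the plain is_err() check is enough.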
    • netdevsim: fix stack-out-of-bounds in nsim_dev_debugfs_init() · 6fb8852b
      Taehee Yoo authored
      When a netdevsim dev is being created, a debugfs directory is created.
      The variable "dev_ddir_name" is a 16-byte buffer for the device name,
      which has the form "netdevsim<dev id>".
      The dev id can be up to 10 digits long.
      So, 16 bytes is not enough for the device name.
      
      Test commands:
          modprobe netdevsim
          echo "1000000000 0" > /sys/bus/netdevsim/new_device
      
      Splat looks like:
      [  249.622710][  T900] BUG: KASAN: stack-out-of-bounds in number+0x824/0x880
      [  249.623658][  T900] Write of size 1 at addr ffff88804c527988 by task bash/900
      [  249.624521][  T900]
      [  249.624830][  T900] CPU: 1 PID: 900 Comm: bash Not tainted 5.5.0+ #322
      [  249.625691][  T900] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  249.626712][  T900] Call Trace:
      [  249.627103][  T900]  dump_stack+0x96/0xdb
      [  249.627639][  T900]  ? number+0x824/0x880
      [  249.628173][  T900]  print_address_description.constprop.5+0x1be/0x360
      [  249.629022][  T900]  ? number+0x824/0x880
      [  249.629569][  T900]  ? number+0x824/0x880
      [  249.630105][  T900]  __kasan_report+0x12a/0x170
      [  249.630717][  T900]  ? number+0x824/0x880
      [  249.631201][  T900]  kasan_report+0xe/0x20
      [  249.631723][  T900]  number+0x824/0x880
      [  249.632235][  T900]  ? put_dec+0xa0/0xa0
      [  249.632716][  T900]  ? rcu_read_lock_sched_held+0x90/0xc0
      [  249.633392][  T900]  vsnprintf+0x63c/0x10b0
      [  249.633983][  T900]  ? pointer+0x5b0/0x5b0
      [  249.634543][  T900]  ? mark_lock+0x11d/0xc40
      [  249.635200][  T900]  sprintf+0x9b/0xd0
      [  249.635750][  T900]  ? scnprintf+0xe0/0xe0
      [  249.636370][  T900]  nsim_dev_probe+0x63c/0xbf0 [netdevsim]
      [ ... ]
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Fixes: ab1d0cc0 ("netdevsim: change debugfs tree topology")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
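
      The size arithmetic behind the dev_ddir_name overflow can be checked in user space. The sketch below assumes the dev id is formatted with "%u" as a 32-bit unsigned value; nsim_name_bytes() and nsim_format_name() are hypothetical helper names, not functions from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Bytes needed to hold "netdevsim<dev id>" including the trailing NUL.
 * snprintf() with a zero-sized buffer only reports the formatted length.
 */
int nsim_name_bytes(unsigned int dev_id)
{
	return snprintf(NULL, 0, "netdevsim%u", dev_id) + 1;
}

/* Bounded formatting: returns nonzero if the name did not fit in buf. */
int nsim_format_name(char *buf, size_t size, unsigned int dev_id)
{
	int len = snprintf(buf, size, "netdevsim%u", dev_id);

	return len < 0 || (size_t)len >= size;
}
```

      "netdevsim" is 9 characters, so a 10-digit id needs 9 + 10 + 1 = 20 bytes, which is 4 more than the 16-byte buffer provides. An unbounded sprintf() writes those extra bytes past the end of the stack buffer, which is what KASAN reports in the splat.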
    • netdevsim: fix panic in nsim_dev_take_snapshot_write() · 8526ad96
      Taehee Yoo authored
      nsim_dev_take_snapshot_write() uses nsim_dev and nsim_dev->dummy_region,
      so these objects must not be freed while the function is running.
      However, nothing in this function protects them.
      
      There are two similar cases.
      1. reload case
      reload could be called during nsim_dev_take_snapshot_write().
      When reload is being executed, nsim_dev_reload_down() is called and it
      calls nsim_dev_reload_destroy(). nsim_dev_reload_destroy() calls
      devlink_region_destroy() to destroy nsim_dev->dummy_region.
      So, during nsim_dev_take_snapshot_write(), nsim_dev->dummy_region
      could be destroyed.
      At that point, nsim_dev_take_snapshot_write() would access a freed pointer.
      To fix this case, the take_snapshot file is removed before
      devlink_region_destroy() is called.
      The take_snapshot file is re-created by ->reload_up().
      
      2. del_device_store case
      del_device_store() could also call nsim_dev_reload_destroy()
      during nsim_dev_take_snapshot_write(). If so, a panic would occur.
      This is actually the same problem as in the first case,
      so it is fixed by the first case's solution.
      
      Test commands:
          modprobe netdevsim
          while :
          do
              echo 1 > /sys/bus/netdevsim/new_device &
              echo 1 > /sys/bus/netdevsim/del_device &
              devlink dev reload netdevsim/netdevsim1 &
              echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/take_snapshot &
          done
      
      Splat looks like:
      [   45.564513][  T975] general protection fault, probably for non-canonical address 0xdffffc000000003a: 0000 [#1] SMP DEI
      [   45.566131][  T975] KASAN: null-ptr-deref in range [0x00000000000001d0-0x00000000000001d7]
      [   45.566135][  T975] CPU: 1 PID: 975 Comm: bash Not tainted 5.5.0+ #322
      [   45.569020][  T975] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   45.569026][  T975] RIP: 0010:__mutex_lock+0x10a/0x14b0
      [   45.570518][  T975] Code: 08 84 d2 0f 85 7f 12 00 00 44 8b 0d 10 23 65 02 45 85 c9 75 29 49 8d 7f 68 48 b8 00 00 00 0f
      [   45.570522][  T975] RSP: 0018:ffff888046ccfbf0 EFLAGS: 00010206
      [   45.572305][  T975] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [   45.572308][  T975] RDX: 000000000000003a RSI: ffffffffac926440 RDI: 00000000000001d0
      [   45.576843][  T975] RBP: ffff888046ccfd70 R08: ffffffffab610645 R09: 0000000000000000
      [   45.576847][  T975] R10: ffff888046ccfd90 R11: ffffed100d6360ad R12: 0000000000000000
      [   45.578471][  T975] R13: dffffc0000000000 R14: ffffffffae1976c0 R15: 0000000000000168
      [   45.578475][  T975] FS:  00007f614d6e7740(0000) GS:ffff88806c400000(0000) knlGS:0000000000000000
      [   45.581492][  T975] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   45.582942][  T975] CR2: 00005618677d1cf0 CR3: 000000005fb9c002 CR4: 00000000000606e0
      [   45.584543][  T975] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   45.586633][  T975] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   45.589889][  T975] Call Trace:
      [   45.591445][  T975]  ? devlink_region_snapshot_create+0x55/0x4a0
      [   45.601250][  T975]  ? mutex_lock_io_nested+0x1380/0x1380
      [   45.602817][  T975]  ? mutex_lock_io_nested+0x1380/0x1380
      [   45.603875][  T975]  ? mark_held_locks+0xa5/0xe0
      [   45.604769][  T975]  ? _raw_spin_unlock_irqrestore+0x2d/0x50
      [   45.606147][  T975]  ? __mutex_unlock_slowpath+0xd0/0x670
      [   45.607723][  T975]  ? crng_backtrack_protect+0x80/0x80
      [   45.613530][  T975]  ? wait_for_completion+0x390/0x390
      [   45.615152][  T975]  ? devlink_region_snapshot_create+0x55/0x4a0
      [   45.616834][  T975]  devlink_region_snapshot_create+0x55/0x4a0
      [ ... ]
      
      Fixes: 4418f862 ("netdevsim: implement support for devlink region and snapshots")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netdevsim: disable devlink reload when resources are being used · 6ab63366
      Taehee Yoo authored
      devlink reload destroys resources and then allocates them again.
      So, while device and port resources are in use, the devlink reload
      function must not be executed. To avoid this race, a new lock is
      added, and new_port() and del_port() call devlink_reload_disable()
      and devlink_reload_enable().
      
      Thread0                      Thread1
      {new/del}_port()             {new/del}_port()
      devlink_reload_disable()
                                   devlink_reload_disable()
      devlink_reload_enable()
                                   //here
                                   devlink_reload_enable()
      
      At the point marked "//here", before Thread1's devlink_reload_enable(),
      devlink is already allowed to execute reload because Thread0 has
      re-enabled it. The devlink reload disable/enable state is a single bool,
      so Thread0's enable cancels Thread1's disable and the above case can occur.
      So, each disable/enable pair must be executed atomically.
      To do that, a new lock is used.
      
      Test commands:
          modprobe netdevsim
          echo 1 > /sys/bus/netdevsim/new_device
          while :
          do
              echo 1 > /sys/devices/netdevsim1/new_port &
              echo 1 > /sys/devices/netdevsim1/del_port &
              devlink dev reload netdevsim/netdevsim1 &
          done
      
      Splat looks like:
      [   23.342145][  T932] DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
      [   23.342159][  T932] WARNING: CPU: 0 PID: 932 at kernel/locking/mutex-debug.c:103 mutex_destroy+0xc7/0xf0
      [   23.344182][  T932] Modules linked in: netdevsim openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_dx
      [   23.346485][  T932] CPU: 0 PID: 932 Comm: devlink Not tainted 5.5.0+ #322
      [   23.347696][  T932] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   23.348893][  T932] RIP: 0010:mutex_destroy+0xc7/0xf0
      [   23.349505][  T932] Code: e0 07 83 c0 03 38 d0 7c 04 84 d2 75 2e 8b 05 00 ac b0 02 85 c0 75 8b 48 c7 c6 00 5e 07 96 40
      [   23.351887][  T932] RSP: 0018:ffff88806208f810 EFLAGS: 00010286
      [   23.353963][  T932] RAX: dffffc0000000008 RBX: ffff888067f6f2c0 RCX: ffffffff942c4bd4
      [   23.355222][  T932] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff96dac5b4
      [   23.356169][  T932] RBP: ffff888067f6f000 R08: fffffbfff2d235a5 R09: fffffbfff2d235a5
      [   23.357160][  T932] R10: 0000000000000001 R11: fffffbfff2d235a4 R12: ffff888067f6f208
      [   23.358288][  T932] R13: ffff88806208fa70 R14: ffff888067f6f000 R15: ffff888069ce3800
      [   23.359307][  T932] FS:  00007fe2a3876740(0000) GS:ffff88806c000000(0000) knlGS:0000000000000000
      [   23.360473][  T932] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   23.361319][  T932] CR2: 00005561357aa000 CR3: 000000005227a006 CR4: 00000000000606f0
      [   23.362323][  T932] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   23.363417][  T932] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   23.364414][  T932] Call Trace:
      [   23.364828][  T932]  nsim_dev_reload_destroy+0x77/0xb0 [netdevsim]
      [   23.365655][  T932]  nsim_dev_reload_down+0x84/0xb0 [netdevsim]
      [   23.366433][  T932]  devlink_reload+0xb1/0x350
      [   23.367010][  T932]  genl_rcv_msg+0x580/0xe90
      
      [ ... ]
      
      [   23.531729][ T1305] kernel BUG at lib/list_debug.c:53!
      [   23.532523][ T1305] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [   23.533467][ T1305] CPU: 2 PID: 1305 Comm: bash Tainted: G        W         5.5.0+ #322
      [   23.534962][ T1305] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   23.536503][ T1305] RIP: 0010:__list_del_entry_valid+0xe6/0x150
      [   23.538346][ T1305] Code: 89 ea 48 c7 c7 00 73 1e 96 e8 df f7 4c ff 0f 0b 48 c7 c7 60 73 1e 96 e8 d1 f7 4c ff 0f 0b 44
      [   23.541068][ T1305] RSP: 0018:ffff888047c27b58 EFLAGS: 00010282
      [   23.542001][ T1305] RAX: 0000000000000054 RBX: ffff888067f6f318 RCX: 0000000000000000
      [   23.543051][ T1305] RDX: 0000000000000054 RSI: 0000000000000008 RDI: ffffed1008f84f61
      [   23.544072][ T1305] RBP: ffff88804aa0fca0 R08: ffffed100d940539 R09: ffffed100d940539
      [   23.545085][ T1305] R10: 0000000000000001 R11: ffffed100d940538 R12: ffff888047c27cb0
      [   23.546422][ T1305] R13: ffff88806208b840 R14: ffffffff981976c0 R15: ffff888067f6f2c0
      [   23.547406][ T1305] FS:  00007f76c0431740(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
      [   23.548527][ T1305] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   23.549389][ T1305] CR2: 00007f5048f1a2f8 CR3: 000000004b310006 CR4: 00000000000606e0
      [   23.550636][ T1305] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   23.551578][ T1305] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   23.552597][ T1305] Call Trace:
      [   23.553004][ T1305]  mutex_remove_waiter+0x101/0x520
      [   23.553646][ T1305]  __mutex_lock+0xac7/0x14b0
      [   23.554218][ T1305]  ? nsim_dev_port_del+0x4e/0x140 [netdevsim]
      [   23.554908][ T1305]  ? mutex_lock_io_nested+0x1380/0x1380
      [   23.555570][ T1305]  ? _parse_integer+0xf0/0xf0
      [   23.556043][ T1305]  ? kstrtouint+0x86/0x110
      [   23.556504][ T1305]  ? nsim_dev_port_del+0x4e/0x140 [netdevsim]
      [   23.557133][ T1305]  nsim_dev_port_del+0x4e/0x140 [netdevsim]
      [   23.558024][ T1305]  del_port_store+0xcc/0xf0 [netdevsim]
      [ ... ]
      
      Fixes: 75ba029f ("netdevsim: implement proper devlink reload")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
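
      The Thread0/Thread1 interleaving from the reload disable/enable diagram can be replayed deterministically in a single thread. This is a sketch under stated assumptions: struct devlink_model and the reload_* helpers are hypothetical models of a boolean reload gate and are not the devlink API. The second function replays the same work with the two port operations serialized, which is the effect a lock around each disable/enable pair would have.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: reload permission tracked by a single bool. */
struct devlink_model {
	bool reload_enabled;
};

void reload_disable(struct devlink_model *d) { d->reload_enabled = false; }
void reload_enable(struct devlink_model *d)  { d->reload_enabled = true; }

/*
 * Replays the unserialized interleaving from the diagram and returns
 * whether reload is permitted at the point marked "//here", while
 * Thread1 is still between its disable and enable calls.
 */
bool reload_allowed_interleaved(void)
{
	struct devlink_model d = { .reload_enabled = true };
	bool allowed_here;

	reload_disable(&d);               /* Thread0 */
	reload_disable(&d);               /* Thread1 */
	reload_enable(&d);                /* Thread0 */
	allowed_here = d.reload_enabled;  /* Thread1: "//here" */
	reload_enable(&d);                /* Thread1 */
	return allowed_here;
}

/*
 * Replays the two port operations one after the other and returns
 * whether reload was permitted inside either critical section.
 */
bool reload_allowed_serialized(void)
{
	struct devlink_model d = { .reload_enabled = true };
	bool allowed_inside = false;
	int thread;

	for (thread = 0; thread < 2; thread++) {
		reload_disable(&d);
		allowed_inside |= d.reload_enabled; /* port add/del runs here */
		reload_enable(&d);
	}
	return allowed_inside;
}
```

      In the interleaved replay, reload is permitted while Thread1's port operation is still in progress; in the serialized replay it never is.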
    • netdevsim: fix using uninitialized resources · f5cd2160
      Taehee Yoo authored
      When the module is being initialized, its __init function calls
      bus_register() and driver_register().
      These functions internally create various resources and sysfs files.
      The sysfs files are used for basic operations (add/del device):
      /sys/bus/netdevsim/new_device
      /sys/bus/netdevsim/del_device
      
      These sysfs files use netdevsim resources, which are mostly allocated
      and initialized in the ->probe() function, nsim_dev_probe().
      However, the sysfs files can be written before ->probe() has finished,
      so uninitialized data would be accessed.
      
      Another problem is very similar.
      /sys/bus/netdevsim/new_device internally creates sysfs files.
      /sys/devices/netdevsim<id>/new_port
      /sys/devices/netdevsim<id>/del_port
      
      These sysfs files also use netdevsim resources, which are mostly allocated
      and initialized in the device creation routine, nsim_bus_dev_new().
      However, they can also be written before nsim_bus_dev_new() has finished,
      so uninitialized data would be accessed.
      
      To fix these problems, this patch adds flags that indicate whether
      initialization has finished.
      The flag variable 'nsim_bus_enable' indicates whether the netdevsim bus
      has been initialized.
      It is protected by nsim_bus_dev_list_lock.
      The flag variable 'nsim_bus_dev->init' indicates whether nsim_bus_dev
      has been initialized.
      It can be used in {new/del}_port_store() without a lock.
      
      Test commands:
          #SHELL1
          modprobe netdevsim
          while :
          do
              echo "1 1" > /sys/bus/netdevsim/new_device
              echo "1 1" > /sys/bus/netdevsim/del_device
          done
      
          #SHELL2
          while :
          do
              echo 1 > /sys/devices/netdevsim1/new_port
              echo 1 > /sys/devices/netdevsim1/del_port
          done
      
      Splat looks like:
      [   47.508954][ T1008] general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 I
      [   47.510793][ T1008] KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
      [   47.511963][ T1008] CPU: 2 PID: 1008 Comm: bash Not tainted 5.5.0+ #322
      [   47.512823][ T1008] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   47.514041][ T1008] RIP: 0010:__mutex_lock+0x10a/0x14b0
      [   47.514699][ T1008] Code: 08 84 d2 0f 85 7f 12 00 00 44 8b 0d 10 23 65 02 45 85 c9 75 29 49 8d 7f 68 48 b8 00 00 00 0f
      [   47.517163][ T1008] RSP: 0018:ffff888059b4fbb0 EFLAGS: 00010206
      [   47.517802][ T1008] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [   47.518941][ T1008] RDX: 0000000000000021 RSI: ffffffff85926440 RDI: 0000000000000108
      [   47.519732][ T1008] RBP: ffff888059b4fd30 R08: ffffffffc073fad0 R09: 0000000000000000
      [   47.520729][ T1008] R10: ffff888059b4fd50 R11: ffff88804bb38040 R12: 0000000000000000
      [   47.521702][ T1008] R13: dffffc0000000000 R14: ffffffff871976c0 R15: 00000000000000a0
      [   47.522760][ T1008] FS:  00007fd4be05a740(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
      [   47.523877][ T1008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   47.524627][ T1008] CR2: 0000561c82b69cf0 CR3: 0000000065dd6004 CR4: 00000000000606e0
      [   47.527662][ T1008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   47.528604][ T1008] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   47.529531][ T1008] Call Trace:
      [   47.529874][ T1008]  ? nsim_dev_port_add+0x50/0x150 [netdevsim]
      [   47.530470][ T1008]  ? mutex_lock_io_nested+0x1380/0x1380
      [   47.531018][ T1008]  ? _kstrtoull+0x76/0x160
      [   47.531449][ T1008]  ? _parse_integer+0xf0/0xf0
      [   47.531874][ T1008]  ? kernfs_fop_write+0x1cf/0x410
      [   47.532330][ T1008]  ? sysfs_file_ops+0x160/0x160
      [   47.532773][ T1008]  ? kstrtouint+0x86/0x110
      [   47.533168][ T1008]  ? nsim_dev_port_add+0x50/0x150 [netdevsim]
      [   47.533721][ T1008]  nsim_dev_port_add+0x50/0x150 [netdevsim]
      [   47.534336][ T1008]  ? sysfs_file_ops+0x160/0x160
      [   47.534858][ T1008]  new_port_store+0x99/0xb0 [netdevsim]
      [   47.535439][ T1008]  ? del_port_store+0xb0/0xb0 [netdevsim]
      [   47.536035][ T1008]  ? sysfs_file_ops+0x112/0x160
      [   47.536544][ T1008]  ? sysfs_kf_write+0x3b/0x180
      [   47.537029][ T1008]  kernfs_fop_write+0x276/0x410
      [   47.537548][ T1008]  ? __sb_start_write+0x215/0x2e0
      [   47.538110][ T1008]  vfs_write+0x197/0x4a0
      [ ... ]
      
      Fixes: f9d9db47 ("netdevsim: add bus attributes to add new and delete devices")
      Fixes: 794b2c05 ("netdevsim: extend device attrs to support port addition and deletion")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
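
      The initialization-flag approach can be sketched as a store handler that refuses to run until setup has completed. This is a simplified single-threaded model: struct bus_dev_model, its fields, both function names and the -EBUSY return value are assumptions made for illustration, and the memory-ordering concerns of real concurrent code are ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a bus device. */
struct bus_dev_model {
	bool init;        /* set only after setup has completed */
	int *port_table;  /* stands in for resources allocated during setup */
	int port_count;
};

/* Models a new_port-style store handler guarded by the init flag. */
int model_new_port_store(struct bus_dev_model *dev, int port_index)
{
	if (!dev->init)
		return -EBUSY; /* assumed error code; port_table is not touched */
	if (port_index < 0 || port_index >= dev->port_count)
		return -EINVAL;
	dev->port_table[port_index] = 1;
	return 0;
}

/* Models device creation: the flag is published last. */
void model_dev_finish_setup(struct bus_dev_model *dev, int *table, int count)
{
	dev->port_table = table;
	dev->port_count = count;
	dev->init = true;
}
```

      A write that arrives before setup has finished is rejected instead of dereferencing the still-NULL port_table, which corresponds to the null-ptr-deref seen in the splat.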
    • Merge branch 'bnxt_en-Bug-fixes' · 2b5ea294
      Jakub Kicinski authored
      Michael Chan says:
      
      =====================
      bnxt_en: Bug fixes
      
      3 patches that fix some issues in the firmware reset logic, starting
      with a small patch to refactor the code that re-enables SRIOV.  The
      last patch fixes a TC queue mapping issue.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Fix TC queue mapping. · 18e4960c
      Michael Chan authored
      The driver currently only calls netdev_set_tc_queue when the number of
      TCs is greater than 1.  Instead, the comparison should be greater than
      or equal to 1.  Even with 1 TC, we need to set the queue mapping.
      
      This bug can cause warnings when the number of TCs is changed back to 1.
      
      Fixes: 7809592d ("bnxt_en: Enable MSIX early in bnxt_init_one().")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Fix logic that disables Bus Master during firmware reset. · d4073028
      Vasundhara Volam authored
      The current logic that calls pci_disable_device() in __bnxt_close_nic()
      during firmware reset is flawed.  If firmware is still alive, we're
      disabling the device too early, causing some firmware commands to
      not reach the firmware.
      
      Fix it by moving the logic to bnxt_reset_close().  If firmware is
      in fatal condition, we call pci_disable_device() before we free
      any of the rings to prevent DMA corruption of the freed rings.  If
      firmware is still alive, we call pci_disable_device() after the
      last firmware message has been sent.
      
      Fixes: 3bc7d4a3 ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset. · 12de2ead
      Michael Chan authored
      bnxt_ulp_start() needs to be called before SRIOV is re-enabled after
      firmware reset.  Re-enabling SRIOV may consume all the resources and
      may cause the RDMA driver to fail to get MSIX and other resources.
      Fix it by calling bnxt_ulp_start() first before calling
      bnxt_reenable_sriov().
      
      We re-arrange the logic so that we call bnxt_ulp_start() and
      bnxt_reenable_sriov() in the proper sequence in bnxt_fw_reset_task() and
      bnxt_open().  The former is the normal coordinated firmware reset sequence
      and the latter is firmware reset while the function is down.  The new
      logic is more straightforward and fixes both scenarios.
      
      Fixes: f3a6d206 ("bnxt_en: Call bnxt_ulp_stop()/bnxt_ulp_start() during error recovery.")
      Reported-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
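
      Why the call order of bnxt_ulp_start() and bnxt_reenable_sriov() matters can be shown with a toy resource pool. Everything below is a hypothetical model and not driver code: struct res_pool, the model_* functions and the vector counts are invented for illustration, and SR-IOV is modelled in its worst case, where it takes every remaining MSI-X vector.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pool of MSI-X vectors shared by RDMA and SR-IOV. */
struct res_pool {
	int msix_free;
};

/* Models the RDMA driver start: it needs `want` vectors or it fails. */
bool model_ulp_start(struct res_pool *pool, int want)
{
	if (pool->msix_free < want)
		return false;
	pool->msix_free -= want;
	return true;
}

/* Models re-enabling SR-IOV in the worst case: VFs take all that is left. */
void model_reenable_sriov(struct res_pool *pool)
{
	pool->msix_free = 0;
}

/* Old order: SR-IOV first, then the RDMA driver. */
bool rdma_starts_sriov_first(int total, int want)
{
	struct res_pool pool = { .msix_free = total };

	model_reenable_sriov(&pool);
	return model_ulp_start(&pool, want);
}

/* Fixed order: the RDMA driver first, then SR-IOV. */
bool rdma_starts_ulp_first(int total, int want)
{
	struct res_pool pool = { .msix_free = total };
	bool ok = model_ulp_start(&pool, want);

	model_reenable_sriov(&pool);
	return ok;
}
```

      With 64 vectors and an RDMA driver that needs 8, the RDMA start succeeds only when it runs before SR-IOV is re-enabled.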
    • bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected. · c16d4ee0
      Michael Chan authored
      Put the current logic in bnxt_open() to re-enable SRIOV after detecting
      firmware reset into a new function bnxt_reenable_sriov().  This call
      needs to be invoked in the firmware reset path also in the next patch.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: stmmac: Delete txtimer in suspend() · 14b41a29
      Nicolin Chen authored
      When running v5.5 with a rootfs on NFS, memory abort may happen in
      the system resume stage:
       Unable to handle kernel paging request at virtual address dead00000000012a
       [dead00000000012a] address between user and kernel address ranges
       pc : run_timer_softirq+0x334/0x3d8
       lr : run_timer_softirq+0x244/0x3d8
       x1 : ffff800011cafe80 x0 : dead000000000122
       Call trace:
        run_timer_softirq+0x334/0x3d8
        efi_header_end+0x114/0x234
        irq_exit+0xd0/0xd8
        __handle_domain_irq+0x60/0xb0
        gic_handle_irq+0x58/0xa8
        el1_irq+0xb8/0x180
        arch_cpu_idle+0x10/0x18
        do_idle+0x1d8/0x2b0
        cpu_startup_entry+0x24/0x40
        secondary_start_kernel+0x1b4/0x208
       Code: f9000693 a9400660 f9000020 b4000040 (f9000401)
       ---[ end trace bb83ceeb4c482071 ]---
       Kernel panic - not syncing: Fatal exception in interrupt
       SMP: stopping secondary CPUs
       SMP: failed to stop secondary CPUs 2-3
       Kernel Offset: disabled
       CPU features: 0x00002,2300aa30
       Memory Limit: none
       ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      It's found that stmmac_xmit() and stmmac_resume() sometimes might
      run concurrently, possibly resulting in a race condition between
      mod_timer() and setup_timer(), being called by stmmac_xmit() and
      stmmac_resume() respectively.
      
      Since the resume() runs setup_timer() every time, it'd be safer to
      have del_timer_sync() in the suspend() as the counterpart.
      Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'rxrpc-fixes-20200203' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 3d80c653
      Jakub Kicinski authored
      David Howells says:
      
      ====================
      RxRPC fixes
      
      Here are a number of fixes for AF_RXRPC:
      
       (1) Fix a potential use after free in rxrpc_put_local() where it was
           accessing the object just put to get tracing information.
      
       (2) Fix insufficient notifications being generated by the function that
           queues data packets on a call.  This occasionally causes recvmsg() to
           stall indefinitely.
      
       (3) Fix a number of packet-transmitting work functions to hold an active
           count on the local endpoint so that the UDP socket doesn't get
           destroyed whilst they're calling kernel_sendmsg() on it.
      
       (4) Fix a NULL pointer deref that stemmed from a call's connection pointer
           being cleared when the call was disconnected.
      
      Changes:
      
       v2: Removed a couple of BUG() statements that got added.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect · 5273a191
      David Howells authored
      When a call is disconnected, the connection pointer from the call is
      cleared to make sure it isn't used again and to prevent further attempted
      transmission for the call.  Unfortunately, there might be a daemon trying
      to use it at the same time to transmit a packet.
      
      Fix this by keeping call->conn set, but setting a flag on the call to
      indicate disconnection instead.
      
      Remove also the bits in the transmission functions where the conn pointer is
      checked and a ref taken under spinlock as this is now redundant.
      
      Fixes: 8d94aa38 ("rxrpc: Calls shouldn't hold socket refs")
      Signed-off-by: David Howells <dhowells@redhat.com>
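
      The approach of keeping call->conn set and marking disconnection with a flag can be sketched as follows. This is a hypothetical user-space model: struct conn_model, struct call_model, the field name disconnected and the -ENOTCONN return value are assumptions and are not taken from the rxrpc code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct conn_model {
	int id;
};

struct call_model {
	struct conn_model *conn; /* stays set after disconnection */
	bool disconnected;       /* hypothetical flag name */
};

/* Disconnection marks the call instead of clearing the pointer. */
void model_disconnect(struct call_model *call)
{
	call->disconnected = true;
}

/* Returns the connection id used for transmission, or a negative errno. */
int model_transmit(const struct call_model *call)
{
	if (call->disconnected)
		return -ENOTCONN;
	return call->conn->id; /* the pointer is never NULL here */
}
```

      A transmitter that has already passed the flag check still finds a non-NULL conn pointer, so a concurrent disconnect can no longer turn the dereference into a NULL pointer access.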
  2. 02 Feb, 2020 4 commits
    • Merge branch 'Fix-reconnection-latency-caused-by-FIN-ACK-handling-race' · 83d0585f
      Jakub Kicinski authored
      SeongJae Park says:
      
      ====================
      Fix reconnection latency caused by FIN/ACK handling race
      
      The first patch fixes the problem by adjusting the first resend delay of
      the SYN in the case.  The second one adds a user space test to reproduce
      this problem.
      
      From v2
      (https://lore.kernel.org/linux-kselftest/20200201071859.4231-1-sj38.park@gmail.com/)
       - Use TCP_TIMEOUT_MIN as reduced delay (Neal Cardwell)
       - Add Reviewed-by and Signed-off-by from Eric Dumazet
      
      From v1
      (https://lore.kernel.org/linux-kselftest/20200131122421.23286-1-sjpark@amazon.com/)
       - Drop the trivial comment fix patch (Eric Dumazet)
       - Limit the delay adjustment to only the first SYN resend (Eric Dumazet)
       - selftest: Avoid use of hard-coded port number (Eric Dumazet)
       - Explain RST/ACK and FIN/ACK has no big difference (Neal Cardwell)
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: Add FIN_ACK processing order related latency spike test · af8c8a45
      SeongJae Park authored
      This commit adds a test for the reconnection latency spikes caused by
      FIN_ACK processing races.  The issue is described and solved by the
      previous commit ("tcp: Reduce SYN resend delay if a suspicous ACK is
      received").

      The test program consists of a server process and a client process.  The
      server creates a socket, binds it to a dynamically allocated port,
      listens on it, and enters an infinite loop.  Inside the loop, it accepts
      a connection, reads 4 bytes from the socket, and closes the connection.
      The client also runs an infinite loop.  Inside the loop, it creates a
      socket with the LINGER and NODELAY options, connects to the server,
      sends 4 bytes of data, and tries to read some data from the server.
      After read() returns, it measures the latency from the beginning of the
      loop iteration to that point and, if the latency is larger than 1 second
      (a spike), prints a message.
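
      The client-side pieces described above can be sketched as follows.  This
      is only an illustration, not the selftest itself: the helper names, the
      zero linger timeout, and the use of millisecond arithmetic are
      assumptions made for the example.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* The description treats a latency above 1 second as a spike. */
#define SPIKE_THRESHOLD_MS 1000

/* Create a TCP socket with the LINGER and NODELAY options set.
 * The linger values below are an assumption, not taken from the commit. */
int make_client_socket(void)
{
	struct linger lin = { .l_onoff = 1, .l_linger = 0 };
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lin, sizeof(lin)) < 0 ||
	    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Whole milliseconds elapsed between two timestamps. */
long long elapsed_ms(const struct timespec *start, const struct timespec *end)
{
	long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL +
		       (end->tv_nsec - start->tv_nsec);

	return ns / 1000000LL;
}

/* True when one loop iteration took longer than the spike threshold. */
bool is_spike(const struct timespec *start, const struct timespec *end)
{
	return elapsed_ms(start, end) > SPIKE_THRESHOLD_MS;
}
```

      A client loop would take a timestamp with clock_gettime() before
      creating the socket and another after read() returns, then pass both to
      is_spike().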
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: SeongJae Park <sjpark@amazon.de>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • SeongJae Park's avatar
      tcp: Reduce SYN resend delay if a suspicous ACK is received · 9603d47b
      SeongJae Park authored
      When closing a connection, the two ACKs that are required to change the
      closing socket's status to FIN_WAIT_2 and then TIME_WAIT could be
      processed in reverse order.  This is possible in RSS disabled
      environments such as a connection inside a host.

      For example, the expected state transitions and required packets for the
      disconnection will be similar to the flow below.
      
      	 00 (Process A)				(Process B)
      	 01 ESTABLISHED				ESTABLISHED
      	 02 close()
      	 03 FIN_WAIT_1
      	 04 		---FIN-->
      	 05 					CLOSE_WAIT
      	 06 		<--ACK---
      	 07 FIN_WAIT_2
      	 08 		<--FIN/ACK---
      	 09 TIME_WAIT
      	 10 		---ACK-->
      	 11 					LAST_ACK
      	 12 CLOSED				CLOSED
      
      In some cases, such as a socket with the LINGER option applied, the FIN
      and FIN/ACK will be replaced by RST and RST/ACK, but the main logic is
      the same.

      The ACKs in question are those in lines 6 and 8.  If the line 8 packet is
      processed before the line 6 packet, it will simply be ignored because it
      is not an expected packet.  The later processing of the line 6 packet will
      change the status of Process A to FIN_WAIT_2, but because the line 8
      packet has already been handled, Process A will not go to TIME_WAIT and
      thus will not send the line 10 packet to Process B.  Thus, Process B will
      be left in CLOSE_WAIT status, as below.
      
      	 00 (Process A)				(Process B)
      	 01 ESTABLISHED				ESTABLISHED
      	 02 close()
      	 03 FIN_WAIT_1
      	 04 		---FIN-->
      	 05 					CLOSE_WAIT
      	 06 				(<--ACK---)
      	 07	  			(<--FIN/ACK---)
      	 08 				(fired in right order)
      	 09 		<--FIN/ACK---
      	 10 		<--ACK---
      	 11 		(processed in reverse order)
      	 12 FIN_WAIT_2
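
      The two flows above can be replayed with a small model of Process A.
      This is a sketch of the simplified logic described in this message, not
      of the kernel's TCP state machine; the enum and function names are
      invented for the example.

```c
#include <stddef.h>

/* States of Process A after it has called close(). */
enum a_state { ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_TIME_WAIT };

/* The two packets sent by Process B (lines 6 and 8 of the first diagram). */
enum b_packet { PKT_ACK, PKT_FIN_ACK };

/* One step of the simplified model: a packet either matches the
 * current state or is ignored. */
static enum a_state step(enum a_state s, enum b_packet p)
{
	if (s == ST_FIN_WAIT_1 && p == PKT_ACK)
		return ST_FIN_WAIT_2;	/* our FIN was acknowledged */
	if (s == ST_FIN_WAIT_2 && p == PKT_FIN_ACK)
		return ST_TIME_WAIT;	/* peer's FIN/ACK arrived as expected */
	return s;			/* unexpected packet: ignored */
}

/* Feed the packets to Process A in the order they are processed. */
enum a_state process_a_after(const enum b_packet *pkts, size_t n)
{
	enum a_state s = ST_FIN_WAIT_1;

	for (size_t i = 0; i < n; i++)
		s = step(s, pkts[i]);
	return s;
}
```

      With the packets processed as { PKT_ACK, PKT_FIN_ACK } the model ends in
      ST_TIME_WAIT; with { PKT_FIN_ACK, PKT_ACK } it stops in ST_FIN_WAIT_2,
      which is the stuck state shown in the second diagram.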
      
      Later, if Process B sends a SYN to Process A for reconnection using
      the same port, Process A will respond with an ACK for the last flow,
      which has no increased sequence number.  Thus, Process B will send RST,
      wait for TIMEOUT_INIT (one second by default), and then try
      reconnection.  If reconnections are frequent, the one second latency
      spikes can be a big problem.  Below are tcpdump results of the problem:
      
          14.436259 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644
          14.436266 IP 127.0.0.1.4242 > 127.0.0.1.45150: Flags [.], ack 5, win 512
          14.436271 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [R], seq 2541101298
          /* ONE SECOND DELAY */
          15.464613 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644
      
      This commit mitigates the problem by reducing the delay for the next SYN
      if the suspicious ACK is received while in SYN_SENT state.
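
      The decision can be pictured roughly as below.  This is a user-space
      sketch under stated assumptions, not the kernel change: the function
      name and the millisecond value of the reduced delay are hypothetical,
      and the restriction to the first SYN resend follows the series
      changelog.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the reduced delay, in milliseconds.
 * The kernel expresses its timeouts in jiffies, not milliseconds. */
#define REDUCED_SYN_DELAY_MS 10

/* Pick the delay before the next SYN resend.
 * normal_delay_ms: the delay that would be used without the mitigation.
 * suspicious_ack_in_syn_sent: an unexpected ACK arrived in SYN_SENT state.
 * resends_so_far: number of SYN resends already performed. */
unsigned int syn_resend_delay_ms(unsigned int normal_delay_ms,
				 bool suspicious_ack_in_syn_sent,
				 unsigned int resends_so_far)
{
	/* Only the first resend is shortened. */
	if (suspicious_ack_in_syn_sent && resends_so_far == 0)
		return REDUCED_SYN_DELAY_MS;
	return normal_delay_ms;
}
```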
      
      The following commit adds a selftest, which can also help in
      understanding this issue.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: SeongJae Park <sjpark@amazon.de>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Lukas Bulwahn's avatar
      MAINTAINERS: correct entries for ISDN/mISDN section · dff6bc1b
      Lukas Bulwahn authored
      Commit 6d979850 ("isdn: move capi drivers to staging") cleaned up the
      isdn drivers and split the MAINTAINERS section for ISDN, but failed to add
      the terminal slash for the two directories mISDN and hardware. Hence, all
      files in those directories were not part of the new ISDN/mISDN SUBSYSTEM,
      but were considered to be part of "THE REST".
      
      Rectify the situation, and while at it, also complete the section with two
      further build files that belong to that subsystem.
      
      This was identified with a small script that finds all files belonging to
      "THE REST" according to the current MAINTAINERS file, and I investigated
      its output.
      
      Fixes: 6d979850 ("isdn: move capi drivers to staging")
      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 01 Feb, 2020 9 commits
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · b7c3a17c
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Fix suspicious RCU usage in ipset, from Jozsef Kadlecsik.
      
      2) Use kvcalloc, from Joe Perches.
      
      3) Flush flowtable hardware workqueue after garbage collection run,
         from Paul Blakey.
      
      4) Missing flowtable hardware workqueue flush from nf_flow_table_free(),
         also from Paul.
      
      5) Restore NF_FLOW_HW_DEAD in flow_offload_work_del(), from Paul.
      
      6) Flowtable documentation fixes, from Matteo Croce.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Eric Dumazet's avatar
      cls_rsvp: fix rsvp_policy · cb3c0e6b
      Eric Dumazet authored
      NLA_BINARY can be confusing, since the .len value represents
      the max size of the blob.
      
      cls_rsvp really wants user space to provide long enough data
      for TCA_RSVP_DST and TCA_RSVP_SRC attributes.
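
      The difference between the two kinds of length check can be shown with
      a small user-space sketch.  The names and the 16-byte size are
      hypothetical and only illustrate the point; this is not the netlink
      policy code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size of the fixed-size blob the consumer reads. */
#define BLOB_LEN 16

/* Maximum-size check: the meaning the description gives to .len
 * for NLA_BINARY.  A too-short attribute is accepted. */
bool accept_max_len(size_t attr_len)
{
	return attr_len <= BLOB_LEN;
}

/* Minimum-size check: what a consumer that always reads BLOB_LEN
 * bytes needs, so that no uninitialized bytes are read. */
bool accept_min_len(size_t attr_len)
{
	return attr_len >= BLOB_LEN;
}
```

      A 4-byte attribute passes accept_max_len() but fails accept_min_len();
      reading BLOB_LEN bytes from it would touch memory that user space never
      filled in.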
      
      BUG: KMSAN: uninit-value in rsvp_get net/sched/cls_rsvp.h:258 [inline]
      BUG: KMSAN: uninit-value in gen_handle net/sched/cls_rsvp.h:402 [inline]
      BUG: KMSAN: uninit-value in rsvp_change+0x1ae9/0x4220 net/sched/cls_rsvp.h:572
      CPU: 1 PID: 13228 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       rsvp_get net/sched/cls_rsvp.h:258 [inline]
       gen_handle net/sched/cls_rsvp.h:402 [inline]
       rsvp_change+0x1ae9/0x4220 net/sched/cls_rsvp.h:572
       tc_new_tfilter+0x31fe/0x5010 net/sched/cls_api.c:2104
       rtnetlink_rcv_msg+0xcb7/0x1570 net/core/rtnetlink.c:5415
       netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45b349
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f269d43dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f269d43e6d4 RCX: 000000000045b349
      RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
      RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000009c2 R14: 00000000004cb338 R15: 000000000075bfd4
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
       slab_alloc_node mm/slub.c:2774 [inline]
       __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4382
       __kmalloc_reserve net/core/skbuff.c:141 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209
       alloc_skb include/linux/skbuff.h:1049 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline]
       netlink_sendmsg+0x7d3/0x14d0 net/netlink/af_netlink.c:1892
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 6fa8c014 ("[NET_SCHED]: Use nla_policy for attribute validation in classifiers")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Sven Eckelmann's avatar
      MAINTAINERS: Orphan HSR network protocol · e8d5bb4d
      Sven Eckelmann authored
      The current maintainer Arvid Brodin <arvid.brodin@alten.se> hasn't
      contributed to the kernel since 2015-02-27. His company mail address is
      also bouncing and the company confirmed (2020-01-31) that no Arvid Brodin
      is working for them:
      
      > Unfortunately, we have no Arvid Brodin working at ALTEN.
      
      A MIA person cannot be the maintainer. It is better to mark it as orphaned
      until some other person can jump in and take over the responsibility for
      HSR.
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Dan Carpenter's avatar
      qed: Fix a error code in qed_hw_init() · d32a06f5
      Dan Carpenter authored
      If qed_fw_overlay_mem_alloc() fails then we should return -ENOMEM instead
      of success.
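
      The general shape of this kind of bug is sketched below in user-space
      C.  The function names and the fake allocator are hypothetical; the
      sketch only shows how a stale success code can leak out of an error
      path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocator stand-in: returns NULL when asked to fail. */
static void *overlay_alloc(bool fail)
{
	static char buf[16];

	return fail ? NULL : buf;
}

/* Buggy pattern: rc still holds 0 from the earlier successful steps. */
int hw_init_buggy(bool alloc_fails)
{
	int rc = 0;
	void *mem = overlay_alloc(alloc_fails);

	if (!mem)
		goto out;	/* failure is reported as success */
	/* ... mem would be used here ... */
out:
	return rc;
}

/* Fixed pattern: set the error code before leaving. */
int hw_init_fixed(bool alloc_fails)
{
	int rc = 0;
	void *mem = overlay_alloc(alloc_fails);

	if (!mem) {
		rc = -ENOMEM;
		goto out;
	}
	/* ... mem would be used here ... */
out:
	return rc;
}
```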
      
      Fixes: 30d5f858 ("qed: FW 8.42.2.0 Add fw overlay feature")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Dan Carpenter's avatar
      octeontx2-pf: Fix an IS_ERR() vs NULL bug · 08ff7818
      Dan Carpenter authored
      The otx2_mbox_get_rsp() function never returns NULL, it returns error
      pointers on error.
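
      Why a NULL check misses such failures can be seen in a user-space
      re-creation of the error-pointer idiom.  The helpers below imitate the
      kernel's ERR_PTR(), PTR_ERR() and IS_ERR() in simplified form, and
      get_response() is a hypothetical stand-in for a function that returns
      error pointers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the kernel helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	/* Error codes occupy the last MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup: reports failure with an error pointer, never NULL. */
void *get_response(bool fail)
{
	static int response = 42;

	return fail ? ERR_PTR(-ENODEV) : (void *)&response;
}

/* Wrong: an error pointer is not NULL, so the check never fires. */
int handle_with_null_check(bool fail)
{
	void *rsp = get_response(fail);

	if (!rsp)
		return -ENOMEM;
	return 0;	/* real code would now dereference rsp */
}

/* Right: test with IS_ERR() and propagate the encoded error. */
int handle_with_is_err(bool fail)
{
	void *rsp = get_response(fail);

	if (IS_ERR(rsp))
		return (int)PTR_ERR(rsp);
	return 0;
}
```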
      
      Fixes: 34bfe0eb ("octeontx2-pf: MTU, MAC and RX mode config support")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Eric Dumazet's avatar
      tcp: clear tp->segs_{in|out} in tcp_disconnect() · 784f8344
      Eric Dumazet authored
      tp->segs_in and tp->segs_out need to be cleared in tcp_disconnect().
      
      tcp_disconnect() is rarely used, but it is worth fixing it.
      
      Fixes: 2efd055c ("tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Marcelo Ricardo Leitner <mleitner@redhat.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Eric Dumazet's avatar
      tcp: clear tp->data_segs{in|out} in tcp_disconnect() · db7ffee6
      Eric Dumazet authored
      tp->data_segs_in and tp->data_segs_out need to be cleared
      in tcp_disconnect().
      
      tcp_disconnect() is rarely used, but it is worth fixing it.
      
      Fixes: a44d6eac ("tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Eric Dumazet's avatar
      tcp: clear tp->delivered in tcp_disconnect() · 2fbdd562
      Eric Dumazet authored
      tp->delivered needs to be cleared in tcp_disconnect().
      
      tcp_disconnect() is rarely used, but it is worth fixing it.
      
      Fixes: ddf1af6f ("tcp: new delivery accounting")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Eric Dumazet's avatar
      tcp: clear tp->total_retrans in tcp_disconnect() · c13c48c0
      Eric Dumazet authored
      total_retrans needs to be cleared in tcp_disconnect().
      
      tcp_disconnect() is rarely used, but it is worth fixing it.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: SeongJae Park <sjpark@amazon.de>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  4. 31 Jan, 2020 10 commits
  5. 30 Jan, 2020 2 commits
    • David Howells's avatar
      rxrpc: Fix missing active use pinning of rxrpc_local object · 04d36d74
      David Howells authored
      The introduction of a split between the reference count on rxrpc_local
      objects and the usage count didn't quite go far enough.  A number of kernel
      work items need to make use of the socket to perform transmission.  These
      also need to get an active count on the local object to prevent the socket
      from being closed.
      
      Fix this by getting the active count in those places.
      
      Also split out the raw active count get/put functions as these places tend
      to hold refs on the rxrpc_local object already, so getting and putting an
      extra object ref is just a waste of time.
      
      The problem can lead to symptoms like:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000018
          ..
          CPU: 2 PID: 818 Comm: kworker/u9:0 Not tainted 5.5.0-fscache+ #51
          ...
          RIP: 0010:selinux_socket_sendmsg+0x5/0x13
          ...
          Call Trace:
           security_socket_sendmsg+0x2c/0x3e
           sock_sendmsg+0x1a/0x46
           rxrpc_send_keepalive+0x131/0x1ae
           rxrpc_peer_keepalive_worker+0x219/0x34b
           process_one_work+0x18e/0x271
           worker_thread+0x1a3/0x247
           kthread+0xe6/0xeb
           ret_from_fork+0x1f/0x30
      
      Fixes: 730c5fd4 ("rxrpc: Fix local endpoint refcounting")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • David Howells's avatar
      rxrpc: Fix insufficient receive notification generation · f71dbf2f
      David Howells authored
      In rxrpc_input_data(), rxrpc_notify_socket() is called if the base sequence
      number of the packet is immediately following the hard-ack point at the end
      of the function.  However, this isn't sufficient, since the recvmsg side
      may have been advancing the window and then overrun the position in which
      we're adding - at which point rx_hard_ack >= seq0 and no notification is
      generated.
      
      Fix this by always generating a notification at the end of the input
      function.
      
      Without this, a long call may stall, possibly indefinitely.
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>