1. 22 Nov, 2022 7 commits
    • ice: fix handling of burst Tx timestamps · 30f15874
      Jacob Keller authored
      Commit 1229b339 ("ice: Add low latency Tx timestamp read") refactored
      PTP timestamping logic to use a threaded IRQ instead of a separate kthread.
      
      This implementation introduced ice_misc_intr_thread_fn and redefined the
      ice_ptp_process_ts function interface to return a value of whether or not
      the timestamp processing was complete.
      
      ice_misc_intr_thread_fn would take the return value from ice_ptp_process_ts
      and convert it into either IRQ_HANDLED if there were no more timestamps to
      be processed, or IRQ_WAKE_THREAD if the thread should continue processing.
      
      This is not correct, as the kernel does not re-schedule threaded IRQ
      functions automatically. IRQ_WAKE_THREAD can only be used by the main IRQ
      function.
      
      As a result, the ice_ptp_process_ts function (and in turn the
      ice_ptp_tx_tstamp function) is called exactly once per interrupt.
      
      If an application sends a burst of Tx timestamps without waiting for a
      response, the interrupt will trigger for the first timestamp. However,
      later timestamps may not have arrived yet. This can result in dropped or
      discarded timestamps. Worse, on E822 hardware this results in the interrupt
      logic getting stuck such that no future interrupts will be triggered. The
      result is complete loss of Tx timestamp functionality.
      
      Fix this by modifying the ice_misc_intr_thread_fn to perform its own
      polling of the ice_ptp_process_ts function. We sleep for a few microseconds
      between attempts to avoid wasting significant CPU time. The value was
      chosen to allow time for the Tx timestamps to complete without wasting so
      much time that we overrun application wait budgets in the worst case.
      
      The ice_ptp_process_ts function also currently returns false in the event
      that the Tx tracker is not initialized. This would result in the threaded
      IRQ handler never exiting if it gets started while the tracker is not
      initialized.
      
      Fix the function to appropriately return true when the tracker is not
      initialized.
      
      Note that this will not reproduce with default ptp4l behavior, as the
      program always synchronously waits for a timestamp response before sending
      another timestamp request.
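
The polling approach described above can be sketched in userspace C. This is only an illustration and not the driver code: `struct fake_tracker`, `fake_process_ts()`, `poll_until_done()` and the 50 microsecond delay are assumptions made for the example; the commit message itself says only "a few microseconds".

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the Tx timestamp tracker. */
struct fake_tracker {
    bool initialized;   /* false: there is nothing to process */
    int pending;        /* timestamps still outstanding */
};

/* Returns true when processing is complete, mirroring the contract the
 * commit describes for ice_ptp_process_ts(). An uninitialized tracker
 * reports "complete" so that the polling loop below can exit. */
static bool fake_process_ts(struct fake_tracker *t)
{
    assert(t != NULL);
    if (!t->initialized)
        return true;
    if (t->pending > 0)
        t->pending--;   /* pretend one more timestamp has arrived */
    return t->pending == 0;
}

/* Thread-function analogue: poll until done, sleeping briefly between
 * attempts. Returns the number of calls made to fake_process_ts(). */
static int poll_until_done(struct fake_tracker *t)
{
    /* Assumed delay of 50 microseconds; chosen only for illustration. */
    const struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 };
    int calls = 1;

    while (!fake_process_ts(t)) {
        nanosleep(&delay, NULL);
        calls++;
    }
    return calls;
}
```

With a burst of three pending timestamps the loop makes three calls; with an uninitialized tracker it returns after the first call instead of spinning forever.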

      Reported-by: Siddaraju DH <siddaraju.dh@intel.com>
      Fixes: 1229b339 ("ice: Add low latency Tx timestamp read")
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20221118222729.1565317-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tipc: check skb_linearize() return value in tipc_disc_rcv() · cd0f6421
      YueHaibing authored
      If skb_linearize() fails in tipc_disc_rcv(), we need to free the skb instead of
      handling it.
      
      Fixes: 25b0b9c4 ("tipc: handle collisions of 32-bit node address hash values")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Link: https://lore.kernel.org/r/20221119072832.7896-1-yuehaibing@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 5916380c
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-11-18 (iavf)
      
      Ivan Vecera resolves issues related to reset by adding back call to
      netif_tx_stop_all_queues() and adding calls to dev_close() to ensure
      device is properly closed during reset.
      
      Stefan Assmann removes waiting for setting of MAC address as this breaks
      ARP.
      
      Slawomir adds setting of __IAVF_IN_REMOVE_TASK bit to prevent deadlock
      between remove and shutdown.
      
      * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        iavf: Fix race condition between iavf_shutdown and iavf_remove
        iavf: remove INITIAL_MAC_SET to allow gARP to work properly
        iavf: Do not restart Tx queues after reset task failure
        iavf: Fix a crash during reset task
      ====================
      
      Link: https://lore.kernel.org/r/20221118222439.1565245-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'tipc-fix-two-race-issues-in-tipc_conn_alloc' · 3349c272
      Jakub Kicinski authored
      Xin Long says:
      
      ====================
      tipc: fix two race issues in tipc_conn_alloc
      
      The race exists between tipc_topsrv_accept() and tipc_conn_close():
      one is allocating the con while the other is freeing it, and there
      is no proper lock protecting it. Therefore, a NULL pointer dereference
      and a use-after-free may be triggered; see the details in each patch.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1668807842.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tipc: add an extra conn_get in tipc_conn_alloc · a7b42969
      Xin Long authored
      One extra conn_get() is needed in tipc_conn_alloc(), because after
      tipc_conn_alloc() is called, tipc_conn_close() may free this
      con before it is dereferenced in tipc_topsrv_accept():
      
         tipc_conn_alloc();
         newsk = newsock->sk;
                                       <---- tipc_conn_close();
         write_lock_bh(&sk->sk_callback_lock);
         newsk->sk_data_ready = tipc_conn_data_ready;
      
      Then a use-after-free (UAF) issue can be triggered:
      
        BUG: KASAN: use-after-free in tipc_topsrv_accept+0x1e7/0x370 [tipc]
        Call Trace:
         <TASK>
         dump_stack_lvl+0x33/0x46
         print_report+0x178/0x4b0
         kasan_report+0x8c/0x100
         kasan_check_range+0x179/0x1e0
         tipc_topsrv_accept+0x1e7/0x370 [tipc]
         process_one_work+0x6a3/0x1030
         worker_thread+0x8a/0xdf0
      
      This patch fixes it by holding a reference in tipc_conn_alloc() and
      releasing it after the last access in tipc_topsrv_accept(). Note that
      when doing the same in tipc_topsrv_kern_subscr(), we don't need to
      check for "> 0", as tipc_conn_rcv_sub() returns only 0 or -1.
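
The extra-reference idea can be sketched in plain C as follows. This is a simplified, single-threaded illustration and not the TIPC code: `struct demo_conn`, `demo_conn_get()`, `demo_conn_put()`, `demo_conn_alloc()` and `demo_conn_close()` are hypothetical names, and the counter is a plain int rather than a kernel kref.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical connection object with a plain (non-atomic) reference count. */
struct demo_conn {
    int refcnt;
    int sock_ready;
};

static int demo_freed;  /* number of objects released, for illustration */

static void demo_conn_get(struct demo_conn *c)
{
    c->refcnt++;
}

static void demo_conn_put(struct demo_conn *c)
{
    assert(c->refcnt > 0);
    if (--c->refcnt == 0) {
        free(c);
        demo_freed++;
    }
}

/* Allocate with two references: one owned by the lookup table (dropped by
 * a close), one held by the caller while it finishes setting up. */
static struct demo_conn *demo_conn_alloc(void)
{
    struct demo_conn *c = calloc(1, sizeof(*c));

    if (!c)
        return NULL;
    c->refcnt = 1;      /* table reference */
    demo_conn_get(c);   /* extra reference for the caller */
    return c;
}

/* A close that lands in the race window drops only the table reference. */
static void demo_conn_close(struct demo_conn *c)
{
    demo_conn_put(c);
}
```

If a close arrives right after allocation, the object stays alive until the allocating path drops its own reference with `demo_conn_put()`.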
      
      Fixes: c5fa7b3c ("tipc: introduce new TIPC server infrastructure")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tipc: set con sock in tipc_conn_alloc · 0e5d56c6
      Xin Long authored
      A crash was reported by Wei Chen:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000018
        RIP: 0010:tipc_conn_close+0x12/0x100
        Call Trace:
         tipc_topsrv_exit_net+0x139/0x320
         ops_exit_list.isra.9+0x49/0x80
         cleanup_net+0x31a/0x540
         process_one_work+0x3fa/0x9f0
         worker_thread+0x42/0x5c0
      
      It was caused by !con->sock in tipc_conn_close(). In tipc_topsrv_accept(),
      con is allocated in conn_idr then its sock is set:
      
        con = tipc_conn_alloc();
        ...                    <----[1]
        con->sock = newsock;
      
      If tipc_conn_close() is called at any time during [1], the NULL pointer
      dereference is triggered by con->sock->sk because con->sock is not yet set.

      This patch fixes it by moving the con->sock setting to tipc_conn_alloc()
      under s->idr_lock, so that con->sock can never be NULL when getting the
      con from s->conn_idr. It is also safer to move the con->server and
      CF_CONNECTED flag settings under s->idr_lock, as they should all be set before
      tipc_conn_alloc() is called.
      
      Fixes: c5fa7b3c ("tipc: introduce new TIPC server infrastructure")
      Reported-by: Wei Chen <harperchen1110@gmail.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phy: at803x: fix error return code in at803x_probe() · 1f0dd412
      Wei Yongjun authored
      Fix to return a negative error code from the ccr read error handling
      case instead of 0, as done elsewhere in this function.
      
      Fixes: 3265f421 ("net: phy: at803x: add fiber support")
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20221118103635.254256-1-weiyongjun@huaweicloud.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 21 Nov, 2022 9 commits
  3. 19 Nov, 2022 11 commits
  4. 18 Nov, 2022 13 commits
    • iavf: Fix race condition between iavf_shutdown and iavf_remove · a8417330
      Slawomir Laba authored
      Fix a deadlock introduced by commit
      97457801 ("iavf: Add waiting so the port is initialized in remove")
      due to a race condition between iavf_shutdown and iavf_remove, where
      iavf_remove gets stuck forever in a while loop because iavf_shutdown has
      already set the __IAVF_REMOVE adapter state.
      
      Fix this by checking if the __IAVF_IN_REMOVE_TASK has already been
      set and return if so.
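
The "check the bit and return early" pattern can be illustrated with C11 atomics. This is a sketch under stated assumptions, not the iavf implementation: `struct demo_adapter`, `DEMO_IN_REMOVE_TASK` and `demo_teardown_once()` are hypothetical, and the use of an atomic test-and-set is a choice made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_IN_REMOVE_TASK 0x1u  /* hypothetical bit, not the driver's value */

struct demo_adapter {
    atomic_uint flags;
    int teardown_runs;  /* how many times teardown actually ran */
};

/* Returns true if this caller performed the teardown, false if another
 * path (for example shutdown racing with remove) had already claimed it. */
static bool demo_teardown_once(struct demo_adapter *a)
{
    unsigned int old;

    assert(a != NULL);
    /* Atomically set the bit and learn whether it was already set. */
    old = atomic_fetch_or(&a->flags, DEMO_IN_REMOVE_TASK);
    if (old & DEMO_IN_REMOVE_TASK)
        return false;   /* already handled: return instead of waiting */

    a->teardown_runs++;
    return true;
}
```

Whichever path runs second sees the bit already set and returns immediately rather than looping while it waits for a state that will never change.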
      
      Fixes: 97457801 ("iavf: Add waiting so the port is initialized in remove")
      Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Marek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • iavf: remove INITIAL_MAC_SET to allow gARP to work properly · bb861c14
      Stefan Assmann authored
      IAVF_FLAG_INITIAL_MAC_SET prevents waiting on iavf_is_mac_set_handled()
      the first time the MAC is set. This breaks gratuitous ARP because the
      MAC address has not been updated yet when the gARP packet is sent out.
      
      Current behaviour:
      $ echo 1 > /sys/class/net/ens4f0/device/sriov_numvfs
      iavf 0000:88:02.0: MAC address: ee:04:19:14:ec:ea
      $ ip addr add 192.168.1.1/24 dev ens4f0v0
      $ ip link set dev ens4f0v0 up
      $ echo 1 > /proc/sys/net/ipv4/conf/ens4f0v0/arp_notify
      $ ip link set ens4f0v0 addr 00:11:22:33:44:55
      07:23:41.676611 ee:04:19:14:ec:ea > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.1, length 28
      
      With IAVF_FLAG_INITIAL_MAC_SET removed:
      $ echo 1 > /sys/class/net/ens4f0/device/sriov_numvfs
      iavf 0000:88:02.0: MAC address: 3e:8a:16:a2:37:6d
      $ ip addr add 192.168.1.1/24 dev ens4f0v0
      $ ip link set dev ens4f0v0 up
      $ echo 1 > /proc/sys/net/ipv4/conf/ens4f0v0/arp_notify
      $ ip link set ens4f0v0 addr 00:11:22:33:44:55
      07:28:01.836608 00:11:22:33:44:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.1, length 28
      
      Fixes: 35a2443d ("iavf: Add waiting for response from PF in set mac")
      Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • iavf: Do not restart Tx queues after reset task failure · 08f1c147
      Ivan Vecera authored
      After commit aa626da9 ("iavf: Detach device during reset task")
      the device is detached during reset task and re-attached at its end.
      The problem occurs when reset task fails because Tx queues are
      restarted during device re-attach and this leads later to a crash.
      
      To resolve this issue, properly close the net device in case of
      failure in the reset task to avoid restarting the Tx queues at the end.
      Also replace the hacky manipulation of the IFF_UP flag with a device close
      that properly clears both the IFF_UP and __LINK_STATE_START flags.
      In this case iavf_close() does not do anything because the adapter
      state is already __IAVF_DOWN.
      
      Reproducer:
      1) Run some Tx traffic (e.g. iperf3) over iavf interface
      2) Set VF trusted / untrusted in loop
      
      [root@host ~]# cat repro.sh
      
      PF=enp65s0f0
      IF=${PF}v0
      
      ip link set up $IF
      ip addr add 192.168.0.2/24 dev $IF
      sleep 1
      
      iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null &
      sleep 2
      
      while :; do
              ip link set $PF vf 0 trust on
              ip link set $PF vf 0 trust off
      done
      [root@host ~]# ./repro.sh
      
      Result:
      [ 2006.650969] iavf 0000:41:01.0: Failed to init adminq: -53
      [ 2006.675662] ice 0000:41:00.0: VF 0 is now trusted
      [ 2006.689997] iavf 0000:41:01.0: Reset task did not complete, VF disabled
      [ 2006.696611] iavf 0000:41:01.0: failed to allocate resources during reinit
      [ 2006.703209] ice 0000:41:00.0: VF 0 is now untrusted
      [ 2006.737011] ice 0000:41:00.0: VF 0 is now trusted
      [ 2006.764536] ice 0000:41:00.0: VF 0 is now untrusted
      [ 2006.768919] BUG: kernel NULL pointer dereference, address: 0000000000000b4a
      [ 2006.776358] #PF: supervisor read access in kernel mode
      [ 2006.781488] #PF: error_code(0x0000) - not-present page
      [ 2006.786620] PGD 0 P4D 0
      [ 2006.789152] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [ 2006.792903] ice 0000:41:00.0: VF 0 is now trusted
      [ 2006.793501] CPU: 4 PID: 0 Comm: swapper/4 Kdump: loaded Not tainted 6.1.0-rc3+ #2
      [ 2006.805668] Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.4 04/13/2022
      [ 2006.815915] RIP: 0010:iavf_xmit_frame_ring+0x96/0xf70 [iavf]
      [ 2006.821028] ice 0000:41:00.0: VF 0 is now untrusted
      [ 2006.821572] Code: 48 83 c1 04 48 c1 e1 04 48 01 f9 48 83 c0 10 6b 50 f8 55 c1 ea 14 45 8d 64 14 01 48 39 c8 75 eb 41 83 fc 07 0f 8f e9 08 00 00 <0f> b7 45 4a 0f b7 55 48 41 8d 74 24 05 31 c9 66 39 d0 0f 86 da 00
      [ 2006.845181] RSP: 0018:ffffb253004bc9e8 EFLAGS: 00010293
      [ 2006.850397] RAX: ffff9d154de45b00 RBX: ffff9d15497d52e8 RCX: ffff9d154de45b00
      [ 2006.856327] ice 0000:41:00.0: VF 0 is now trusted
      [ 2006.857523] RDX: 0000000000000000 RSI: 00000000000005a8 RDI: ffff9d154de45ac0
      [ 2006.857525] RBP: 0000000000000b00 R08: ffff9d159cb010ac R09: 0000000000000001
      [ 2006.857526] R10: ffff9d154de45940 R11: 0000000000000000 R12: 0000000000000002
      [ 2006.883600] R13: ffff9d1770838dc0 R14: 0000000000000000 R15: ffffffffc07b8380
      [ 2006.885840] ice 0000:41:00.0: VF 0 is now untrusted
      [ 2006.890725] FS:  0000000000000000(0000) GS:ffff9d248e900000(0000) knlGS:0000000000000000
      [ 2006.890727] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2006.909419] CR2: 0000000000000b4a CR3: 0000000c39c10002 CR4: 0000000000770ee0
      [ 2006.916543] PKRU: 55555554
      [ 2006.918254] ice 0000:41:00.0: VF 0 is now trusted
      [ 2006.919248] Call Trace:
      [ 2006.919250]  <IRQ>
      [ 2006.919252]  dev_hard_start_xmit+0x9e/0x1f0
      [ 2006.932587]  sch_direct_xmit+0xa0/0x370
      [ 2006.936424]  __dev_queue_xmit+0x7af/0xd00
      [ 2006.940429]  ip_finish_output2+0x26c/0x540
      [ 2006.944519]  ip_output+0x71/0x110
      [ 2006.947831]  ? __ip_finish_output+0x2b0/0x2b0
      [ 2006.952180]  __ip_queue_xmit+0x16d/0x400
      [ 2006.952721] ice 0000:41:00.0: VF 0 is now untrusted
      [ 2006.956098]  __tcp_transmit_skb+0xa96/0xbf0
      [ 2006.965148]  __tcp_retransmit_skb+0x174/0x860
      [ 2006.969499]  ? cubictcp_cwnd_event+0x40/0x40
      [ 2006.973769]  tcp_retransmit_skb+0x14/0xb0
      ...
      
      Fixes: aa626da9 ("iavf: Detach device during reset task")
      Cc: Jacob Keller <jacob.e.keller@intel.com>
      Cc: Patryk Piotrowski <patryk.piotrowski@intel.com>
      Cc: SlawomirX Laba <slawomirx.laba@intel.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • iavf: Fix a crash during reset task · c678669d
      Ivan Vecera authored
      Recent commit aa626da9 ("iavf: Detach device during reset task")
      removed netif_tx_stop_all_queues() with an assumption that Tx queues
      are already stopped by netif_device_detach() in the beginning of
      reset task. This assumption is incorrect because during reset
      task a potential link event can start Tx queues again.
      Revert this change to fix this issue.
      
      Reproducer:
      1. Run some Tx traffic (e.g. iperf3) over iavf interface
      2. Switch MTU of this interface in a loop
      
      [root@host ~]# cat repro.sh
      
      IF=enp2s0f0v0
      
      iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null &
      sleep 2
      
      while :; do
              for i in 1280 1500 2000 900 ; do
                      ip link set $IF mtu $i
                      sleep 2
              done
      done
      [root@host ~]# ./repro.sh
      
      Result:
      [  306.199917] iavf 0000:02:02.0 enp2s0f0v0: NIC Link is Up Speed is 40 Gbps Full Duplex
      [  308.205944] iavf 0000:02:02.0 enp2s0f0v0: NIC Link is Up Speed is 40 Gbps Full Duplex
      [  310.103223] BUG: kernel NULL pointer dereference, address: 0000000000000008
      [  310.110179] #PF: supervisor write access in kernel mode
      [  310.115396] #PF: error_code(0x0002) - not-present page
      [  310.120526] PGD 0 P4D 0
      [  310.123057] Oops: 0002 [#1] PREEMPT SMP NOPTI
      [  310.127408] CPU: 24 PID: 183 Comm: kworker/u64:9 Kdump: loaded Not tainted 6.1.0-rc3+ #2
      [  310.135485] Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.4 04/13/2022
      [  310.145728] Workqueue: iavf iavf_reset_task [iavf]
      [  310.150520] RIP: 0010:iavf_xmit_frame_ring+0xd1/0xf70 [iavf]
      [  310.156180] Code: d0 0f 86 da 00 00 00 83 e8 01 0f b7 fa 29 f8 01 c8 39 c6 0f 8f a0 08 00 00 48 8b 45 20 48 8d 14 92 bf 01 00 00 00 4c 8d 3c d0 <49> 89 5f 08 8b 43 70 66 41 89 7f 14 41 89 47 10 f6 83 82 00 00 00
      [  310.174918] RSP: 0018:ffffbb5f0082caa0 EFLAGS: 00010293
      [  310.180137] RAX: 0000000000000000 RBX: ffff92345471a6e8 RCX: 0000000000000200
      [  310.187259] RDX: 0000000000000000 RSI: 000000000000000d RDI: 0000000000000001
      [  310.194385] RBP: ffff92341d249000 R08: ffff92434987fcac R09: 0000000000000001
      [  310.201509] R10: 0000000011f683b9 R11: 0000000011f50641 R12: 0000000000000008
      [  310.208631] R13: ffff923447500000 R14: 0000000000000000 R15: 0000000000000000
      [  310.215756] FS:  0000000000000000(0000) GS:ffff92434ee00000(0000) knlGS:0000000000000000
      [  310.223835] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  310.229572] CR2: 0000000000000008 CR3: 0000000fbc210004 CR4: 0000000000770ee0
      [  310.236696] PKRU: 55555554
      [  310.239399] Call Trace:
      [  310.241844]  <IRQ>
      [  310.243855]  ? dst_alloc+0x5b/0xb0
      [  310.247260]  dev_hard_start_xmit+0x9e/0x1f0
      [  310.251439]  sch_direct_xmit+0xa0/0x370
      [  310.255276]  __qdisc_run+0x13e/0x580
      [  310.258848]  __dev_queue_xmit+0x431/0xd00
      [  310.262851]  ? selinux_ip_postroute+0x147/0x3f0
      [  310.267377]  ip_finish_output2+0x26c/0x540
      
      Fixes: aa626da9 ("iavf: Detach device during reset task")
      Cc: Jacob Keller <jacob.e.keller@intel.com>
      Cc: Patryk Piotrowski <patryk.piotrowski@intel.com>
      Cc: SlawomirX Laba <slawomirx.laba@intel.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • netfilter: nf_tables: do not set up extensions for end interval · 33c7aba0
      Pablo Neira Ayuso authored
      Elements with the end interval flag set do not store extensions. However,
      the global set definition currently sets up the timeout and stateful
      expression for end interval elements.
      
      This leads to skipping end interval elements from the set->ops->walk()
      path as the expired check bogusly reports true.
      
      Moreover, do not set up stateful expressions for elements with end
      interval flag set on since this is never used.
      
      Fixes: 65038428 ("netfilter: nf_tables: allow to specify stateful expression in set definition")
      Fixes: 8d8540c4 ("netfilter: nft_set_rbtree: add timeout support")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: conntrack: Fix data-races around ct mark · 52d1aa8b
      Daniel Xu authored
      nf_conn:mark can be read from and written to in parallel. Use
      READ_ONCE()/WRITE_ONCE() for reads and writes to prevent unwanted
      compiler optimizations.
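
A userspace approximation of the idea is sketched below. The `DEMO_READ_ONCE()` and `DEMO_WRITE_ONCE()` macros are simplified stand-ins written for this example (the kernel's own definitions are more elaborate), `struct demo_ct` and `demo_set_mark()` are hypothetical, and `__typeof__` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins: the volatile access tells the compiler to perform
 * exactly one load or store here, without merging, repeating or tearing
 * it at the source level. */
#define DEMO_READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

struct demo_ct {
    uint32_t mark;
};

/* Update only the bits selected by mask. The old value is loaded once into
 * a local, so every later use sees the same snapshot even if another CPU
 * changes ct->mark in the meantime. */
static uint32_t demo_set_mark(struct demo_ct *ct, uint32_t value, uint32_t mask)
{
    uint32_t old, updated;

    assert(ct != NULL);
    old = DEMO_READ_ONCE(ct->mark);
    updated = (old & ~mask) | (value & mask);
    if (updated != old)
        DEMO_WRITE_ONCE(ct->mark, updated);
    return updated;
}
```

This does not make the read-modify-write atomic; it only keeps the compiler from re-reading or splitting the individual accesses, which matches the scope the commit message states.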
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • net: pch_gbe: fix potential memleak in pch_gbe_tx_queue() · 2360f9b8
      Wang Hai authored
      In pch_gbe_xmit_frame(), NETDEV_TX_OK is returned whether
      pch_gbe_tx_queue() sends data successfully or not, so pch_gbe_tx_queue()
      needs to free the skb before returning. But pch_gbe_tx_queue() returns without
      freeing the skb if dma_map_single() fails. Add dev_kfree_skb_any()
      to fix it.
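
The ownership rule behind this fix can be shown with a small userspace sketch. Everything here is hypothetical (`struct demo_buf`, `demo_buf_alloc()`, `demo_buf_free()`, `demo_tx_queue()`); the `mapping_fails` parameter merely simulates a DMA mapping error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_buf {
    size_t len;
};

static int demo_live_bufs;  /* outstanding allocations, used as a leak check */

static struct demo_buf *demo_buf_alloc(size_t len)
{
    struct demo_buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->len = len;
    demo_live_bufs++;
    return b;
}

static void demo_buf_free(struct demo_buf *b)
{
    free(b);
    demo_live_bufs--;
}

/* The queue function owns buf. Because its caller reports success upward
 * regardless of the outcome, every exit path here must release the buffer. */
static bool demo_tx_queue(struct demo_buf *buf, bool mapping_fails)
{
    assert(buf != NULL);
    if (mapping_fails) {
        demo_buf_free(buf);  /* the free that the error path was missing */
        return false;
    }
    /* Success path: pretend the hardware completed the send, then release. */
    demo_buf_free(buf);
    return true;
}
```

After a simulated mapping failure `demo_live_bufs` drops back to zero, which is the property the added dev_kfree_skb_any() call is meant to restore.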
      
      Fixes: 77555ee7 ("net: Add Gigabit Ethernet driver of Topcliff PCH")
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfc/nci: fix race with opening and closing · 0ad6bded
      Lin Ma authored
      Previously we leveraged NCI_UNREG and the lock inside nci_close_device to
      prevent the race condition between opening a device and closing a
      device. However, a problem remains: a failed opening command
      erases the NCI_UNREG flag and allows another opening command to
      bypass the status check.

      This fix corrects that by making sure the NCI_UNREG flag is preserved.
      
      Reported-by: syzbot+43475bf3cfbd6e41f5b7@syzkaller.appspotmail.com
      Fixes: 48b71a9e ("NFC: add NCI_UNREG flag to eliminate the race")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: disallow C45 transactions on the BASE-TX MDIO bus · 24deec6b
      Vladimir Oltean authored
      You'd think people know that the internal 100BASE-TX PHY on the SJA1110
      responds only to clause 22 MDIO transactions, but they don't :)
      
      When a clause 45 transaction is attempted, sja1105_base_tx_mdio_read()
      and sja1105_base_tx_mdio_write() don't expect "reg" to contain bit 30
      set (MII_ADDR_C45) and pack this value into the SPI transaction buffer.
      
      But the field in the SPI buffer has a width smaller than 30 bits, so we
      see this confusing message from the packing() API rather than a proper
      rejection of C45 transactions:
      
      Call trace:
       dump_stack+0x1c/0x38
       sja1105_pack+0xbc/0xc0 [sja1105]
       sja1105_xfer+0x114/0x2b0 [sja1105]
       sja1105_xfer_u32+0x44/0xf4 [sja1105]
       sja1105_base_tx_mdio_read+0x44/0x7c [sja1105]
       mdiobus_read+0x44/0x80
       get_phy_c45_ids+0x70/0x234
       get_phy_device+0x68/0x15c
       fwnode_mdiobus_register_phy+0x74/0x240
       of_mdiobus_register+0x13c/0x380
       sja1105_mdiobus_register+0x368/0x490 [sja1105]
       sja1105_setup+0x94/0x119c [sja1105]
      Cannot store 401d2405 inside bits 24-4 (would truncate)
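
A "proper rejection" of clause 45 requests might look like the sketch below. It is an assumption-laden illustration: `DEMO_ADDR_C45` mirrors the bit 30 flag mentioned in the text, while `demo_base_tx_mdio_read()`, the choice of -EOPNOTSUPP and the placeholder return value on the clause 22 path are all invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit 30 marks a clause 45 access, as described in the commit message. */
#define DEMO_ADDR_C45 (UINT32_C(1) << 30)

/* Hypothetical read hook: returns a register value (>= 0) or a negative
 * errno. Clause 45 requests are refused up front instead of being packed
 * into a field that is too narrow to hold them. */
static int demo_base_tx_mdio_read(int phy, uint32_t reg)
{
    assert(phy >= 0);
    if (reg & DEMO_ADDR_C45)
        return -EOPNOTSUPP;     /* assumed error code for this sketch */

    /* Placeholder for the real bus access: echo the 5-bit register number. */
    return (int)(reg & 0x1f);
}
```

Passing the value from the log above, 0x401d2405, takes the early-return path because bit 30 is set.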
      
      Fixes: 5a8f0974 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix race between conn bundle lookup and bundle removal [ZDI-CAN-15975] · 3bcd6c7e
      David Howells authored
      After rxrpc_unbundle_conn() has removed a connection from a bundle, it
      checks to see if there are any conns with available channels and, if not,
      removes and attempts to destroy the bundle.
      
      Whilst it does check after grabbing client_bundles_lock that there are no
      connections attached, this races with rxrpc_look_up_bundle(), which
      retrieves the bundle without attaching a connection; the connection is
      attached later.
      
      There is therefore a window in which the bundle can get destroyed before we
      manage to attach a new connection to it.
      
      Fix this by adding an "active" counter to struct rxrpc_bundle:
      
       (1) rxrpc_connect_call() obtains an active count by prepping/looking up a
           bundle and ditches it before returning.
      
       (2) If, during rxrpc_connect_call(), a connection is added to the bundle,
           this obtains an active count, which is held until the connection is
           discarded.
      
       (3) rxrpc_deactivate_bundle() is created to drop an active count on a
           bundle and destroy it when the active count reaches 0.  The active
           count is checked inside client_bundles_lock() to prevent a race with
           rxrpc_look_up_bundle().
      
       (4) rxrpc_unbundle_conn() then calls rxrpc_deactivate_bundle().
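
The four steps above can be sketched with a mutex-protected counter. This is a simplified userspace model, not the rxrpc code: `struct demo_bundle`, `demo_bundles_lock`, `demo_bundle_activate()` and `demo_bundle_deactivate()` are hypothetical, and a pthread mutex stands in for whatever locking primitive the kernel code uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct demo_bundle {
    int active;      /* in-flight connect calls plus attached connections */
    bool destroyed;
};

static pthread_mutex_t demo_bundles_lock = PTHREAD_MUTEX_INITIALIZER;

/* Taken by a lookup and again when a connection is attached. Holding a
 * count keeps the bundle alive across the gap between those two events. */
static void demo_bundle_activate(struct demo_bundle *b)
{
    pthread_mutex_lock(&demo_bundles_lock);
    b->active++;
    pthread_mutex_unlock(&demo_bundles_lock);
}

/* Drop one count and destroy the bundle when it reaches zero. The check
 * runs under the same lock as the lookup, so the two cannot interleave.
 * Returns true if this call destroyed the bundle. */
static bool demo_bundle_deactivate(struct demo_bundle *b)
{
    bool last;

    pthread_mutex_lock(&demo_bundles_lock);
    assert(b->active > 0);
    last = (--b->active == 0);
    if (last)
        b->destroyed = true;
    pthread_mutex_unlock(&demo_bundles_lock);
    return last;
}
```

A connect call that has looked up the bundle holds one count, so a concurrent unbundle of the last connection cannot destroy it until that call drops its count.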
      
      Fixes: 245500d8 ("rxrpc: Rewrite the client connection manager")
      Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-15975
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: zdi-disclosures@trendmicro.com
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: fix missing xdp_dummy · 302e57f8
      Wang Yufen authored
      After commit afef88e6 ("selftests/bpf: Store BPF object files with
      .bpf.o extension"), we should use xdp_dummy.bpf.o instead of xdp_dummy.o.

      In addition, use the BPF_FILE variable to hold the BPF object file name,
      which makes it easier to identify and modify.
      
      Fixes: afef88e6 ("selftests/bpf: Store BPF object files with .bpf.o extension")
      Signed-off-by: Wang Yufen <wangyufen@huawei.com>
      Cc: Daniel Müller <deso@posteo.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipvlan: hold lower dev to avoid possible use-after-free · 40b9d1ab
      Mahesh Bandewar authored
      Recently syzkaller discovered the issue of a lower device
      disappearing (NETDEV_UNREGISTER) while a virtual device (like
      macvlan) still has it as a lower device. It is just
      a matter of time before a similar discovery is made for the IPvlan
      device setup, so fix it preemptively. While at it,
      add a refcount tracker.
      
      Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: neigh: decrement the family specific qlen · 8207f253
      Thomas Zeitlhofer authored
      Commit 0ff4eb3d ("neighbour: make proxy_queue.qlen limit
      per-device") introduced the length counter qlen in struct neigh_parms.
      There are separate neigh_parms instances for IPv4/ARP and IPv6/ND, and
      while the family specific qlen is incremented in pneigh_enqueue(), the
      mentioned commit always decrements the IPv4/ARP specific qlen,
      regardless of the currently processed family, in pneigh_queue_purge()
      and neigh_proxy_process().
      
      As a result, with IPv6/ND, the family specific qlen is only incremented
      (and never decremented) until it exceeds PROXY_QLEN, and then, according
      to the check in pneigh_enqueue(), neighbor solicitations are not
      answered anymore. As an example, this is noted when using the
      subnet-router anycast address to access a Linux router. After a certain
      amount of time (in the observed case, qlen exceeded PROXY_QLEN after two
      days), the Linux router stops answering neighbor solicitations for its
      subnet-router anycast address and effectively becomes unreachable.
      
      Another result with IPv6/ND is that the IPv4/ARP specific qlen is
      decremented more often than incremented. This leads to negative qlen
      values, as a signed integer has been used for the length counter qlen,
      and potentially to an integer overflow.
      
      Fix this by introducing the helper function neigh_parms_qlen_dec(),
      which decrements the family specific qlen. Thereby, make use of the
      existing helper function neigh_get_dev_parms_rcu(), whose definition
      therefore needs to be placed earlier in neighbour.c. Take the family
      member from struct neigh_table to determine the currently processed
      family and appropriately call neigh_parms_qlen_dec() from
      pneigh_queue_purge() and neigh_proxy_process().
      
      Additionally, use an unsigned integer for the length counter qlen.
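
The per-family bookkeeping can be illustrated with the following sketch. The names (`enum demo_family`, `struct demo_parms`, `demo_tbl_parms`, `demo_qlen_inc()`, `demo_qlen_dec()`) are hypothetical, and the guard against decrementing below zero is an extra safety assumption added for the example rather than something the commit message describes.

```c
#include <assert.h>

enum demo_family { DEMO_INET = 0, DEMO_INET6 = 1, DEMO_FAMILY_MAX };

struct demo_parms {
    unsigned int qlen;  /* unsigned, as the commit suggests for the counter */
};

/* One parms instance per family, as with IPv4/ARP and IPv6/ND in the text. */
static struct demo_parms demo_tbl_parms[DEMO_FAMILY_MAX];

static void demo_qlen_inc(enum demo_family f)
{
    assert(f < DEMO_FAMILY_MAX);
    demo_tbl_parms[f].qlen++;
}

/* Decrement the counter that belongs to the family being processed,
 * rather than always touching the IPv4 one. */
static void demo_qlen_dec(enum demo_family f)
{
    assert(f < DEMO_FAMILY_MAX);
    if (demo_tbl_parms[f].qlen > 0)  /* assumed guard, see lead-in */
        demo_tbl_parms[f].qlen--;
}
```

With this shape, enqueueing and dequeueing IPv6 entries changes only the IPv6 counter, so it cannot creep up toward the queue limit while the IPv4 counter drifts the other way.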
      
      Fixes: 0ff4eb3d ("neighbour: make proxy_queue.qlen limit per-device")
      Signed-off-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
      Signed-off-by: David S. Miller <davem@davemloft.net>