1. 28 Apr, 2019 4 commits
    • Kalle Valo's avatar
      Merge tag 'iwlwifi-for-kalle-2019-04-28' of... · 5c403533
      Kalle Valo authored
      Merge tag 'iwlwifi-for-kalle-2019-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
      
      Fourth batch of patches intended for v5.1
      
      * Fix an oops when we receive a packet with bogus lengths;
      * Fix a bug that prevented 5350 devices from working;
      * Fix a small merge damage from the previous series;
      5c403533
    • Luca Coelho's avatar
      iwlwifi: mvm: fix merge damage in iwl_mvm_vif_dbgfs_register() · d156e67d
      Luca Coelho authored
      When I rebased Greg's patch, I accidentally left the old if block that
      was already there.  Remove it.
      
      Fixes: 154d4899 ("iwlwifi: mvm: properly check debugfs dentry before using it")
      Signed-off-by: default avatarLuca Coelho <luciano.coelho@intel.com>
      d156e67d
    • Emmanuel Grumbach's avatar
      iwlwifi: fix driver operation for 5350 · 5c9adef9
      Emmanuel Grumbach authored
      We introduced a bug that prevented this old device from
      working. The driver would simply not be able to complete
      the INIT flow while spewing this warning:
      
       CSR addresses aren't configured
       WARNING: CPU: 0 PID: 819 at drivers/net/wireless/intel/iwlwifi/pcie/drv.c:917
       iwl_pci_probe+0x160/0x1e0 [iwlwifi]
      
      Cc: stable@vger.kernel.org # v4.18+
      Fixes: a8cbb46f ("iwlwifi: allow different csr flags for different device families")
      Signed-off-by: default avatarEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Fixes: c8f1b51e ("iwlwifi: allow different csr flags for different device families")
      Signed-off-by: default avatarLuca Coelho <luciano.coelho@intel.com>
      5c9adef9
    • Luca Coelho's avatar
      iwlwifi: mvm: check for length correctness in iwl_mvm_create_skb() · de1887c0
      Luca Coelho authored
      We don't check for the validity of the lengths in the packet received
      from the firmware.  If the MPDU length received in the rx descriptor
      is too short to contain the header length and the crypt length
      together, we may end up trying to copy a negative number of bytes
      (headlen - hdrlen < 0) which will underflow and cause us to try to
      copy a huge amount of data.  This causes oopses such as this one:
      
      BUG: unable to handle kernel paging request at ffff896be2970000
      PGD 5e201067 P4D 5e201067 PUD 5e205067 PMD 16110d063 PTE 8000000162970161
      Oops: 0003 [#1] PREEMPT SMP NOPTI
      CPU: 2 PID: 1824 Comm: irq/134-iwlwifi Not tainted 4.19.33-04308-geea41cf4930f #1
      Hardware name: [...]
      RIP: 0010:memcpy_erms+0x6/0x10
      Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3
       0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
      RSP: 0018:ffffa4630196fc60 EFLAGS: 00010287
      RAX: ffff896be2924618 RBX: ffff896bc8ecc600 RCX: 00000000fffb4610
      RDX: 00000000fffffff8 RSI: ffff896a835e2a38 RDI: ffff896be2970000
      RBP: ffffa4630196fd30 R08: ffff896bc8ecc600 R09: ffff896a83597000
      R10: ffff896bd6998400 R11: 000000000200407f R12: ffff896a83597050
      R13: 00000000fffffff8 R14: 0000000000000010 R15: ffff896a83597038
      FS:  0000000000000000(0000) GS:ffff896be8280000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffff896be2970000 CR3: 000000005dc12002 CR4: 00000000003606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       iwl_mvm_rx_mpdu_mq+0xb51/0x121b [iwlmvm]
       iwl_pcie_rx_handle+0x58c/0xa89 [iwlwifi]
       iwl_pcie_irq_rx_msix_handler+0xd9/0x12a [iwlwifi]
       irq_thread_fn+0x24/0x49
       irq_thread+0xb0/0x122
       kthread+0x138/0x140
       ret_from_fork+0x1f/0x40
      
      Fix that by checking the lengths for correctness and trigger a warning
      to show that we have received wrong data.
      Signed-off-by: default avatarLuca Coelho <luciano.coelho@intel.com>
      de1887c0
  2. 25 Apr, 2019 1 commit
  3. 24 Apr, 2019 1 commit
  4. 18 Apr, 2019 5 commits
  5. 16 Apr, 2019 1 commit
  6. 15 Apr, 2019 5 commits
  7. 14 Apr, 2019 8 commits
    • David S. Miller's avatar
      Merge tag 'mlx5-fixes-2019-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 73248801
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2019-04-09
      
      This series provides some fixes to mlx5 driver.
      
      I've cc'ed some of the checksum fixes to Eric Dumazet and i would like to get
      his feedback before you pull.
      
      For -stable v4.19
      ('net/mlx5: FPGA, tls, idr remove on flow delete')
      ('net/mlx5: FPGA, tls, hold rcu read lock a bit longer')
      
      For -stable v4.20
      ('net/mlx5e: Rx, Check ip headers sanity')
      ('Revert "net/mlx5e: Enable reporting checksum unnecessary also for L3 packets"')
      ('net/mlx5e: Rx, Fixup skb checksum for packets with tail padding')
      
      For -stable v5.0
      ('net/mlx5e: Switch to Toeplitz RSS hash by default')
      ('net/mlx5e: Protect against non-uplink representor for encap')
      ('net/mlx5e: XDP, Avoid checksum complete when XDP prog is loaded')
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      73248801
    • Eric Dumazet's avatar
      rtnetlink: fix rtnl_valid_stats_req() nlmsg_len check · 69f23a09
      Eric Dumazet authored
      Jakub forgot to either use nlmsg_len() or nlmsg_msg_size(),
      allowing KMSAN to detect a possible uninit-value in rtnl_stats_get
      
      BUG: KMSAN: uninit-value in rtnl_stats_get+0x6d9/0x11d0 net/core/rtnetlink.c:4997
      CPU: 0 PID: 10428 Comm: syz-executor034 Not tainted 5.1.0-rc2+ #24
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x173/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x131/0x2a0 mm/kmsan/kmsan.c:619
       __msan_warning+0x7a/0xf0 mm/kmsan/kmsan_instr.c:310
       rtnl_stats_get+0x6d9/0x11d0 net/core/rtnetlink.c:4997
       rtnetlink_rcv_msg+0x115b/0x1550 net/core/rtnetlink.c:5192
       netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2485
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5210
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1925
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg net/socket.c:632 [inline]
       ___sys_sendmsg+0xdb3/0x1220 net/socket.c:2137
       __sys_sendmsg net/socket.c:2175 [inline]
       __do_sys_sendmsg net/socket.c:2184 [inline]
       __se_sys_sendmsg+0x305/0x460 net/socket.c:2182
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2182
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Fixes: 51bc860d ("rtnetlink: stats: validate attributes in get as well as dumps")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69f23a09
    • David S. Miller's avatar
      Merge branch 'qed-doorbell-overflow-recovery' · a6b16d8d
      David S. Miller authored
      Denis Bolotin says:
      
      ====================
      qed: Fix the Doorbell Overflow Recovery mechanism
      
      This patch series fixes and improves the doorbell recovery mechanism.
      The main goals of this series are to fix missing attentions from the
      doorbells block (DORQ) or not handling them properly, and execute the
      recovery from periodic handler instead of the attention handler.
      
      Please consider applying the series to net.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a6b16d8d
    • Denis Bolotin's avatar
      qed: Fix the DORQ's attentions handling · 0d72c2ac
      Denis Bolotin authored
      Separate the overflow handling from the hardware interrupt status analysis.
      The interrupt status is a single register and is common for all PFs. The
      first PF reading the register is not necessarily the one who overflowed.
      All PFs must check their overflow status on every attention.
      In this change we clear the sticky indication in the attention handler to
      allow doorbells to be processed again as soon as possible, but running
      the doorbell recovery is scheduled for the periodic handler to reduce the
      time spent in the attention handler.
      Checking the need for DORQ flush was changed to "db_bar_no_edpm" because
      qed_edpm_enabled()'s result could change dynamically and might have
      prevented a needed flush.
      Signed-off-by: default avatarDenis Bolotin <dbolotin@marvell.com>
      Signed-off-by: default avatarMichal Kalderon <mkalderon@marvell.com>
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0d72c2ac
    • Denis Bolotin's avatar
      qed: Fix missing DORQ attentions · d4476b8a
      Denis Bolotin authored
      When the DORQ (doorbell block) is overflowed, all PFs get attentions at the
      same time. If one PF finished handling the attention before another PF even
      started, the second PF might miss the DORQ's attention bit and not handle
      the attention at all.
      If the DORQ attention is missed and the issue is not resolved, another
      attention will not be sent, therefore each attention is treated as a
      potential DORQ attention.
      As a result, the attention callback is called more frequently so the debug
      print was moved to reduce its quantity.
      The number of periodic doorbell recovery handler schedules was reduced
      because it was the previous way to mitigating the missed attention issue.
      Signed-off-by: default avatarDenis Bolotin <dbolotin@marvell.com>
      Signed-off-by: default avatarMichal Kalderon <mkalderon@marvell.com>
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d4476b8a
    • Denis Bolotin's avatar
      qed: Fix the doorbell address sanity check · b61b04ad
      Denis Bolotin authored
      Fix the condition which verifies that doorbell address is inside the
      doorbell bar by checking that the end of the address is within range
      as well.
      Signed-off-by: default avatarDenis Bolotin <dbolotin@marvell.com>
      Signed-off-by: default avatarMichal Kalderon <mkalderon@marvell.com>
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b61b04ad
    • Denis Bolotin's avatar
      qed: Delete redundant doorbell recovery types · 9ac6bb14
      Denis Bolotin authored
      DB_REC_DRY_RUN (running doorbell recovery without sending doorbells) is
      never used. DB_REC_ONCE (send a single doorbell from the doorbell recovery)
      is not needed anymore because by running the periodic handler we make sure
      we check the overflow status later instead.
      This patch is needed because in the next patches, the only doorbell
      recovery type being used is DB_REC_REAL_DEAL, and the fixes are much
      cleaner without this enum.
      Signed-off-by: default avatarDenis Bolotin <dbolotin@marvell.com>
      Signed-off-by: default avatarMichal Kalderon <mkalderon@marvell.com>
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9ac6bb14
    • Eric Dumazet's avatar
      ipv4: ensure rcu_read_lock() in ipv4_link_failure() · c543cb4a
      Eric Dumazet authored
      fib_compute_spec_dst() needs to be called under rcu protection.
      
      syzbot reported :
      
      WARNING: suspicious RCU usage
      5.1.0-rc4+ #165 Not tainted
      include/linux/inetdevice.h:220 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by swapper/0/0:
       #0: 0000000051b67925 ((&n->timer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:170 [inline]
       #0: 0000000051b67925 ((&n->timer)){+.-.}, at: call_timer_fn+0xda/0x720 kernel/time/timer.c:1315
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-rc4+ #165
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5162
       __in_dev_get_rcu include/linux/inetdevice.h:220 [inline]
       fib_compute_spec_dst+0xbbd/0x1030 net/ipv4/fib_frontend.c:294
       spec_dst_fill net/ipv4/ip_options.c:245 [inline]
       __ip_options_compile+0x15a7/0x1a10 net/ipv4/ip_options.c:343
       ipv4_link_failure+0x172/0x400 net/ipv4/route.c:1195
       dst_link_failure include/net/dst.h:427 [inline]
       arp_error_report+0xd1/0x1c0 net/ipv4/arp.c:297
       neigh_invalidate+0x24b/0x570 net/core/neighbour.c:995
       neigh_timer_handler+0xc35/0xf30 net/core/neighbour.c:1081
       call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
       expire_timers kernel/time/timer.c:1362 [inline]
       __run_timers kernel/time/timer.c:1681 [inline]
       __run_timers kernel/time/timer.c:1649 [inline]
       run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
       __do_softirq+0x266/0x95a kernel/softirq.c:293
       invoke_softirq kernel/softirq.c:374 [inline]
       irq_exit+0x180/0x1d0 kernel/softirq.c:414
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
      
      Fixes: ed0de45a ("ipv4: recompile ip options in ipv4_link_failure")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c543cb4a
  8. 13 Apr, 2019 1 commit
  9. 12 Apr, 2019 14 commits
    • David S. Miller's avatar
      Merge branch 'rxrpc-fixes' · 9e550f01
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Fixes
      
      Here is a collection of fixes for rxrpc:
      
       (1) rxrpc_error_report() needs to call sock_error() to clear the error
           code from the UDP transport socket, lest it be unexpectedly revisited
           on the next kernel_sendmsg() call.  This has been causing all sorts of
           weird effects in AFS as the effects have typically been felt by the
           wrong RxRPC call.
      
       (2) Allow a kernel user of AF_RXRPC to easily detect if an rxrpc call has
           completed.
      
       (3) Allow errors incurred by attempting to transmit data through the UDP
           socket to get back up the stack to AFS.
      
       (4) Make AFS use (2) to abort the synchronous-mode call waiting loop if
           the rxrpc-level call completed.
      
       (5) Add a missing tracepoint case for tracing abort reception.
      
       (6) Fix detection and handling of out-of-order ACKs.
      
      ====================
      Tested-by: default avatarJonathan Billings <jsbillin@umich.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9e550f01
    • Jeffrey Altman's avatar
      rxrpc: Fix detection of out of order acks · 1a2391c3
      Jeffrey Altman authored
      The rxrpc packet serial number cannot be safely used to compute out of
      order ack packets for several reasons:
      
       1. The allocation of serial numbers cannot be assumed to imply the order
          by which acks are populated and transmitted.  In some rxrpc
          implementations, delayed acks and ping acks are transmitted
          asynchronously to the receipt of data packets and so may be transmitted
          out of order.  As a result, they can race with idle acks.
      
       2. Serial numbers are allocated by the rxrpc connection and not the call
          and as such may wrap independently if multiple channels are in use.
      
      In any case, what matters is whether the ack packet provides new
      information relating to the bounds of the window (the firstPacket and
      previousPacket in the ACK data).
      
      Fix this by discarding packets that appear to wind back the window bounds
      rather than on serial number procession.
      
      Fixes: 298bc15b ("rxrpc: Only take the rwind and mtu values from latest ACK")
      Signed-off-by: default avatarJeffrey Altman <jaltman@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1a2391c3
    • David Howells's avatar
      rxrpc: Trace received connection aborts · 39ce6755
      David Howells authored
      Trace received calls that are aborted due to a connection abort, typically
      because of authentication failure.  Without this, connection aborts don't
      show up in the trace log.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39ce6755
    • Marc Dionne's avatar
      afs: Check for rxrpc call completion in wait loop · f7f1dd31
      Marc Dionne authored
      Check the state of the rxrpc call backing an afs call in each iteration of
      the call wait loop in case the rxrpc call has already been terminated at
      the rxrpc layer.
      
      Interrupt the wait loop and mark the afs call as complete if the rxrpc
      layer call is complete.
      
      There were cases where rxrpc errors were not passed up to afs, which could
      result in this loop waiting forever for an afs call to transition to
      AFS_CALL_COMPLETE while the rx call was already complete.
      Signed-off-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f7f1dd31
    • Marc Dionne's avatar
      rxrpc: Allow errors to be returned from rxrpc_queue_packet() · 8e8715aa
      Marc Dionne authored
      Change rxrpc_queue_packet()'s signature so that it can return any error
      code it may encounter when trying to send the packet.
      
      This allows the caller to eventually do something in case of error - though
      it should be noted that the packet has been queued and a resend is
      scheduled.
      Signed-off-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8e8715aa
    • Marc Dionne's avatar
      rxrpc: Make rxrpc_kernel_check_life() indicate if call completed · 4611da30
      Marc Dionne authored
      Make rxrpc_kernel_check_life() pass back the life counter through the
      argument list and return true if the call has not yet completed.
      Suggested-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4611da30
    • Marc Dionne's avatar
      rxrpc: Clear socket error · 56d282d9
      Marc Dionne authored
      When an ICMP or ICMPV6 error is received, the error will be attached
      to the socket (sk_err) and the report function will get called.
      Clear any pending error here by calling sock_error().
      
      This would cause the following attempt to use the socket to fail with
      the error code stored by the ICMP error, resulting in unexpected errors
      with various side effects depending on the context.
      Signed-off-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarJonathan Billings <jsbillin@umich.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      56d282d9
    • Colin Ian King's avatar
      qede: fix write to free'd pointer error and double free of ptp · 1dc2b3d6
      Colin Ian King authored
      The err2 error return path calls qede_ptp_disable that cleans up
      on an error and frees ptp. After this, the free'd ptp is dereferenced
      when ptp->clock is set to NULL and the code falls-through to error
      path err1 that frees ptp again.
      
      Fix this by calling qede_ptp_disable and exiting via an error
      return path that does not set ptp->clock or kfree ptp.
      
      Addresses-Coverity: ("Write to pointer after free")
      Fixes: 03574497 ("qede: Add support for PTP resource locking.")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1dc2b3d6
    • Colin Ian King's avatar
      vxge: fix return of a free'd memblock on a failed dma mapping · 0a2c34f1
      Colin Ian King authored
      Currently if a pci dma mapping failure is detected a free'd
      memblock address is returned rather than a NULL (that indicates
      an error). Fix this by ensuring NULL is returned on this error case.
      
      Addresses-Coverity: ("Use after free")
      Fixes: 528f7272 ("vxge: code cleanup and reorganization")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0a2c34f1
    • Kalle Valo's avatar
      Merge tag 'iwlwifi-for-kalle-2019-04-03' of... · 832bc250
      Kalle Valo authored
      Merge tag 'iwlwifi-for-kalle-2019-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
      
      Second batch of iwlwifi fixes intended for v5.1
      
      * fix for a potential deadlock in the TX path;
      * a fix for offloaded rate-control;
      * support new PCI HW IDs which use a new FW;
      832bc250
    • Stanislaw Gruszka's avatar
      mt76x02: avoid status_list.lock and sta->rate_ctrl_lock dependency · bafdf85d
      Stanislaw Gruszka authored
      Move ieee80211_tx_status_ext() outside of status_list lock section
      in order to avoid locking dependency and possible deadlock reposed by
      LOCKDEP in below warning.
      
      Also do mt76_tx_status_lock() just before it's needed.
      
      [  440.224832] WARNING: possible circular locking dependency detected
      [  440.224833] 5.1.0-rc2+ #22 Not tainted
      [  440.224834] ------------------------------------------------------
      [  440.224835] kworker/u16:28/2362 is trying to acquire lock:
      [  440.224836] 0000000089b8cacf (&(&q->lock)->rlock#2){+.-.}, at: mt76_wake_tx_queue+0x4c/0xb0 [mt76]
      [  440.224842]
                     but task is already holding lock:
      [  440.224842] 000000002cfedc59 (&(&sta->lock)->rlock){+.-.}, at: ieee80211_stop_tx_ba_cb+0x32/0x1f0 [mac80211]
      [  440.224863]
                     which lock already depends on the new lock.
      
      [  440.224863]
                     the existing dependency chain (in reverse order) is:
      [  440.224864]
                     -> #3 (&(&sta->lock)->rlock){+.-.}:
      [  440.224869]        _raw_spin_lock_bh+0x34/0x40
      [  440.224880]        ieee80211_start_tx_ba_session+0xe4/0x3d0 [mac80211]
      [  440.224894]        minstrel_ht_get_rate+0x45c/0x510 [mac80211]
      [  440.224906]        rate_control_get_rate+0xc1/0x140 [mac80211]
      [  440.224918]        ieee80211_tx_h_rate_ctrl+0x195/0x3c0 [mac80211]
      [  440.224930]        ieee80211_xmit_fast+0x26d/0xa50 [mac80211]
      [  440.224942]        __ieee80211_subif_start_xmit+0xfc/0x310 [mac80211]
      [  440.224954]        ieee80211_subif_start_xmit+0x38/0x390 [mac80211]
      [  440.224956]        dev_hard_start_xmit+0xb8/0x300
      [  440.224957]        __dev_queue_xmit+0x7d4/0xbb0
      [  440.224968]        ip6_finish_output2+0x246/0x860 [ipv6]
      [  440.224978]        mld_sendpack+0x1bd/0x360 [ipv6]
      [  440.224987]        mld_ifc_timer_expire+0x1a4/0x2f0 [ipv6]
      [  440.224989]        call_timer_fn+0x89/0x2a0
      [  440.224990]        run_timer_softirq+0x1bd/0x4d0
      [  440.224992]        __do_softirq+0xdb/0x47c
      [  440.224994]        irq_exit+0xfa/0x100
      [  440.224996]        smp_apic_timer_interrupt+0x9a/0x220
      [  440.224997]        apic_timer_interrupt+0xf/0x20
      [  440.224999]        cpuidle_enter_state+0xc1/0x470
      [  440.225000]        do_idle+0x21a/0x260
      [  440.225001]        cpu_startup_entry+0x19/0x20
      [  440.225004]        start_secondary+0x135/0x170
      [  440.225006]        secondary_startup_64+0xa4/0xb0
      [  440.225007]
                     -> #2 (&(&sta->rate_ctrl_lock)->rlock){+.-.}:
      [  440.225009]        _raw_spin_lock_bh+0x34/0x40
      [  440.225022]        rate_control_tx_status+0x4f/0xb0 [mac80211]
      [  440.225031]        ieee80211_tx_status_ext+0x142/0x1a0 [mac80211]
      [  440.225035]        mt76x02_send_tx_status+0x2e4/0x340 [mt76x02_lib]
      [  440.225037]        mt76x02_tx_status_data+0x31/0x40 [mt76x02_lib]
      [  440.225040]        mt76u_tx_status_data+0x51/0xa0 [mt76_usb]
      [  440.225042]        process_one_work+0x237/0x5d0
      [  440.225043]        worker_thread+0x3c/0x390
      [  440.225045]        kthread+0x11d/0x140
      [  440.225046]        ret_from_fork+0x3a/0x50
      [  440.225047]
                     -> #1 (&(&list->lock)->rlock#8){+.-.}:
      [  440.225049]        _raw_spin_lock_bh+0x34/0x40
      [  440.225052]        mt76_tx_status_skb_add+0x51/0x100 [mt76]
      [  440.225054]        mt76x02u_tx_prepare_skb+0xbd/0x116 [mt76x02_usb]
      [  440.225056]        mt76u_tx_queue_skb+0x5f/0x180 [mt76_usb]
      [  440.225058]        mt76_tx+0x93/0x190 [mt76]
      [  440.225070]        ieee80211_tx_frags+0x148/0x210 [mac80211]
      [  440.225081]        __ieee80211_tx+0x75/0x1b0 [mac80211]
      [  440.225092]        ieee80211_tx+0xde/0x110 [mac80211]
      [  440.225105]        __ieee80211_tx_skb_tid_band+0x72/0x90 [mac80211]
      [  440.225122]        ieee80211_send_auth+0x1f3/0x360 [mac80211]
      [  440.225141]        ieee80211_auth.cold.40+0x6c/0x100 [mac80211]
      [  440.225156]        ieee80211_mgd_auth.cold.50+0x132/0x15f [mac80211]
      [  440.225171]        cfg80211_mlme_auth+0x149/0x360 [cfg80211]
      [  440.225181]        nl80211_authenticate+0x273/0x2e0 [cfg80211]
      [  440.225183]        genl_family_rcv_msg+0x196/0x3a0
      [  440.225184]        genl_rcv_msg+0x47/0x8e
      [  440.225185]        netlink_rcv_skb+0x3a/0xf0
      [  440.225187]        genl_rcv+0x24/0x40
      [  440.225188]        netlink_unicast+0x16d/0x210
      [  440.225189]        netlink_sendmsg+0x204/0x3b0
      [  440.225191]        sock_sendmsg+0x36/0x40
      [  440.225193]        ___sys_sendmsg+0x259/0x2b0
      [  440.225194]        __sys_sendmsg+0x47/0x80
      [  440.225196]        do_syscall_64+0x60/0x1f0
      [  440.225197]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  440.225198]
                     -> #0 (&(&q->lock)->rlock#2){+.-.}:
      [  440.225200]        lock_acquire+0xb9/0x1a0
      [  440.225202]        _raw_spin_lock_bh+0x34/0x40
      [  440.225204]        mt76_wake_tx_queue+0x4c/0xb0 [mt76]
      [  440.225215]        ieee80211_agg_start_txq+0xe8/0x2b0 [mac80211]
      [  440.225225]        ieee80211_stop_tx_ba_cb+0xb8/0x1f0 [mac80211]
      [  440.225235]        ieee80211_ba_session_work+0x1c1/0x2f0 [mac80211]
      [  440.225236]        process_one_work+0x237/0x5d0
      [  440.225237]        worker_thread+0x3c/0x390
      [  440.225239]        kthread+0x11d/0x140
      [  440.225240]        ret_from_fork+0x3a/0x50
      [  440.225240]
                     other info that might help us debug this:
      
      [  440.225241] Chain exists of:
                       &(&q->lock)->rlock#2 --> &(&sta->rate_ctrl_lock)->rlock --> &(&sta->lock)->rlock
      
      [  440.225243]  Possible unsafe locking scenario:
      
      [  440.225244]        CPU0                    CPU1
      [  440.225244]        ----                    ----
      [  440.225245]   lock(&(&sta->lock)->rlock);
      [  440.225245]                                lock(&(&sta->rate_ctrl_lock)->rlock);
      [  440.225246]                                lock(&(&sta->lock)->rlock);
      [  440.225247]   lock(&(&q->lock)->rlock#2);
      [  440.225248]
                      *** DEADLOCK ***
      
      [  440.225249] 5 locks held by kworker/u16:28/2362:
      [  440.225250]  #0: 0000000048fcd291 ((wq_completion)phy0){+.+.}, at: process_one_work+0x1b5/0x5d0
      [  440.225252]  #1: 00000000f1c6828f ((work_completion)(&sta->ampdu_mlme.work)){+.+.}, at: process_one_work+0x1b5/0x5d0
      [  440.225254]  #2: 00000000433d2b2c (&sta->ampdu_mlme.mtx){+.+.}, at: ieee80211_ba_session_work+0x5c/0x2f0 [mac80211]
      [  440.225265]  #3: 000000002cfedc59 (&(&sta->lock)->rlock){+.-.}, at: ieee80211_stop_tx_ba_cb+0x32/0x1f0 [mac80211]
      [  440.225276]  #4: 000000009d7b9a44 (rcu_read_lock){....}, at: ieee80211_agg_start_txq+0x33/0x2b0 [mac80211]
      [  440.225286]
                     stack backtrace:
      [  440.225288] CPU: 2 PID: 2362 Comm: kworker/u16:28 Not tainted 5.1.0-rc2+ #22
      [  440.225289] Hardware name: LENOVO 20KGS23S0P/20KGS23S0P, BIOS N23ET55W (1.30 ) 08/31/2018
      [  440.225300] Workqueue: phy0 ieee80211_ba_session_work [mac80211]
      [  440.225301] Call Trace:
      [  440.225304]  dump_stack+0x85/0xc0
      [  440.225306]  print_circular_bug.isra.38.cold.58+0x15c/0x195
      [  440.225307]  check_prev_add.constprop.48+0x5f0/0xc00
      [  440.225309]  ? check_prev_add.constprop.48+0x39d/0xc00
      [  440.225311]  ? __lock_acquire+0x41d/0x1100
      [  440.225312]  __lock_acquire+0xd98/0x1100
      [  440.225313]  ? __lock_acquire+0x41d/0x1100
      [  440.225315]  lock_acquire+0xb9/0x1a0
      [  440.225317]  ? mt76_wake_tx_queue+0x4c/0xb0 [mt76]
      [  440.225319]  _raw_spin_lock_bh+0x34/0x40
      [  440.225321]  ? mt76_wake_tx_queue+0x4c/0xb0 [mt76]
      [  440.225323]  mt76_wake_tx_queue+0x4c/0xb0 [mt76]
      [  440.225334]  ieee80211_agg_start_txq+0xe8/0x2b0 [mac80211]
      [  440.225344]  ieee80211_stop_tx_ba_cb+0xb8/0x1f0 [mac80211]
      [  440.225354]  ieee80211_ba_session_work+0x1c1/0x2f0 [mac80211]
      [  440.225356]  process_one_work+0x237/0x5d0
      [  440.225358]  worker_thread+0x3c/0x390
      [  440.225359]  ? wq_calc_node_cpumask+0x70/0x70
      [  440.225360]  kthread+0x11d/0x140
      [  440.225362]  ? kthread_create_on_node+0x40/0x40
      [  440.225363]  ret_from_fork+0x3a/0x50
      
      Cc: stable@vger.kernel.org
      Fixes: 88046b2c ("mt76: add support for reporting tx status with skb")
      Signed-off-by: default avatarStanislaw Gruszka <sgruszka@redhat.com>
      Acked-by: default avatarFelix Fietkau <nbd@nbd.name>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      bafdf85d
    • Vijayakumar Durai's avatar
      rt2x00: do not increment sequence number while re-transmitting · 746ba11f
      Vijayakumar Durai authored
      Currently rt2x00 devices retransmit the management frames with
      incremented sequence number if hardware is assigning the sequence.
      
      This is HW bug fixed already for non-QOS data frames, but it should
      be fixed for management frames except beacon.
      
      Without fix retransmitted frames have wrong SN:
      
       AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1648, FN=0, Flags=........C Frame is not being retransmitted 1648 1
       AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1649, FN=0, Flags=....R...C Frame is being retransmitted 1649 1
       AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1650, FN=0, Flags=....R...C Frame is being retransmitted 1650 1
      
      With the fix SN stays correctly the same:
      
       88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=........C
       88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C
       88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVijayakumar Durai <vijayakumar.durai1@vivint.com>
      [sgruszka: simplify code, change comments and changelog]
      Signed-off-by: default avatarStanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      746ba11f
    • Felix Fietkau's avatar
      mt76: mt7603: send BAR after powersave wakeup · 9dc27bcb
      Felix Fietkau authored
      Now that the sequence number allocation is fixed, we can finally send a BAR
      at powersave wakeup time to refresh the receiver side reorder window
      Signed-off-by: default avatarFelix Fietkau <nbd@nbd.name>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      9dc27bcb
    • Felix Fietkau's avatar
      mt76: mt7603: fix sequence number assignment · aa3cb24b
      Felix Fietkau authored
      If the MT_TXD3_SN_VALID flag is not set in the tx descriptor, the hardware
      assigns the sequence number. However, the rest of the code assumes that the
      sequence number specified in the 802.11 header gets transmitted.
      This was causing issues with the aggregation setup, which worked for the
      initial one (where the sequence numbers were still close), but not for
      further teardown/re-establishing of sessions.
      
      Additionally, the overwrite of the TID sequence number in WTBL2 was resetting
      the hardware assigned sequence numbers, causing them to drift further apart.
      
      Fix this by using the software assigned sequence numbers
      Signed-off-by: default avatarFelix Fietkau <nbd@nbd.name>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      aa3cb24b