1. 03 Oct, 2018 2 commits
    • Sean Tranchetti's avatar
      net: qualcomm: rmnet: Skip processing loopback packets · a07f388e
      Sean Tranchetti authored
      RMNET RX handler was processing invalid packets that were
      originally sent on the real device and were looped back via
      dev_loopback_xmit(). This was detected using syzkaller.
      
      Fixes: ceed73a2 ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
      Signed-off-by: default avatarSean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: default avatarSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a07f388e
    • Florian Fainelli's avatar
      net: systemport: Fix wake-up interrupt race during resume · 45ec3185
      Florian Fainelli authored
      The AON_PM_L2 is normally used to trigger and identify the source of a
      wake-up event. Since the RX_SYS clock is no longer turned off, we also
      have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and
      that interrupt remains active up until the magic packet detector is
      disabled which happens much later during the driver resumption.
      
      The race happens if we have a CPU that is entering the SYSTEMPORT
      INTRL2_0 handler during resume, and another CPU has managed to clear the
      wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we
      have the first CPU stuck in the interrupt handler with an interrupt
      cause that has been cleared under its feet, and so we keep returning
      IRQ_NONE and we never make any progress.
      
      This was not a problem before because we would always turn off the
      RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned
      off as well, thus not latching the interrupt.
      
      The fix is to make sure we do not enable either the MPD or
      BRCM_TAG_MATCH interrupts since those are redundant with what the
      AON_PM_L2 interrupt controller already processes and they would cause
      such a race to occur.
      
      Fixes: bb9051a2 ("net: systemport: Add support for WAKE_FILTER")
      Fixes: 83e82f4c ("net: systemport: add Wake-on-LAN support")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      45ec3185
  2. 02 Oct, 2018 19 commits
    • David S. Miller's avatar
      Merge tag 'wireless-drivers-for-davem-2018-10-01' of... · 11bde899
      David S. Miller authored
      Merge tag 'wireless-drivers-for-davem-2018-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 4.19
      
      First, and also hopefully the last, set of fixes for 4.19. All small
      but still important fixes
      
      mt76x0
      
      * fix a bug when a virtual interface is removed multiple times
      
      b43
      
      * fix DMA error related regression with proprietary firmware
      
      iwlwifi
      
      * fix an oops which was a regression in v4.19-rc1
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      11bde899
    • Eric Dumazet's avatar
      rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096 · 0e1d6eca
      Eric Dumazet authored
      We have an impressive number of syzkaller bugs that are linked
      to the fact that syzbot was able to create a networking device
      with millions of TX (or RX) queues.
      
      Let's limit the number of RX/TX queues to 4096, this really should
      cover all known cases.
      
      A separate patch will add various cond_resched() in the loops
      handling sysfs entries at device creation and dismantle.
      
      Tested:
      
      lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap
      RTNETLINK answers: Invalid argument
      
      lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap
      
      real	0m0.180s
      user	0m0.000s
      sys	0m0.107s
      
      Fixes: 76ff5cc9 ("rtnl: allow to specify number of rx and tx queues on device creation")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0e1d6eca
    • Mahesh Bandewar's avatar
      bonding: fix warning message · 0f3b914c
      Mahesh Bandewar authored
      RX queue config for bonding master could be different from its slave
      device(s). With the commit 6a9e461f ("bonding: pass link-local
      packets to bonding master also."), the packet is reinjected into stack
      with skb->dev as bonding master. This potentially triggers the
      message:
      
         "bondX received packet on queue Y, but number of RX queues is Z"
      
      whenever the queue that packet is received on is higher than the
      numrxqueues on bonding master (Y > Z).
      
      Fixes: 6a9e461f ("bonding: pass link-local packets to bonding master also.")
      Reported-by: default avatarJohn Sperbeck <jsperbeck@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarMahesh Bandewar <maheshb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0f3b914c
    • Eric Dumazet's avatar
      inet: make sure to grab rcu_read_lock before using ireq->ireq_opt · 2ab2ddd3
      Eric Dumazet authored
      Timer handlers do not imply rcu_read_lock(), so my recent fix
      triggered a LOCKDEP warning when SYNACK is retransmit.
      
      Lets add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt
      usages instead of guessing what is done by callers, since it is
      not worth the pain.
      
      Get rid of ireq_opt_deref() helper since it hides the logic
      without real benefit, since it is now a standard rcu_dereference().
      
      Fixes: 1ad98e9d ("tcp/dccp: fix lockdep issue when SYN is backlogged")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2ab2ddd3
    • Jakub Kicinski's avatar
      nfp: avoid soft lockups under control message storm · ff58e2df
      Jakub Kicinski authored
      When FW floods the driver with control messages try to exit the cmsg
      processing loop every now and then to avoid soft lockups.  Cmsg
      processing is generally very lightweight so 512 seems like a reasonable
      budget, which should not be exceeded under normal conditions.
      
      Fixes: 77ece8d5 ("nfp: add control vNIC datapath")
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Tested-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ff58e2df
    • Maciej W. Rozycki's avatar
      declance: Fix continuation with the adapter identification message · fe3a83af
      Maciej W. Rozycki authored
      Fix a commit 4bcc595c ("printk: reinstate KERN_CONT for printing
      continuation lines") regression with the `declance' driver, which caused
      the adapter identification message to be split between two lines, e.g.:
      
      declance.c: v0.011 by Linux MIPS DECstation task force
      tc6: PMAD-AA
      , addr = 08:00:2b:1b:2a:6a, irq = 14
      tc6: registered as eth0.
      
      Address that properly, by printing identification with a single call,
      making the messages now look like:
      
      declance.c: v0.011 by Linux MIPS DECstation task force
      tc6: PMAD-AA, addr = 08:00:2b:1b:2a:6a, irq = 14
      tc6: registered as eth0.
      Signed-off-by: default avatarMaciej W. Rozycki <macro@linux-mips.org>
      Fixes: 4bcc595c ("printk: reinstate KERN_CONT for printing continuation lines")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fe3a83af
    • Rickard x Andersson's avatar
      net: fec: fix rare tx timeout · 657ade07
      Rickard x Andersson authored
      During certain heavy network loads TX could time out
      with TX ring dump.
      TX is sometimes never restarted after reaching
      "tx_stop_threshold" because function "fec_enet_tx_queue"
      only tests the first queue.
      
      In addition the TX timeout callback function failed to
      recover because it also operated only on the first queue.
      Signed-off-by: default avatarRickard x Andersson <rickaran@axis.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      657ade07
    • Heiner Kallweit's avatar
      r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO · ad5f97fa
      Heiner Kallweit authored
      Some of the chip-specific hw_start functions set bit TXCFG_AUTO_FIFO
      in register TxConfig. The original patch changed the order of some
      calls resulting in these changes being overwritten by
      rtl_set_tx_config_registers() in rtl_hw_start(). This eventually
      resulted in network stalls especially under high load.
      
      Analyzing the chip-specific hw_start functions all chip version from
      34, with the exception of version 39, need this bit set.
      This patch moves setting this bit to rtl_set_tx_config_registers().
      
      Fixes: 4fd48c4a ("r8169: move common initializations to tp->hw_start")
      Reported-by: default avatarOrtwin Glück <odi@odi.ch>
      Reported-by: default avatarDavid Arendt <admin@prnet.org>
      Root-caused-by: default avatarMaciej S. Szmigiero <mail@maciej.szmigiero.name>
      Tested-by: default avatarTony Atkinson <tatkinson@linux.com>
      Tested-by: default avatarDavid Arendt <admin@prnet.org>
      Tested-by: default avatarOrtwin Glück <odi@odi.ch>
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ad5f97fa
    • David S. Miller's avatar
      Merge branch 'tun-races' · 2547496e
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      tun: address two syzbot reports
      
      Small changes addressing races discovered by syzbot.
      
      First patch is a cleanup.
      Second patch moves a mutex init sooner.
      Third patch makes sure each tfile gets its own napi enable flags.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2547496e
    • Eric Dumazet's avatar
      tun: napi flags belong to tfile · af3fb24e
      Eric Dumazet authored
      Since tun->flags might be shared by multiple tfile structures,
      it is better to make sure tun_get_user() is using the flags
      for the current tfile.
      
      Presence of the READ_ONCE() in tun_napi_frags_enabled() gave a hint
      of what could happen, but we need something stronger to please
      syzbot.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 13647 Comm: syz-executor5 Not tainted 4.19.0-rc5+ #59
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427
      Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4
      RSP: 0018:ffff8801c400f410 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325
      RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0
      RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000
      R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358
      R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004
      FS:  00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       napi_gro_frags+0x3f4/0xc90 net/core/dev.c:5715
       tun_get_user+0x31d5/0x42a0 drivers/net/tun.c:1922
       tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1967
       call_write_iter include/linux/fs.h:1808 [inline]
       new_sync_write fs/read_write.c:474 [inline]
       __vfs_write+0x6b8/0x9f0 fs/read_write.c:487
       vfs_write+0x1fc/0x560 fs/read_write.c:549
       ksys_write+0x101/0x260 fs/read_write.c:598
       __do_sys_write fs/read_write.c:610 [inline]
       __se_sys_write fs/read_write.c:607 [inline]
       __x64_sys_write+0x73/0xb0 fs/read_write.c:607
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457579
      Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fe003614c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457579
      RDX: 0000000000000012 RSI: 0000000020000000 RDI: 000000000000000a
      RBP: 000000000072c040 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe0036156d4
      R13: 00000000004c5574 R14: 00000000004d8e98 R15: 00000000ffffffff
      Modules linked in:
      
      RIP: 0010:dev_gro_receive+0x132/0x2720 net/core/dev.c:5427
      Code: 48 c1 ea 03 80 3c 02 00 0f 85 6e 20 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 6e 10 49 8d bd d0 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 59 20 00 00 4d 8b a5 d0 00 00 00 31 ff 41 81 e4
      RSP: 0018:ffff8801c400f410 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8618d325
      RDX: 000000000000001a RSI: ffffffff86189f97 RDI: 00000000000000d0
      RBP: ffff8801c400f608 R08: ffff8801c8fb4300 R09: 0000000000000000
      R10: ffffed0038801ed7 R11: 0000000000000003 R12: ffff8801d327d358
      R13: 0000000000000000 R14: ffff8801c16dd8c0 R15: 0000000000000004
      FS:  00007fe003615700(0000) GS:ffff8801dac00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fe1f3c43db8 CR3: 00000001bebb2000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      af3fb24e
    • Eric Dumazet's avatar
      tun: initialize napi_mutex unconditionally · c7256f57
      Eric Dumazet authored
      This is the first part to fix following syzbot report :
      
      console output: https://syzkaller.appspot.com/x/log.txt?x=145378e6400000
      kernel config:  https://syzkaller.appspot.com/x/.config?x=443816db871edd66
      dashboard link: https://syzkaller.appspot.com/bug?extid=e662df0ac1d753b57e80
      
      Following patch is fixing the race condition, but it seems safer
      to initialize this mutex at tfile creation anyway.
      
      Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: syzbot+e662df0ac1d753b57e80@syzkaller.appspotmail.com
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c7256f57
    • Eric Dumazet's avatar
      tun: remove unused parameters · 06e55add
      Eric Dumazet authored
      tun_napi_disable() and tun_napi_del() do not need
      a pointer to the tun_struct
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      06e55add
    • Dave Jones's avatar
      bond: take rcu lock in netpoll_send_skb_on_dev · 6fe94878
      Dave Jones authored
      The bonding driver lacks the rcu lock when it calls down into
      netdev_lower_get_next_private_rcu from bond_poll_controller, which
      results in a trace like:
      
      WARNING: CPU: 2 PID: 179 at net/core/dev.c:6567 netdev_lower_get_next_private_rcu+0x34/0x40
      CPU: 2 PID: 179 Comm: kworker/u16:15 Not tainted 4.19.0-rc5-backup+ #1
      Workqueue: bond0 bond_mii_monitor
      RIP: 0010:netdev_lower_get_next_private_rcu+0x34/0x40
      Code: 48 89 fb e8 fe 29 63 ff 85 c0 74 1e 48 8b 45 00 48 81 c3 c0 00 00 00 48 8b 00 48 39 d8 74 0f 48 89 45 00 48 8b 40 f8 5b 5d c3 <0f> 0b eb de 31 c0 eb f5 0f 1f 40 00 0f 1f 44 00 00 48 8>
      RSP: 0018:ffffc9000087fa68 EFLAGS: 00010046
      RAX: 0000000000000000 RBX: ffff880429614560 RCX: 0000000000000000
      RDX: 0000000000000001 RSI: 00000000ffffffff RDI: ffffffffa184ada0
      RBP: ffffc9000087fa80 R08: 0000000000000001 R09: 0000000000000000
      R10: ffffc9000087f9f0 R11: ffff880429798040 R12: ffff8804289d5980
      R13: ffffffffa1511f60 R14: 00000000000000c8 R15: 00000000ffffffff
      FS:  0000000000000000(0000) GS:ffff88042f880000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f4b78fce180 CR3: 000000018180f006 CR4: 00000000001606e0
      Call Trace:
       bond_poll_controller+0x52/0x170
       netpoll_poll_dev+0x79/0x290
       netpoll_send_skb_on_dev+0x158/0x2c0
       netpoll_send_udp+0x2d5/0x430
       write_ext_msg+0x1e0/0x210
       console_unlock+0x3c4/0x630
       vprintk_emit+0xfa/0x2f0
       printk+0x52/0x6e
       ? __netdev_printk+0x12b/0x220
       netdev_info+0x64/0x80
       ? bond_3ad_set_carrier+0xe9/0x180
       bond_select_active_slave+0x1fc/0x310
       bond_mii_monitor+0x709/0x9b0
       process_one_work+0x221/0x5e0
       worker_thread+0x4f/0x3b0
       kthread+0x100/0x140
       ? process_one_work+0x5e0/0x5e0
       ? kthread_delayed_work_timer_fn+0x90/0x90
       ret_from_fork+0x24/0x30
      
      We're also doing rcu dereferences a layer up in netpoll_send_skb_on_dev
      before we call down into netpoll_poll_dev, so just take the lock there.
      Suggested-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6fe94878
    • David Ahern's avatar
      rtnetlink: Fail dump if target netnsid is invalid · 893626d6
      David Ahern authored
      Link dumps can return results from a target namespace. If the namespace id
      is invalid, then the dump request should fail if get_target_net fails
      rather than continuing with a dump of the current namespace.
      
      Fixes: 79e1ad14 ("rtnetlink: use netnsid to query interface")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      893626d6
    • Flavio Leitner's avatar
      Revert "openvswitch: Fix template leak in error cases." · 7f6d6558
      Flavio Leitner authored
      This reverts commit 90c7afc9.
      
      When the commit was merged, the code used nf_ct_put() to free
      the entry, but later on commit 76644232 ("openvswitch: Free
      tmpl with tmpl_free.") replaced that with nf_ct_tmpl_free which
      is a more appropriate. Now the original problem is removed.
      
      Then 44d6e2f2 ("net: Replace NF_CT_ASSERT() with WARN_ON().")
      replaced a debug assert with a WARN_ON() which is trigged now.
      Signed-off-by: default avatarFlavio Leitner <fbl@redhat.com>
      Acked-by: default avatarJoe Stringer <joe@ovn.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7f6d6558
    • David S. Miller's avatar
      Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 92d7c74b
      David S. Miller authored
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth 2018-09-27
      
      Here's one more Bluetooth fix for 4.19, fixing the handling of an
      attempt to unpair a device while pairing is in progress.
      
      Let me know if there are any issues pulling. Thanks.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      92d7c74b
    • LUU Duc Canh's avatar
      tipc: ignore STATE_MSG on wrong link session · d949cfed
      LUU Duc Canh authored
      The initial session number when a link is created is based on a random
      value, taken from struct tipc_net->random. It is then incremented for
      each link reset to avoid mixing protocol messages from different link
      sessions.
      
      However, when a bearer is reset all its links are deleted, and will
      later be re-created using the same random value as the first time.
      This means that if the link never went down between creation and
      deletion we will still sometimes have two subsequent sessions with
      the same session number. In virtual environments with potentially
      long transmission times this has turned out to be a real problem.
      
      We now fix this by randomizing the session number each time a link
      is created.
      
      With a session number size of 16 bits this gives a risk of session
      collision of 1/64k. To reduce this further, we also introduce a sanity
      check on the very first STATE message arriving at a link. If this has
      an acknowledge value differing from 0, which is logically impossible,
      we ignore the message. The final risk for session collision is hence
      reduced to 1/4G, which should be sufficient.
      Signed-off-by: default avatarLUU Duc Canh <canh.d.luu@dektech.com.au>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d949cfed
    • Dan Carpenter's avatar
      net: sched: act_ipt: check for underflow in __tcf_ipt_init() · aeadd93f
      Dan Carpenter authored
      If "td->u.target_size" is larger than sizeof(struct xt_entry_target) we
      return -EINVAL.  But we don't check whether it's smaller than
      sizeof(struct xt_entry_target) and that could lead to an out of bounds
      read.
      
      Fixes: 7ba699c6 ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aeadd93f
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · ee0b6f48
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2018-10-01
      
      1) Validate address prefix lengths in the xfrm selector,
         otherwise we may hit undefined behaviour in the
         address matching functions if the prefix is too
         big for the given address family.
      
      2) Fix skb leak on local message size errors.
         From Thadeu Lima de Souza Cascardo.
      
      3) We currently reset the transport header back to the network
         header after a transport mode transformation is applied. This
         leads to an incorrect transport header when multiple transport
         mode transformations are applied. Reset the transport header
         only after all transformations are already applied to fix this.
         From Sowmini Varadhan.
      
      4) We only support one offloaded xfrm, so reset crypto_done after
         the first transformation in xfrm_input(). Otherwise we may call
         the wrong input method for subsequent transformations.
         From Sowmini Varadhan.
      
      5) Fix NULL pointer dereference when skb_dst_force clears the dst_entry.
         skb_dst_force does not really force a dst refcount anymore, it might
         clear it instead. xfrm code did not expect this, add a check to not
         dereference skb_dst() if it was cleared by skb_dst_force.
      
      6) Validate xfrm template mode, otherwise we can get a stack-out-of-bounds
         read in xfrm_state_find. From Sean Tranchetti.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ee0b6f48
  3. 01 Oct, 2018 2 commits
    • Eric Dumazet's avatar
      tcp/dccp: fix lockdep issue when SYN is backlogged · 1ad98e9d
      Eric Dumazet authored
      In normal SYN processing, packets are handled without listener
      lock and in RCU protected ingress path.
      
      But syzkaller is known to be able to trick us and SYN
      packets might be processed in process context, after being
      queued into socket backlog.
      
      In commit 06f877d6 ("tcp/dccp: fix other lockdep splats
      accessing ireq_opt") I made a very stupid fix, that happened
      to work mostly because of the regular path being RCU protected.
      
      Really the thing protecting ireq->ireq_opt is RCU read lock,
      and the pseudo request refcnt is not relevant.
      
      This patch extends what I did in commit 449809a6 ("tcp/dccp:
      block BH for SYN processing") by adding an extra rcu_read_{lock|unlock}
      pair in the paths that might be taken when processing SYN from
      socket backlog (thus possibly in process context)
      
      Fixes: 06f877d6 ("tcp/dccp: fix other lockdep splats accessing ireq_opt")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1ad98e9d
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · c8424ddd
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree:
      
      1) Skip ip_sabotage_in() for packet making into the VRF driver,
         otherwise packets are dropped, from David Ahern.
      
      2) Clang compilation warning uncovering typo in the
         nft_validate_register_store() call from nft_osf, from Stefan Agner.
      
      3) Double sizeof netlink message length calculations in ctnetlink,
         from zhong jiang.
      
      4) Missing rb_erase() on batch full in rbtree garbage collector,
         from Taehee Yoo.
      
      5) Calm down compilation warning in nf_hook(), from Florian Westphal.
      
      6) Missing check for non-null sk in xt_socket before validating
         netns procedence, from Flavio Leitner.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c8424ddd
  4. 29 Sep, 2018 14 commits
  5. 28 Sep, 2018 3 commits
    • David S. Miller's avatar
      Merge branch 'netpoll-second-round-of-fixes' · f13d1b48
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      netpoll: second round of fixes.
      
      As diagnosed by Song Liu, ndo_poll_controller() can
      be very dangerous on loaded hosts, since the cpu
      calling ndo_poll_controller() might steal all NAPI
      contexts (for all RX/TX queues of the NIC).
      
      This capture, showing one ksoftirqd eating all cycles
      can last for unlimited amount of time, since one
      cpu is generally not able to drain all the queues under load.
      
      It seems that all networking drivers that do use NAPI
      for their TX completions, should not provide a ndo_poll_controller() :
      
      Most NAPI drivers have netpoll support already handled
      in core networking stack, since netpoll_poll_dev()
      uses poll_napi(dev) to iterate through registered
      NAPI contexts for a device.
      
      First patch is a fix in poll_one_napi().
      
      Then following patches take care of ten drivers.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f13d1b48
    • Eric Dumazet's avatar
      ibmvnic: remove ndo_poll_controller · 0c3b9d1b
      Eric Dumazet authored
      As diagnosed by Song Liu, ndo_poll_controller() can
      be very dangerous on loaded hosts, since the cpu
      calling ndo_poll_controller() might steal all NAPI
      contexts (for all RX/TX queues of the NIC). This capture
      can last for unlimited amount of time, since one
      cpu is generally not able to drain all the queues under load.
      
      ibmvnic uses NAPI for TX completions, so we better let core
      networking stack call the napi->poll() to avoid the capture.
      
      ibmvnic_netpoll_controller() was completely wrong anyway,
      as it was scheduling NAPI to service RX queues (instead of TX),
      so I doubt netpoll ever worked on this driver.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Cc: John Allen <jallen@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0c3b9d1b
    • Eric Dumazet's avatar
      sfc-falcon: remove ndo_poll_controller · a4f570be
      Eric Dumazet authored
      As diagnosed by Song Liu, ndo_poll_controller() can
      be very dangerous on loaded hosts, since the cpu
      calling ndo_poll_controller() might steal all NAPI
      contexts (for all RX/TX queues of the NIC). This capture
      can last for unlimited amount of time, since one
      cpu is generally not able to drain all the queues under load.
      
      sfc-falcon uses NAPI for TX completions, so we better let core
      networking stack call the napi->poll() to avoid the capture.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
      Cc: Edward Cree <ecree@solarflare.com>
      Cc: Bert Kenward <bkenward@solarflare.com>
      Acked-By: default avatarBert Kenward <bkenward@solarflare.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a4f570be