1. 06 Jul, 2020 8 commits
  2. 05 Jul, 2020 14 commits
    • Merge branch 'net-rmnet-fix-interface-leak-for-rmnet-module' · 0f57a1e5
      David S. Miller authored
      Taehee Yoo says:
      
      ====================
      net: rmnet: fix interface leak for rmnet module
      
      There are two problems in the rmnet module that cause a lower
      interface to leak.
      The symptom is the same in both cases, but the underlying problems
      are different.
      This patchset fixes both.
      
      1. Do not allow two different modes.
      A lower interface of rmnet has one of two modes, VND or BRIDGE.
      One interface can have only one mode.
      But the current rmnet has no code to prevent one lower interface
      from having two modes.
      So, an interface leak occurs.
      
      2. Do not allow adding multiple bridge interfaces.
      rmnet can have only two bridge interfaces.
      If an attempt is made to attach an additional bridge interface,
      rmnet should deny it.
      But there is no code to do that.
      So, an interface leak occurs.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: do not allow to add multiple bridge interfaces · 2fb2799a
      Taehee Yoo authored
      rmnet can have only two bridge interfaces.
      One of them is the link interface and the other is added by
      the master operation.
      The rmnet interface shouldn't allow additional bridge interfaces
      to be added by the master operation.
      But there is no code to deny additional interfaces.
      So, an interface leak occurs.
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add dummy1 type dummy
          ip link add dummy2 type dummy
          ip link add rmnet0 link dummy0 type rmnet mux_id 1
          ip link set dummy1 master rmnet0
          ip link set dummy2 master rmnet0
          ip link del rmnet0
      
      In the above test commands, dummy0 was attached to rmnet0 in VND mode.
      Then, dummy1 was attached to rmnet0 in BRIDGE mode.
      At this point, the mode of dummy0 is switched from VND to BRIDGE automatically.
      Then, dummy2 is attached to rmnet0 in BRIDGE mode.
      At this point, rmnet0 should deny this operation.
      But rmnet0 doesn't deny it.
      So the splat below occurs when the rmnet0 interface is deleted.
      
      Splat looks like:
      [  186.684787][    C2] WARNING: CPU: 2 PID: 1009 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
      [  186.684788][    C2] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_x
      [  186.684805][    C2] CPU: 2 PID: 1009 Comm: ip Not tainted 5.8.0-rc1+ #621
      [  186.684807][    C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  186.684808][    C2] RIP: 0010:rollback_registered_many+0x986/0xcf0
      [  186.684811][    C2] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 5
      [  186.684812][    C2] RSP: 0018:ffff8880cd9472e0 EFLAGS: 00010287
      [  186.684815][    C2] RAX: ffff8880cc56da58 RBX: ffff8880ab21c000 RCX: ffffffff9329d323
      [  186.684816][    C2] RDX: 1ffffffff2be6410 RSI: 0000000000000008 RDI: ffffffff95f32080
      [  186.684818][    C2] RBP: dffffc0000000000 R08: fffffbfff2be6411 R09: fffffbfff2be6411
      [  186.684819][    C2] R10: ffffffff95f32087 R11: 0000000000000001 R12: ffff8880cd947480
      [  186.684820][    C2] R13: ffff8880ab21c0b8 R14: ffff8880cd947400 R15: ffff8880cdf10640
      [  186.684822][    C2] FS:  00007f00843890c0(0000) GS:ffff8880d4e00000(0000) knlGS:0000000000000000
      [  186.684823][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  186.684825][    C2] CR2: 000055b8ab1077b8 CR3: 00000000ab612006 CR4: 00000000000606e0
      [  186.684826][    C2] Call Trace:
      [  186.684827][    C2]  ? lockdep_hardirqs_on_prepare+0x379/0x540
      [  186.684829][    C2]  ? netif_set_real_num_tx_queues+0x780/0x780
      [  186.684830][    C2]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
      [  186.684831][    C2]  ? __kasan_slab_free+0x126/0x150
      [  186.684832][    C2]  ? kfree+0xdc/0x320
      [  186.684834][    C2]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
      [  186.684835][    C2]  unregister_netdevice_many.part.135+0x13/0x1b0
      [  186.684836][    C2]  rtnl_delete_link+0xbc/0x100
      [ ... ]
      [  238.440071][ T1009] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1
      
      Fixes: 037f9cdf ("net: rmnet: use upper/lower device infrastructure")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
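The rule described in this commit can be sketched as a small userspace model. This is an illustration only, not the driver's code: the class `RmnetDev` and its methods are hypothetical names, and the model tracks just the link interface plus one interface added through the master operation.

```python
class RmnetDev:
    """Toy model of an rmnet device: one link interface plus at most one
    interface added through the master operation (hypothetical names)."""

    def __init__(self, name, link):
        self.name = name
        self.link = link      # lower interface given at creation time
        self.bridge = None    # lower interface added via "set ... master"

    def add_slave(self, dev):
        # Refuse a further bridge interface instead of silently
        # overwriting the reference to the one already attached.
        if self.bridge is not None:
            raise OSError(f"{self.name}: bridge interface already attached")
        self.bridge = dev

    def lowers(self):
        return [d for d in (self.link, self.bridge) if d is not None]


rmnet0 = RmnetDev("rmnet0", link="dummy0")
rmnet0.add_slave("dummy1")          # allowed: second bridge interface
try:
    rmnet0.add_slave("dummy2")      # denied: would be a third interface
    denied = False
except OSError:
    denied = True
print(denied, rmnet0.lowers())
```

In the model, deleting `rmnet0` only has to release the interfaces returned by `lowers()`, so nothing is left holding a reference.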
    • net: rmnet: fix lower interface leak · 2a762e9e
      Taehee Yoo authored
      There are two types of lower interface in rmnet, VND and BRIDGE.
      Each lower interface can have only one type, either VND or BRIDGE.
      But there is a case in which a lower interface ends up with both types.
      Due to this unexpected behavior, a lower interface leak occurs.
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add dummy1 type dummy
          ip link add rmnet0 link dummy0 type rmnet mux_id 1
          ip link set dummy1 master rmnet0
          ip link add rmnet1 link dummy1 type rmnet mux_id 2
          ip link del rmnet0
      
      dummy1 was attached as a BRIDGE interface of rmnet0.
      Then, it was also attached as a VND interface of rmnet1.
      This is unexpected behavior and there is no code to handle this case.
      So the splat below occurs when the rmnet0 interface is deleted.
      
      Splat looks like:
      [   53.254112][    C1] WARNING: CPU: 1 PID: 1192 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
      [   53.254117][    C1] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nfx
      [   53.254182][    C1] CPU: 1 PID: 1192 Comm: ip Not tainted 5.8.0-rc1+ #620
      [   53.254188][    C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   53.254192][    C1] RIP: 0010:rollback_registered_many+0x986/0xcf0
      [   53.254200][    C1] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 0f 0b e5
      [   53.254205][    C1] RSP: 0018:ffff888050a5f2e0 EFLAGS: 00010287
      [   53.254214][    C1] RAX: ffff88805756d658 RBX: ffff88804d99c000 RCX: ffffffff8329d323
      [   53.254219][    C1] RDX: 1ffffffff0be6410 RSI: 0000000000000008 RDI: ffffffff85f32080
      [   53.254223][    C1] RBP: dffffc0000000000 R08: fffffbfff0be6411 R09: fffffbfff0be6411
      [   53.254228][    C1] R10: ffffffff85f32087 R11: 0000000000000001 R12: ffff888050a5f480
      [   53.254233][    C1] R13: ffff88804d99c0b8 R14: ffff888050a5f400 R15: ffff8880548ebe40
      [   53.254238][    C1] FS:  00007f6b86b370c0(0000) GS:ffff88806c200000(0000) knlGS:0000000000000000
      [   53.254243][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   53.254248][    C1] CR2: 0000562c62438758 CR3: 000000003f600005 CR4: 00000000000606e0
      [   53.254253][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   53.254257][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   53.254261][    C1] Call Trace:
      [   53.254266][    C1]  ? lockdep_hardirqs_on_prepare+0x379/0x540
      [   53.254270][    C1]  ? netif_set_real_num_tx_queues+0x780/0x780
      [   53.254275][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
      [   53.254279][    C1]  ? __kasan_slab_free+0x126/0x150
      [   53.254283][    C1]  ? kfree+0xdc/0x320
      [   53.254288][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
      [   53.254293][    C1]  unregister_netdevice_many.part.135+0x13/0x1b0
      [   53.254297][    C1]  rtnl_delete_link+0xbc/0x100
      [   53.254301][    C1]  ? rtnl_af_register+0xc0/0xc0
      [   53.254305][    C1]  rtnl_dellink+0x2dc/0x840
      [   53.254309][    C1]  ? find_held_lock+0x39/0x1d0
      [   53.254314][    C1]  ? valid_fdb_dump_strict+0x620/0x620
      [   53.254318][    C1]  ? rtnetlink_rcv_msg+0x457/0x890
      [   53.254322][    C1]  ? lock_contended+0xd20/0xd20
      [   53.254326][    C1]  rtnetlink_rcv_msg+0x4a8/0x890
      [ ... ]
      [   73.813696][ T1192] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1
      
      Fixes: 037f9cdf ("net: rmnet: use upper/lower device infrastructure")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
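The "one lower interface, one type" constraint can be sketched as a minimal userspace model. `LowerRegistry` is a hypothetical name, and the model is deliberately simplified: it ignores the automatic VND to BRIDGE switch of the link interface and does not reflect how the driver stores its ports.

```python
VND, BRIDGE = "VND", "BRIDGE"


class LowerRegistry:
    """Tracks which type each lower interface is attached as (toy model)."""

    def __init__(self):
        self._mode = {}

    def attach(self, lower, mode):
        current = self._mode.get(lower)
        if current is not None and current != mode:
            # One lower interface, one type: reject the conflicting request.
            raise ValueError(f"{lower} is already attached as {current}")
        self._mode[lower] = mode

    def mode_of(self, lower):
        return self._mode.get(lower)


reg = LowerRegistry()
reg.attach("dummy0", VND)        # ip link add rmnet0 link dummy0 ...
reg.attach("dummy1", BRIDGE)     # ip link set dummy1 master rmnet0
try:
    reg.attach("dummy1", VND)    # ip link add rmnet1 link dummy1 ...
    conflict_rejected = False
except ValueError:
    conflict_rejected = True
print(conflict_rejected, reg.mode_of("dummy1"))
```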
    • hsr: fix interface leak in error path of hsr_dev_finalize() · ccfc9df1
      Taehee Yoo authored
      To release an hsr (upper) interface, its lower interfaces must be
      released first.
      Then, the hsr (upper) interface can be released safely.
      The current error path of hsr_dev_finalize() releases the hsr
      interface before releasing a lower interface.
      So, a warning occurs about the leak of lower interfaces.
      To fix this problem, the ordering of the error path of
      hsr_dev_finalize() needs to change.
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add dummy1 type dummy
          ip link add dummy2 type dummy
          ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
          ip link add hsr1 type hsr slave1 dummy2 slave2 dummy0
      
      Splat looks like:
      [  214.923127][    C2] WARNING: CPU: 2 PID: 1093 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
      [  214.923129][    C2] Modules linked in: hsr dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipx
      [  214.923154][    C2] CPU: 2 PID: 1093 Comm: ip Not tainted 5.8.0-rc2+ #623
      [  214.923156][    C2] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  214.923157][    C2] RIP: 0010:rollback_registered_many+0x986/0xcf0
      [  214.923160][    C2] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 5
      [  214.923162][    C2] RSP: 0018:ffff8880c5156f28 EFLAGS: 00010287
      [  214.923165][    C2] RAX: ffff8880d1dad458 RBX: ffff8880bd1b9000 RCX: ffffffffb929d243
      [  214.923167][    C2] RDX: 1ffffffff77e63f0 RSI: 0000000000000008 RDI: ffffffffbbf31f80
      [  214.923168][    C2] RBP: dffffc0000000000 R08: fffffbfff77e63f1 R09: fffffbfff77e63f1
      [  214.923170][    C2] R10: ffffffffbbf31f87 R11: 0000000000000001 R12: ffff8880c51570a0
      [  214.923172][    C2] R13: ffff8880bd1b90b8 R14: ffff8880c5157048 R15: ffff8880d1dacc40
      [  214.923174][    C2] FS:  00007fdd257a20c0(0000) GS:ffff8880da200000(0000) knlGS:0000000000000000
      [  214.923175][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  214.923177][    C2] CR2: 00007ffd78beb038 CR3: 00000000be544005 CR4: 00000000000606e0
      [  214.923179][    C2] Call Trace:
      [  214.923180][    C2]  ? netif_set_real_num_tx_queues+0x780/0x780
      [  214.923182][    C2]  ? dev_validate_mtu+0x140/0x140
      [  214.923183][    C2]  ? synchronize_rcu.part.79+0x85/0xd0
      [  214.923185][    C2]  ? synchronize_rcu_expedited+0xbb0/0xbb0
      [  214.923187][    C2]  rollback_registered+0xc8/0x170
      [  214.923188][    C2]  ? rollback_registered_many+0xcf0/0xcf0
      [  214.923190][    C2]  unregister_netdevice_queue+0x18b/0x240
      [  214.923191][    C2]  hsr_dev_finalize+0x56e/0x6e0 [hsr]
      [  214.923192][    C2]  hsr_newlink+0x36b/0x450 [hsr]
      [  214.923194][    C2]  ? hsr_dellink+0x70/0x70 [hsr]
      [  214.923195][    C2]  ? rtnl_create_link+0x2e4/0xb00
      [  214.923197][    C2]  ? __netlink_ns_capable+0xc3/0xf0
      [  214.923198][    C2]  __rtnl_newlink+0xbdb/0x1270
      [ ... ]
      
      Fixes: e0a4b997 ("hsr: use upper/lower device infrastructure")
      Reported-by: syzbot+7f1c020f68dab95aab59@syzkaller.appspotmail.com
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • hinic: fix sending mailbox timeout in aeq event work · 6dbb8901
      Luo bin authored
      When a mailbox message is sent from the work of an aeq event, another
      aeq event is triggered. Because the previous aeq work has not exited and
      only one work item can be executed at a time in the same workqueue, the
      mailbox sending function returns a timeout failure. We create and use
      another workqueue to fix this.
      Signed-off-by: Luo bin <luobin9@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
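The timeout can be reproduced in userspace with single-worker thread pools standing in for ordered workqueues. This is a sketch under that assumption; the function names are hypothetical and nothing here is taken from the hinic driver.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def mailbox_response():
    # Stands in for the second AEQ event that completes the mailbox send.
    return "response"


def aeq_event_work(response_queue, timeout=0.5):
    # The first AEQ work sends a mailbox message and waits for the reply,
    # which is itself delivered by work queued on response_queue.
    reply = response_queue.submit(mailbox_response)
    try:
        return reply.result(timeout=timeout)
    except FutureTimeout:
        return "timeout"


def run(shared):
    aeq_wq = ThreadPoolExecutor(max_workers=1)
    resp_wq = aeq_wq if shared else ThreadPoolExecutor(max_workers=1)
    try:
        return aeq_wq.submit(aeq_event_work, resp_wq).result()
    finally:
        aeq_wq.shutdown(wait=True)
        if resp_wq is not aeq_wq:
            resp_wq.shutdown(wait=True)


same_queue = run(shared=True)        # reply is stuck behind the waiting work
separate_queue = run(shared=False)   # reply runs on its own queue
print(same_queue, separate_queue)
```

With a shared queue the reply cannot run until the waiting work gives up, which is the timeout the commit describes; a second queue removes that dependency.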
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · c00e858d
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Use kvfree() to release vmalloc()'ed areas in ipset, from Eric Dumazet.
      
      2) UAF in nfnetlink_queue from the nf_conntrack_update() path.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Documentation-networking-eliminate-doubled-words' · 4d572545
      David S. Miller authored
      Randy Dunlap says:
      
      ====================
      Documentation: networking: eliminate doubled words
      
      Drop all duplicated words in Documentation/networking/ files.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: rxrpc: drop doubled word · e54ac95a
      Randy Dunlap authored
      Drop the doubled word "have".
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: David Howells <dhowells@redhat.com>
      Cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: ipvs-sysctl: drop doubled word · 474112d5
      Randy Dunlap authored
      Drop the doubled word "that".
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: ip-sysctl: drop doubled word · a7db3c76
      Randy Dunlap authored
      Drop the doubled word "that".
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: dsa: drop doubled word · 4f6a009c
      Randy Dunlap authored
      Drop the doubled word "in".
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: can_ucan_protocol: drop doubled words · 6d0fe3ae
      Randy Dunlap authored
      Drop the doubled words "the" and "of".
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: Wolfgang Grandegger <wg@grandegger.com>
      Cc: Marc Kleine-Budde <mkl@pengutronix.de>
      Cc: linux-can@vger.kernel.org
      Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: ax25: drop doubled word · e9909485
      Randy Dunlap authored
      Drop the doubled word "and".
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-hams@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: networking: arcnet: drop doubled word · caebecb0
      Randy Dunlap authored
      Drop the doubled word "to".
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 03 Jul, 2020 2 commits
    • sched: consistently handle layer3 header accesses in the presence of VLANs · d7bf2ebe
      Toke Høiland-Jørgensen authored
      There are a couple of places in net/sched/ that check skb->protocol and act
      on the value there. However, in the presence of VLAN tags, the value stored
      in skb->protocol can be inconsistent based on whether VLAN acceleration is
      enabled. The commit quoted in the Fixes tag below fixed the users of
      skb->protocol to use a helper that will always see the VLAN ethertype.
      
      However, most of the callers don't actually handle the VLAN ethertype, but
      expect to find the IP header type in the protocol field. This means that
      things like changing the ECN field, or parsing diffserv values, stop
      working if there's a VLAN tag, or if there are multiple nested VLAN
      tags (QinQ).
      
      To fix this, change the helper to take an argument that indicates whether
      the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
      make sure to skip all of them, so behaviour is consistent even in QinQ
      mode.
      
      To make the helper usable from the ECN code, move it to if_vlan.h instead
      of pkt_sched.h.
      
      v3:
      - Remove empty lines
      - Move vlan variable definitions inside loop in skb_protocol()
      - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
        bpf_skb_ecn_set_ce()
      
      v2:
      - Use eth_type_vlan() helper in skb_protocol()
      - Also fix code that reads skb->protocol directly
      - Change a couple of 'if/else if' statements to switch constructs to avoid
        calling the helper twice
      Reported-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
      Fixes: d8b9605d ("net: sched: fix skb->protocol use in case of accelerated vlan path")
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
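The "skip all VLAN tags" behaviour can be sketched in userspace on a raw Ethernet frame. This is an analogue of the idea, not the kernel helper: `frame_protocol` is a hypothetical function, and it only covers tags carried in the frame itself, not tags held in skb metadata by VLAN acceleration.

```python
import struct

ETH_P_IP = 0x0800
ETH_P_8021Q = 0x8100
ETH_P_8021AD = 0x88A8
VLAN_TYPES = (ETH_P_8021Q, ETH_P_8021AD)


def frame_protocol(frame: bytes, skip_vlan: bool) -> int:
    """Return the EtherType of an Ethernet frame.

    With skip_vlan=True every stacked VLAN tag is stepped over, so the
    result is the encapsulated protocol even for QinQ frames.
    """
    offset = 12                                   # after dst + src MAC
    (proto,) = struct.unpack_from("!H", frame, offset)
    if not skip_vlan:
        return proto
    while proto in VLAN_TYPES:
        offset += 4                               # TPID (2) + TCI (2)
        (proto,) = struct.unpack_from("!H", frame, offset)
    return proto


macs = bytes(12)
# Outer 802.1ad tag (VID 100), inner 802.1Q tag (VID 200), then IPv4.
qinq = macs + struct.pack("!HHHHH", ETH_P_8021AD, 100, ETH_P_8021Q, 200, ETH_P_IP)

outer = frame_protocol(qinq, skip_vlan=False)
inner = frame_protocol(qinq, skip_vlan=True)
print(hex(outer), hex(inner))
```

A caller that wants to touch the IP header (ECN, diffserv) would ask for `skip_vlan=True`; a caller that matches on the VLAN ethertype itself would not.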
    • netfilter: conntrack: refetch conntrack after nf_conntrack_update() · d005fbb8
      Pablo Neira Ayuso authored
      __nf_conntrack_update() might refresh the conntrack object that is
      attached to the skbuff, so the caller must refetch it after the update.
      Otherwise, this triggers UAF.
      
      [  633.200434] ==================================================================
      [  633.200472] BUG: KASAN: use-after-free in nf_conntrack_update+0x34e/0x770 [nf_conntrack]
      [  633.200478] Read of size 1 at addr ffff888370804c00 by task nfqnl_test/6769
      
      [  633.200487] CPU: 1 PID: 6769 Comm: nfqnl_test Not tainted 5.8.0-rc2+ #388
      [  633.200490] Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
      [  633.200491] Call Trace:
      [  633.200499]  dump_stack+0x7c/0xb0
      [  633.200526]  ? nf_conntrack_update+0x34e/0x770 [nf_conntrack]
      [  633.200532]  print_address_description.constprop.6+0x1a/0x200
      [  633.200539]  ? _raw_write_lock_irqsave+0xc0/0xc0
      [  633.200568]  ? nf_conntrack_update+0x34e/0x770 [nf_conntrack]
      [  633.200594]  ? nf_conntrack_update+0x34e/0x770 [nf_conntrack]
      [  633.200598]  kasan_report.cold.9+0x1f/0x42
      [  633.200604]  ? call_rcu+0x2c0/0x390
      [  633.200633]  ? nf_conntrack_update+0x34e/0x770 [nf_conntrack]
      [  633.200659]  nf_conntrack_update+0x34e/0x770 [nf_conntrack]
      [  633.200687]  ? nf_conntrack_find_get+0x30/0x30 [nf_conntrack]
      
      Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1436
      Fixes: ee04805f ("netfilter: conntrack: make conntrack userspace helpers work again")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  4. 02 Jul, 2020 9 commits
    • MAINTAINERS: net: macb: add Claudiu as co-maintainer · ad4e2b64
      Nicolas Ferre authored
      I would like Claudiu to become co-maintainer of the Cadence macb
      driver. He's already participating in lots of reviews and enhancements
      to this driver and knows the different versions of this controller.
      Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: microchip: set the correct number of ports · af199a1a
      Codrin Ciubotariu authored
      The number of ports is incorrectly set to the maximum available for a DSA
      switch. Even if the extra ports are not used, this causes some functions
      to be called later, like port_disable() and port_stp_state_set(). If the
      driver doesn't check the port index, it will end up modifying unknown
      registers.
      
      Fixes: b987e98e ("dsa: add DSA switch driver for Microchip KSZ9477")
      Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: md5: allow changing MD5 keys in all socket states · 1ca0fafd
      Eric Dumazet authored
      This essentially reverts commit 72123032 ("tcp: md5: reject TCP_MD5SIG
      or TCP_MD5SIG_EXT on established sockets")
      
      Mathieu reported that many vendors' BGP implementations can
      actually switch TCP MD5 on established flows.
      
      Quoting Mathieu :
         Here is a list of a few network vendors along with their behavior
         with respect to TCP MD5:
      
         - Cisco: Allows for password to be changed, but within the hold-down
           timer (~180 seconds).
         - Juniper: When password is initially set on active connection it will
           reset, but after that any subsequent password changes no network
           resets.
         - Nokia: No notes on if they flap the tcp connection or not.
         - Ericsson/RedBack: Allows for 2 password (old/new) to co-exist until
           both sides are ok with new passwords.
         - Meta-Switch: Expects the password to be set before a connection is
           attempted, but no further info on whether they reset the TCP
           connection on a change.
         - Avaya: Disable the neighbor, then set password, then re-enable.
         - Zebos: Would normally allow the change when socket connected.
      
      We can revert my prior change because commit 9424e2e7 ("tcp: md5: fix potential
      overestimation of TCP option space") removed the leak of 4 kernel bytes to
      the wire that was the main reason for my patch.
      
      While doing my investigations, I found a bug when a MD5 key is changed, leading
      to these commits that stable teams want to consider before backporting this revert :
      
       Commit 6a2febec ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
       Commit e6ced831 ("tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers")
      
      Fixes: 72123032 "tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets"
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: microchip: enable ksz9893 via i2c in the ksz9477 driver · e4b9a72d
      Helmut Grohne authored
      The KSZ9893 3-Port Gigabit Ethernet Switch can be controlled via SPI,
      I²C or MDIO (very limited and not supported by this driver). While there
      is already a compatible entry for the SPI bus, it was missing for I²C.
      Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix SO_RCVLOWAT possible hangs under high mem pressure · ba3bb0e7
      Eric Dumazet authored
      Whenever tcp_try_rmem_schedule() returns an error, we are in
      trouble and should make sure to wake up readers so that they
      can drain socket queues and eventually make room.
      
      Fixes: 03f45c88 ("tcp: avoid extra wakeups for SO_RCVLOWAT users")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
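The hang and its fix can be illustrated with a toy receive queue. The `Receiver` class, its byte counts, and the `wake_on_mem_error` flag are all hypothetical; the sketch only models the interaction between a low-water mark and a memory budget, not the TCP receive path.

```python
class Receiver:
    """Toy receive queue with a low-water mark and a memory budget."""

    def __init__(self, rcvlowat, mem_limit):
        self.rcvlowat = rcvlowat
        self.mem_limit = mem_limit
        self.queued = 0
        self.reader_woken = False

    def enqueue(self, nbytes, wake_on_mem_error):
        if self.queued + nbytes > self.mem_limit:
            # Memory could not be scheduled for this segment. Waking the
            # reader lets it drain the queue even though rcvlowat is unmet.
            if wake_on_mem_error:
                self.reader_woken = True
            return False
        self.queued += nbytes
        if self.queued >= self.rcvlowat:
            self.reader_woken = True
        return True


def simulate(wake_on_mem_error):
    # The low-water mark is above what the memory budget can ever hold.
    rx = Receiver(rcvlowat=8000, mem_limit=6000)
    for _ in range(5):
        rx.enqueue(1500, wake_on_mem_error)
    return rx.reader_woken, rx.queued


before = simulate(wake_on_mem_error=False)
after = simulate(wake_on_mem_error=True)
print(before, after)
```

Without the wakeup the reader sleeps forever with data queued below the mark; with it, the reader gets a chance to drain the queue and free memory.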
    • ip: Fix SO_MARK in RST, ACK and ICMP packets · 0da7536f
      Willem de Bruijn authored
      When no full socket is available, skbs are sent over a per-netns
      control socket. Its sk_mark is temporarily adjusted to match that
      of the real (request or timewait) socket or to reflect an incoming
      skb, so that the outgoing skb inherits this in __ip_make_skb.
      
      Introduction of the socket cookie mark field broke this. Now the
      skb is set through the cookie and cork:
      
      <caller>		# init sockc.mark from sk_mark or cmsg
      ip_append_data
        ip_setup_cork		# convert sockc.mark to cork mark
      ip_push_pending_frames
        ip_finish_skb
          __ip_make_skb	# set skb->mark to cork mark
      
      But I missed these special control sockets. Update all callers of
      __ip(6)_make_skb that were originally missed.
      
      For IPv6, the same two icmp(v6) paths are affected. The third
      case is not, as commit 92e55f41 ("tcp: don't annotate
      mark on control socket from tcp_v6_send_response()") replaced
      the ctl_sk->sk_mark with passing the mark field directly as a
      function argument. That commit predates the commit that
      introduced the bug.
      
      Fixes: c6af0c22 ("ip: support SO_MARK cmsg")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Reported-by: Martin KaFai Lau <kafai@fb.com>
      Reviewed-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: md5: do not send silly options in SYNCOOKIES · e114e1e8
      Eric Dumazet authored
      Whenever cookie_init_timestamp() has been used to encode
      ECN,SACK,WSCALE options, we can not remove the TS option in the SYNACK.
      
      Otherwise, tcp_synack_options() will still advertise options like WSCALE
      that we can not deduce later when receiving the packet from the client
      to complete 3WHS.
      
      Note that modern Linux TCP stacks won't use MD5+TS+SACK in a SYN packet,
      but we can not know for sure that all TCP stacks have the same logic.
      
      Before the fix a tcpdump would exhibit this wrong exchange :
      
      10:12:15.464591 IP C > S: Flags [S], seq 4202415601, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 456965269 ecr 0,nop,wscale 8], length 0
      10:12:15.464602 IP S > C: Flags [S.], seq 253516766, ack 4202415602, win 65535, options [nop,nop,md5 valid,mss 1400,nop,nop,sackOK,nop,wscale 8], length 0
      10:12:15.464611 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid], length 0
      10:12:15.464678 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid], length 12
      10:12:15.464685 IP S > C: Flags [.], ack 13, win 65535, options [nop,nop,md5 valid], length 0
      
      After this patch the exchange looks saner :
      
      11:59:59.882990 IP C > S: Flags [S], seq 517075944, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508483 ecr 0,nop,wscale 8], length 0
      11:59:59.883002 IP S > C: Flags [S.], seq 1902939253, ack 517075945, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508479 ecr 1751508483,nop,wscale 8], length 0
      11:59:59.883012 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 0
      11:59:59.883114 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 12
      11:59:59.883122 IP S > C: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508483], length 0
      11:59:59.883152 IP S > C: Flags [P.], seq 1:13, ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508483], length 12
      11:59:59.883170 IP C > S: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508484], length 0
      
      Of course, no SACK block will ever be added later, but nothing should break.
      Technically, we could remove the 4 nops included in MD5+TS options,
      but again some stacks could break on seeing unconventional alignment.
      
      Fixes: 4957faad ("TCPCT part 1g: Responder Cookie => Initiator")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
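Why the TS option must stay in the SYNACK can be seen from how option state is folded into the timestamp. The sketch below assumes a layout of 4 bits of window scale, one SACK bit, and one ECN bit in the low 6 bits of the timestamp value; that layout and the names `encode_ts` and `decode_ts` are assumptions for illustration and may not match every kernel version.

```python
TS_OPT_WSCALE_MASK = 0xF      # assumed: low 4 bits carry the window scale
TS_OPT_SACK = 1 << 4          # assumed: SACK permitted
TS_OPT_ECN = 1 << 5           # assumed: ECN negotiated
TSBITS = 6
TSMASK = (1 << TSBITS) - 1


def encode_ts(ts_now, wscale, sack_ok, ecn_ok):
    """Fold option state into the low bits of a 32-bit timestamp value."""
    options = wscale if wscale is not None else TS_OPT_WSCALE_MASK
    if sack_ok:
        options |= TS_OPT_SACK
    if ecn_ok:
        options |= TS_OPT_ECN
    ts = (ts_now & ~TSMASK & 0xFFFFFFFF) | options
    if ts > ts_now:
        # Never hand out a timestamp from the future: step back one slot.
        ts = ((((ts >> TSBITS) - 1) << TSBITS) | options) & 0xFFFFFFFF
    return ts


def decode_ts(tsecr):
    """Recover (wscale, sack_ok, ecn_ok) from the echoed timestamp."""
    options = tsecr & TSMASK
    wscale = options & TS_OPT_WSCALE_MASK
    return (None if wscale == TS_OPT_WSCALE_MASK else wscale,
            bool(options & TS_OPT_SACK),
            bool(options & TS_OPT_ECN))


now = 1751508483
tsval = encode_ts(now, wscale=8, sack_ok=True, ecn_ok=False)
recovered = decode_ts(tsval)      # what the server reads back from TS ecr
print(tsval, recovered)
```

If the SYNACK carries no TS option, the client never echoes `tsval`, so the server has nothing to decode when the final ACK of the handshake arrives.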
    • rds: If one path needs re-connection, check all and re-connect · 9ef845f8
      Rao Shoaib authored
      In testing with mprds enabled, Oracle Cluster nodes after reboot were
      not able to communicate with other nodes and so failed to rejoin
      the cluster. Peers with a lower IP address initiated the connection, but
      the node could not respond as it chose a different path, and it could not
      initiate a connection itself as it had a higher IP address.
      
      With this patch, when a node sends out a packet and the selected path
      is down, all other paths are also checked and any down paths are
      re-connected.
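
The new behaviour can be sketched as follows. This is a simplified Python model, not the RDS code: `paths_to_reconnect` and the boolean list of path states are hypothetical names chosen for illustration.

```python
def paths_to_reconnect(path_up, selected):
    """Return indices of paths to reconnect when sending on `selected`.

    path_up is a list of booleans, one per path; True means connected.
    """
    if path_up[selected]:
        return []  # selected path is fine, nothing to do
    # Selected path is down: sweep all paths and pick every one that is down.
    return [i for i, up in enumerate(path_up) if not up]


print(paths_to_reconnect([True, False, True, False], selected=1))
print(paths_to_reconnect([True, False, True, False], selected=0))
```
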
      Reviewed-by: Ka-cheong Poon <ka-cheong.poon@oracle.com>
      Reviewed-by: David Edmondson <david.edmondson@oracle.com>
      Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
      Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ef845f8
    • Eric Dumazet's avatar
      tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers · e6ced831
      Eric Dumazet authored
      My prior fix went a bit too far, according to Herbert and Mathieu.
      
      Since we accept that concurrent TCP MD5 lookups might see inconsistent
      keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb().
      
      Clearing all key->key[] is needed to avoid possible KMSAN reports,
      if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
      using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
      
      data_race() was added in linux-5.8 and will prevent KCSAN reports;
      it can safely be removed in stable backports if data_race() is
      not yet backported.
      
      v2: use data_race() both in tcp_md5_hash_key() and tcp_md5_do_add()
      
      Fixes: 6a2febec ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Marco Elver <elver@google.com>
      Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6ced831
  5. 01 Jul, 2020 4 commits
    • Sean Tranchetti's avatar
      genetlink: remove genl_bind · 1e82a62f
      Sean Tranchetti authored
      A potential deadlock can occur during registering or unregistering a
      new generic netlink family between the main nl_table_lock and the
      cb_lock where each thread wants the lock held by the other, as
      demonstrated below.
      
      1) Thread 1 is performing a netlink_bind() operation on a socket. As part
         of this call, it will call netlink_lock_table(), incrementing the
         nl_table_users count to 1.
      2) Thread 2 is registering (or unregistering) a genl_family via the
         genl_(un)register_family() API. The cb_lock semaphore will be taken for
         writing.
      3) Thread 1 will call genl_bind() as part of the bind operation to handle
         subscribing to GENL multicast groups at the request of the user. It will
         attempt to take the cb_lock semaphore for reading, but it will fail and
         be scheduled away, waiting for Thread 2 to finish the write.
      4) Thread 2 will call netlink_table_grab() during the (un)registration
         call. However, as Thread 1 has incremented nl_table_users, it will not
         be able to proceed, and both threads will be stuck waiting for the
         other.
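
The four steps form a cycle in a wait-for graph. A minimal sketch, assuming a simplified model in which each thread holds one resource and is blocked on one other; the dictionary layout and `find_deadlock` are illustrative and are not kernel code.

```python
# Simplified model of the scenario above: what each thread holds and what it
# is blocked on. "nl_table_users" stands for the reference taken by
# netlink_lock_table(); netlink_table_grab() waits for it to drop to zero.
holds = {"thread1": "nl_table_users", "thread2": "cb_lock"}
waits_for = {"thread1": "cb_lock", "thread2": "nl_table_users"}


def find_deadlock(holds, waits_for):
    """Return a list of threads forming a wait cycle, or None if there is none."""
    owner = {resource: thread for thread, resource in holds.items()}
    for start in waits_for:
        seen = [start]
        current = start
        while True:
            blocker = owner.get(waits_for.get(current))
            if blocker is None:
                break  # current thread is not blocked on a held resource
            if blocker == start:
                return seen  # followed the chain back to where we began
            if blocker in seen:
                break  # a cycle exists but does not include `start`
            seen.append(blocker)
            current = blocker
    return None


print(find_deadlock(holds, waits_for))

# With genl_bind() removed, thread 1 never waits on cb_lock.
print(find_deadlock(holds, {"thread2": "nl_table_users"}))
```

In the second call thread 2 still waits for thread 1, but thread 1 is not blocked, so it finishes its bind and the wait ends.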
      
      genl_bind() is a noop, unless a genl_family implements the mcast_bind()
      function to handle setting up family-specific multicast operations. Since
      no one in-tree uses this functionality as Cong pointed out, simply removing
      the genl_bind() function will remove the possibility for deadlock, as there
      is no attempt by Thread 1 above to take the cb_lock semaphore.
      
      Fixes: c380d9a7 ("genetlink: pass multicast bind/unbind to families")
      Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Johannes Berg <johannes.berg@intel.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1e82a62f
    • Luo bin's avatar
      hinic: fix passing non negative value to ERR_PTR · d3c54f7f
      Luo bin authored
      The get_dev_cap and set_resources_state functions may return a
      positive value because of a hardware failure, and a positive return
      value cannot be passed to ERR_PTR directly.
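
Why a positive value is a problem follows from how the kernel encodes errors in pointers: IS_ERR() only treats the top MAX_ERRNO (4095) pointer values as errors. A minimal Python model, assuming 64-bit pointers; `err_ptr` and `is_err` mimic ERR_PTR() and IS_ERR(), and `normalize` is a hypothetical fix-up helper, not the hinic code.

```python
import errno

MAX_ERRNO = 4095
PTR_MASK = (1 << 64) - 1  # assumption: 64-bit pointers


def err_ptr(err):
    """Model of ERR_PTR(): reinterpret a signed error code as a pointer value."""
    return err & PTR_MASK


def is_err(ptr):
    """Model of IS_ERR(): true only for the top MAX_ERRNO pointer values."""
    return ptr >= (PTR_MASK + 1 - MAX_ERRNO)


def normalize(ret):
    """Hypothetical fix-up: map a positive hardware status to a negative errno."""
    return -errno.EIO if ret > 0 else ret


print(is_err(err_ptr(-errno.EIO)))    # negative errno
print(is_err(err_ptr(3)))             # positive hardware status
print(is_err(err_ptr(normalize(3))))  # positive status after conversion
```

A caller that checks the result with IS_ERR() would treat the positive case as a valid pointer and dereference it.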
      
      Fixes: 7dd29ee1 ("hinic: add sriov feature support")
      Signed-off-by: Luo bin <luobin9@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d3c54f7f
    • Dan Carpenter's avatar
      net: qrtr: Fix an out of bounds read qrtr_endpoint_post() · 8ff41cc2
      Dan Carpenter authored
      This code assumes that the user passed in enough data for a
      qrtr_hdr_v1 or qrtr_hdr_v2 struct, but that is not necessarily true. If
      the buffer is too small then it will read beyond the end.
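
The missing check amounts to comparing the buffer length with the header size before decoding. A minimal sketch in Python, under stated assumptions: the two layouts (eight 32-bit fields for v1; four bytes, one 32-bit size and four 16-bit ids for v2) and the version numbers 1 and 3 are assumed here for illustration, and `parse_header` is a hypothetical name, not the kernel function.

```python
import struct

# Assumed little-endian header layouts, modelled on qrtr_hdr_v1/qrtr_hdr_v2.
HDR_V1 = struct.Struct("<8I")     # eight 32-bit fields
HDR_V2 = struct.Struct("<4BI4H")  # four bytes, one 32-bit size, four 16-bit ids
LAYOUTS = {1: HDR_V1, 3: HDR_V2}  # assumed protocol version numbers


def parse_header(data):
    """Return (version, fields) or raise ValueError for a malformed buffer."""
    if not data:
        raise ValueError("empty packet")
    version = data[0]  # the first byte selects the header version
    layout = LAYOUTS.get(version)
    if layout is None:
        raise ValueError("unknown version")
    if len(data) < layout.size:
        # Reject short buffers before decoding any header field.
        raise ValueError("short packet")
    return version, layout.unpack_from(data)


print(parse_header(bytes([1, 0, 0, 0]) + bytes(28))[0])
try:
    parse_header(bytes([1, 0, 0, 0]))
except ValueError as exc:
    print(exc)
```
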
      Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Reported-by: syzbot+b8fe393f999a291a9ea6@syzkaller.appspotmail.com
      Fixes: 194ccc88 ("net: qrtr: Support decoding incoming v2 packets")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ff41cc2
    • Eric Dumazet's avatar
      tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key() · 6a2febec
      Eric Dumazet authored
      MD5 keys are read with RCU protection, and tcp_md5_do_add()
      might update in-place a prior key.
      
      Normally, typical RCU updates would allocate a new piece
      of memory. In this case only key->key and key->keylen might
      be updated, and we do not care if an incoming packet could
      see the old key, the new one, or some intermediate value,
      since changing the key on a live flow is known to be problematic
      anyway.
      
      We only want to make sure that in the case key->keylen
      is changed, CPUs in tcp_md5_hash_key() won't try to use
      uninitialized data, or crash because key->keylen was
      read twice to feed sg_init_one() and ahash_request_set_crypt().
      
      Fixes: 9ea88a15 ("tcp: md5: check md5 signature without socket lock")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6a2febec
  6. 30 Jun, 2020 3 commits
    • Carl Huang's avatar
      net: qrtr: free flow in __qrtr_node_release · 28541f3d
      Carl Huang authored
      The flow is allocated in qrtr_tx_wait, but not freed when the qrtr node
      is released. (*slot) becomes NULL after radix_tree_iter_delete is
      called in __qrtr_node_release. The fix is to save (*slot) to a
      variable and then free it.
      
      This memory leak is caught when kmemleak is enabled in the kernel;
      the report looks like this:
      
      unreferenced object 0xffffa0de69e08420 (size 32):
        comm "kworker/u16:3", pid 176, jiffies 4294918275 (age 82858.876s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 28 84 e0 69 de a0 ff ff  ........(..i....
          28 84 e0 69 de a0 ff ff 03 00 00 00 00 00 00 00  (..i............
        backtrace:
          [<00000000e252af0a>] qrtr_node_enqueue+0x38e/0x400 [qrtr]
          [<000000009cea437f>] qrtr_sendmsg+0x1e0/0x2a0 [qrtr]
          [<000000008bddbba4>] sock_sendmsg+0x5b/0x60
          [<0000000003beb43a>] qmi_send_message.isra.3+0xbe/0x110 [qmi_helpers]
          [<000000009c9ae7de>] qmi_send_request+0x1c/0x20 [qmi_helpers]
      Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28541f3d
    • Li Heng's avatar
      net: cxgb4: fix return error value in t4_prep_fw · 8a259e6b
      Li Heng authored
      t4_prep_fw jumps to the bye label with a positive return value when
      something goes wrong, which means resources cannot be freed in
      adap_init0. Fix it to return a negative value.
      
      Fixes: 16e47624 ("cxgb4: Add new scheme to update T4/T5 firmware")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Li Heng <liheng40@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a259e6b
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e708e2bd
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2020-06-30
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 28 non-merge commits during the last 9 day(s) which contain
      a total of 35 files changed, 486 insertions(+), 232 deletions(-).
      
      The main changes are:
      
      1) Fix an incorrect verifier branch elimination for PTR_TO_BTF_ID pointer
         types, from Yonghong Song.
      
      2) Fix UAPI for sockmap and flow_dissector progs that were ignoring various
         arguments passed to BPF_PROG_{ATTACH,DETACH}, from Lorenz Bauer & Jakub Sitnicki.
      
      3) Fix broken AF_XDP DMA hacks that are poking into dma-direct and swiotlb
         internals and integrate it properly into DMA core, from Christoph Hellwig.
      
      4) Fix RCU splat from recent changes to avoid skipping ingress policy when
         kTLS is enabled, from John Fastabend.
      
      5) Fix BPF ringbuf map to enforce size to be a power of 2 in order for its
         position masking to work, from Andrii Nakryiko.
      
      6) Fix regression from CAP_BPF work to re-allow CAP_SYS_ADMIN for loading
         of network programs, from Maciej Żenczykowski.
      
      7) Fix libbpf section name prefix for devmap progs, from Jesper Dangaard Brouer.
      
      8) Fix formatting in UAPI documentation for BPF helpers, from Quentin Monnet.
      ====================
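
Item 5 relies on a general property of ring buffers: masking a position with `size - 1` gives the same result as `position % size` only when the size is a power of two. A short illustration with hypothetical helper names, independent of the BPF implementation:

```python
def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def masked_index(pos, size):
    """Ring-buffer index computed by masking; valid only for power-of-two sizes."""
    return pos & (size - 1)


pos = 5000
for size in (4096, 3000):
    # For a power-of-two size the last two columns agree; otherwise they differ.
    print(size, is_power_of_two(size), masked_index(pos, size), pos % size)
```
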
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e708e2bd