1. 02 Oct, 2018 2 commits
  2. 14 Aug, 2018 2 commits
    • Willem de Bruijn's avatar
      packet: fix reserve calculation · 9f8c53fd
      Willem de Bruijn authored
      BugLink: https://bugs.launchpad.net/bugs/1777063
      
      [ Upstream commit 9aad13b0 ]
      
      Commit b84bbaf7 ("packet: in packet_snd start writing at link
      layer allocation") ensures that packet_snd always starts writing
      the link layer header in reserved headroom allocated for this
      purpose.
      
      This is needed because packets may be shorter than hard_header_len,
      in which case the space up to hard_header_len may be zeroed. But
      that necessary padding is not accounted for in skb->len.
      
      The fix, however, is buggy. It calls skb_push, which grows skb->len
      when moving skb->data back. But in this case packet length should not
      change.
      
      Instead, call skb_reserve, which moves both skb->data and skb->tail
      back, without changing length.
      
      Fixes: b84bbaf7
      
       ("packet: in packet_snd start writing at link layer allocation")
      Reported-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarJuerg Haefliger <juergh@canonical.com>
      Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
      9f8c53fd
    • Eric Dumazet's avatar
      net/packet: refine check for priv area size · 44bb7ef9
      Eric Dumazet authored
      BugLink: https://bugs.launchpad.net/bugs/1777063
      
      [ Upstream commit eb73190f ]
      
      syzbot was able to trick af_packet again [1]
      
      Various commits tried to address the problem in the past,
      but failed to take into account V3 header size.
      
      [1]
      
      tpacket_rcv: packet too big, clamped from 72 to 4294967224. macoff=96
      BUG: KASAN: use-after-free in prb_run_all_ft_ops net/packet/af_packet.c:1016 [inline]
      BUG: KASAN: use-after-free in prb_fill_curr_block.isra.59+0x4e5/0x5c0 net/packet/af_packet.c:1039
      Write of size 2 at addr ffff8801cb62000e by task kworker/1:2/2106
      
      CPU: 1 PID: 2106 Comm: kworker/1:2 Not tainted 4.17.0-rc7+ #77
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: ipv6_addrconf addrconf_dad_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1b9/0x294 lib/dump_stack.c:113
       print_address_description+0x6c/0x20b mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
       __asan_report_store2_noabort+0x17/0x20 mm/kasan/report.c:436
       prb_run_all_ft_ops net/packet/af_packet.c:1016 [inline]
       prb_fill_curr_block.isra.59+0x4e5/0x5c0 net/packet/af_packet.c:1039
       __packet_lookup_frame_in_block net/packet/af_packet.c:1094 [inline]
       packet_current_rx_frame net/packet/af_packet.c:1117 [inline]
       tpacket_rcv+0x1866/0x3340 net/packet/af_packet.c:2282
       dev_queue_xmit_nit+0x891/0xb90 net/core/dev.c:2018
       xmit_one net/core/dev.c:3049 [inline]
       dev_hard_start_xmit+0x16b/0xc10 net/core/dev.c:3069
       __dev_queue_xmit+0x2724/0x34c0 net/core/dev.c:3584
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3617
       neigh_resolve_output+0x679/0xad0 net/core/neighbour.c:1358
       neigh_output include/net/neighbour.h:482 [inline]
       ip6_finish_output2+0xc9c/0x2810 net/ipv6/ip6_output.c:120
       ip6_finish_output+0x5fe/0xbc0 net/ipv6/ip6_output.c:154
       NF_HOOK_COND include/linux/netfilter.h:277 [inline]
       ip6_output+0x227/0x9b0 net/ipv6/ip6_output.c:171
       dst_output include/net/dst.h:444 [inline]
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ndisc_send_skb+0x100d/0x1570 net/ipv6/ndisc.c:491
       ndisc_send_ns+0x3c1/0x8d0 net/ipv6/ndisc.c:633
       addrconf_dad_work+0xbef/0x1340 net/ipv6/addrconf.c:4033
       process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
       worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
       kthread+0x345/0x410 kernel/kthread.c:240
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
      
      The buggy address belongs to the page:
      page:ffffea00072d8800 count:0 mapcount:-127 mapping:0000000000000000 index:0xffff8801cb620e80
      flags: 0x2fffc0000000000()
      raw: 02fffc0000000000 0000000000000000 ffff8801cb620e80 00000000ffffff80
      raw: ffffea00072e3820 ffffea0007132d20 0000000000000002 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8801cb61ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff8801cb61ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      >ffff8801cb620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                            ^
       ffff8801cb620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff8801cb620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      
      Fixes: 2b6867c2 ("net/packet: fix overflow in check for priv area size")
      Fixes: dc808110 ("packet: handle too big packets for PACKET_V3")
      Fixes: f6fb8f10
      
       ("af-packet: TPACKET_V3 flexible buffer implementation.")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarJuerg Haefliger <juergh@canonical.com>
      Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
      44bb7ef9
  3. 07 Jun, 2018 1 commit
  4. 23 May, 2018 2 commits
  5. 13 Mar, 2018 3 commits
    • Eric Dumazet's avatar
      net/packet: fix a race in packet_bind() and packet_notifier() · a7e35257
      Eric Dumazet authored
      BugLink: http://bugs.launchpad.net/bugs/1745047
      
      [ Upstream commit 15fe076e ]
      
      syzbot reported crashes [1] and provided a C repro easing bug hunting.
      
      When/if packet_do_bind() calls __unregister_prot_hook() and releases
      po->bind_lock, another thread can run packet_notifier() and process an
      NETDEV_UP event.
      
      This calls register_prot_hook() and hooks again the socket right before
      first thread is able to grab again po->bind_lock.
      
      Fixes this issue by temporarily setting po->num to 0, as suggested by
      David Miller.
      
      [1]
      dev_remove_pack: ffff8801bf16fa80 not found
      ------------[ cut here ]------------
      kernel BUG at net/core/dev.c:7945!  ( BUG_ON(!list_empty(&dev->ptype_all)); )
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      device syz0 entered promiscuous mode
      CPU: 0 PID: 3161 Comm: syzkaller404108 Not tainted 4.14.0+ #190
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      task: ffff8801cc57a500 task.stack: ffff8801cc588000
      RIP: 0010:netdev_run_todo+0x772/0xae0 net/core/dev.c:7945
      RSP: 0018:ffff8801cc58f598 EFLAGS: 00010293
      RAX: ffff8801cc57a500 RBX: dffffc0000000000 RCX: ffffffff841f75b2
      RDX: 0000000000000000 RSI: 1ffff100398b1ede RDI: ffff8801bf1f8810
      device syz0 entered promiscuous mode
      RBP: ffff8801cc58f898 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801bf1f8cd8
      R13: ffff8801cc58f870 R14: ffff8801bf1f8780 R15: ffff8801cc58f7f0
      FS:  0000000001716880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020b13000 CR3: 0000000005e25000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:106
       tun_detach drivers/net/tun.c:670 [inline]
       tun_chr_close+0x49/0x60 drivers/net/tun.c:2845
       __fput+0x333/0x7f0 fs/file_table.c:210
       ____fput+0x15/0x20 fs/file_table.c:244
       task_work_run+0x199/0x270 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x9bb/0x1ae0 kernel/exit.c:865
       do_group_exit+0x149/0x400 kernel/exit.c:968
       SYSC_exit_group kernel/exit.c:979 [inline]
       SyS_exit_group+0x1d/0x20 kernel/exit.c:977
       entry_SYSCALL_64_fastpath+0x1f/0x96
      RIP: 0033:0x44ad19
      
      Fixes: 30f7ea1c
      
       ("packet: race condition in packet_bind")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Francesco Ruggeri <fruggeri@aristanetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      a7e35257
    • Mike Maloney's avatar
      packet: fix crash in fanout_demux_rollover() · f8906ebe
      Mike Maloney authored
      BugLink: http://bugs.launchpad.net/bugs/1745047
      
      syzkaller found a race condition fanout_demux_rollover() while removing
      a packet socket from a fanout group.
      
      po->rollover is read and operated on during packet_rcv_fanout(), via
      fanout_demux_rollover(), but the pointer is currently cleared before the
      synchronization in packet_release().   It is safer to delay the cleanup
      until after synchronize_net() has been called, ensuring all calls to
      packet_rcv_fanout() for this socket have finished.
      
      To further simplify synchronization around the rollover structure, set
      po->rollover in fanout_add() only if there are no errors.  This removes
      the need for rcu in the struct and in the call to
      packet_getsockopt(..., PACKET_ROLLOVER_STATS, ...).
      
      Crashing stack trace:
       fanout_demux_rollover+0xb6/0x4d0 net/packet/af_packet.c:1392
       packet_rcv_fanout+0x649/0x7c8 net/packet/af_packet.c:1487
       dev_queue_xmit_nit+0x835/0xc10 net/core/dev.c:1953
       xmit_one net/core/dev.c:2975 [inline]
       dev_hard_start_xmit+0x16b/0xac0 net/core/dev.c:2995
       __dev_queue_xmit+0x17a4/0x2050 net/core/dev.c:3476
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3509
       neigh_connected_output+0x489/0x720 net/core/neighbour.c:1379
       neigh_output include/net/neighbour.h:482 [inline]
       ip6_finish_output2+0xad1/0x22a0 net/ipv6/ip6_output.c:120
       ip6_finish_output+0x2f9/0x920 net/ipv6/ip6_output.c:146
       NF_HOOK_COND include/linux/netfilter.h:239 [inline]
       ip6_output+0x1f4/0x850 net/ipv6/ip6_output.c:163
       dst_output include/net/dst.h:459 [inline]
       NF_HOOK.constprop.35+0xff/0x630 include/linux/netfilter.h:250
       mld_sendpack+0x6a8/0xcc0 net/ipv6/mcast.c:1660
       mld_send_initial_cr.part.24+0x103/0x150 net/ipv6/mcast.c:2072
       mld_send_initial_cr net/ipv6/mcast.c:2056 [inline]
       ipv6_mc_dad_complete+0x99/0x130 net/ipv6/mcast.c:2079
       addrconf_dad_completed+0x595/0x970 net/ipv6/addrconf.c:4039
       addrconf_dad_work+0xac9/0x1160 net/ipv6/addrconf.c:3971
       process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2113
       worker_thread+0x223/0x1990 kernel/workqueue.c:2247
       kthread+0x35e/0x430 kernel/kthread.c:231
       ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:432
      
      Fixes: 0648ab70 ("packet: rollover prepare: per-socket state")
      Fixes: 509c7a1e
      
       ("packet: avoid panic in packet_getsockopt()")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarMike Maloney <maloney@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      f8906ebe
    • Eric Dumazet's avatar
      packet: avoid panic in packet_getsockopt() · 35e4e240
      Eric Dumazet authored
      BugLink: http://bugs.launchpad.net/bugs/1744636
      
      [ Upstream commit 509c7a1e ]
      
      syzkaller got crashes in packet_getsockopt() processing
      PACKET_ROLLOVER_STATS command while another thread was managing
      to change po->rollover
      
      Using RCU will fix this bug. We might later add proper RCU annotations
      for sparse sake.
      
      In v2: I replaced kfree(rollover) in fanout_add() to kfree_rcu()
      variant, as spotted by John.
      
      Fixes: a9b63918
      
       ("packet: rollover statistics")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: John Sperbeck <jsperbeck@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      35e4e240
  6. 01 Nov, 2017 3 commits
  7. 19 Oct, 2017 1 commit
  8. 22 Aug, 2017 1 commit
  9. 11 Aug, 2017 2 commits
  10. 07 Aug, 2017 1 commit
    • Willem de Bruijn's avatar
      net-packet: fix race in packet_set_ring on PACKET_RESERVE · ccf7bb73
      Willem de Bruijn authored
      PACKET_RESERVE reserves headroom in memory mapped packet ring frames.
      The value po->tp_reserve must is verified to be safe in packet_set_ring
      
        if (unlikely(req->tp_frame_size < po->tp_hdrlen + po->tp_reserve))
      
      and the setsockopt fails once a ring is set.
      
        if (po->rx_ring.pg_vec || po->tx_ring.pg_vec)
                return -EBUSY;
      
      This operation does not take the socket lock. This leads to a race
      similar to the one with PACKET_VERSION fixed in commit 84ac7260
      
      
      ("packet: fix race condition in packet_set_ring").
      
      Fix this issue in the same manner: take the socket lock, which as of
      that patch is held for the duration of packet_set_ring.
      
      This bug was discovered with syzkaller.
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      
      CVE-2017-1000111
      
      (backported from email submission)
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      ccf7bb73
  11. 06 Apr, 2017 4 commits
    • Alexander Potapenko's avatar
      net: don't call strlen() on the user buffer in packet_bind_spkt() · 7edb8ba0
      Alexander Potapenko authored
      BugLink: http://bugs.launchpad.net/bugs/1675789
      
      [ Upstream commit 540e2894
      
       ]
      
      KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
      uninitialized memory in packet_bind_spkt():
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      
      ==================================================================
      BUG: KMSAN: use of unitialized memory
      CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
       0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
       ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
       0000000000000000 0000000000000092 00000000ec400911 0000000000000002
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff82559ae8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
       [<ffffffff818a6626>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
       [<ffffffff818a783b>] __msan_warning+0x5b/0xb0
      mm/kmsan/kmsan_instr.c:424
       [<     inline     >] strlen lib/string.c:484
       [<ffffffff8259b58d>] strlcpy+0x9d/0x200 lib/string.c:144
       [<ffffffff84b2eca4>] packet_bind_spkt+0x144/0x230
      net/packet/af_packet.c:3132
       [<ffffffff84242e4d>] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
       [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
      arch/x86/entry/entry_64.o:?
      chained origin: 00000000eba00911
       [<ffffffff810bb787>] save_stack_trace+0x27/0x50
      arch/x86/kernel/stacktrace.c:67
       [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
       [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:334
       [<ffffffff818a59f8>] kmsan_internal_chain_origin+0x118/0x1e0
      mm/kmsan/kmsan.c:527
       [<ffffffff818a7773>] __msan_set_alloca_origin4+0xc3/0x130
      mm/kmsan/kmsan_instr.c:380
       [<ffffffff84242b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
       [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
      arch/x86/entry/entry_64.o:?
      origin description: ----address@SYSC_bind (origin=00000000eb400911)
      ==================================================================
      (the line numbers are relative to 4.8-rc6, but the bug persists
      upstream)
      
      , when I run the following program as root:
      
      =====================================
       #include <string.h>
       #include <sys/socket.h>
       #include <netpacket/packet.h>
       #include <net/ethernet.h>
      
       int main() {
         struct sockaddr addr;
         memset(&addr, 0xff, sizeof(addr));
         addr.sa_family = AF_PACKET;
         int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
         bind(fd, &addr, sizeof(addr));
         return 0;
       }
      =====================================
      
      This happens because addr.sa_data copied from the userspace is not
      zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
      results in calling strlen() on the kernel copy of that non-terminated
      buffer.
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      7edb8ba0
    • Anoob Soman's avatar
      packet: Do not call fanout_release from atomic contexts · adce2e62
      Anoob Soman authored
      BugLink: http://bugs.launchpad.net/bugs/1669016
      
      [ Upstream commit 2bd624b4 ]
      
      Commit 66644982 ("packet: call fanout_release, while UNREGISTERING a
      netdev"), unfortunately, introduced the following issues.
      
      1. calling mutex_lock(&fanout_mutex) (fanout_release()) from inside
      rcu_read-side critical section. rcu_read_lock disables preemption, most often,
      which prohibits calling sleeping functions.
      
      [  ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section!
      [  ]
      [  ] rcu_scheduler_active = 1, debug_locks = 0
      [  ] 4 locks held by ovs-vswitchd/1969:
      [  ]  #0:  (cb_lock){++++++}, at: [<ffffffff8158a6c9>] genl_rcv+0x19/0x40
      [  ]  #1:  (ovs_mutex){+.+.+.}, at: [<ffffffffa04878ca>] ovs_vport_cmd_del+0x4a/0x100 [openvswitch]
      [  ]  #2:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81564157>] rtnl_lock+0x17/0x20
      [  ]  #3:  (rcu_read_lock){......}, at: [<ffffffff81614165>] packet_notifier+0x5/0x3f0
      [  ]
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff810c9077>] lockdep_rcu_suspicious+0x107/0x110
      [  ]  [<ffffffff810a2da7>] ___might_sleep+0x57/0x210
      [  ]  [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
      [  ]  [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
      [  ]  [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30
      [  ]  [<ffffffff81186e88>] ? printk+0x4d/0x4f
      [  ]  [<ffffffff816106dd>] fanout_release+0x1d/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      2. calling mutex_lock(&fanout_mutex) inside spin_lock(&po->bind_lock).
      "sleeping function called from invalid context"
      
      [  ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
      [  ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd
      [  ] INFO: lockdep is turned off.
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff810a2f52>] ___might_sleep+0x202/0x210
      [  ]  [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
      [  ]  [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
      [  ]  [<ffffffff816106dd>] fanout_release+0x1d/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      3. calling dev_remove_pack(&fanout->prot_hook), from inside
      spin_lock(&po->bind_lock) or rcu_read-side critical-section. dev_remove_pack()
      -> synchronize_net(), which might sleep.
      
      [  ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002
      [  ] INFO: lockdep is turned off.
      [  ] Call Trace:
      [  ]  [<ffffffff813770c1>] dump_stack+0x85/0xc4
      [  ]  [<ffffffff81186274>] __schedule_bug+0x64/0x73
      [  ]  [<ffffffff8162b8cb>] __schedule+0x6b/0xd10
      [  ]  [<ffffffff8162c5db>] schedule+0x6b/0x80
      [  ]  [<ffffffff81630b1d>] schedule_timeout+0x38d/0x410
      [  ]  [<ffffffff810ea3fd>] synchronize_sched_expedited+0x53d/0x810
      [  ]  [<ffffffff810ea6de>] synchronize_rcu_expedited+0xe/0x10
      [  ]  [<ffffffff8154eab5>] synchronize_net+0x35/0x50
      [  ]  [<ffffffff8154eae3>] dev_remove_pack+0x13/0x20
      [  ]  [<ffffffff8161077e>] fanout_release+0xbe/0xe0
      [  ]  [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
      
      4. fanout_release() races with calls from different CPU.
      
      To fix the above problems, remove the call to fanout_release() under
      rcu_read_lock(). Instead, call __dev_remove_pack(&fanout->prot_hook) and
      netdev_run_todo will be happy that &dev->ptype_specific list is empty. In order
      to achieve this, I moved dev_{add,remove}_pack() out of fanout_{add,release} to
      __fanout_{link,unlink}. So, call to {,__}unregister_prot_hook() will make sure
      fanout->prot_hook is removed as well.
      
      Fixes: 66644982
      
       ("packet: call fanout_release, while UNREGISTERING a netdev")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarAnoob Soman <anoob.soman@citrix.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      adce2e62
    • Eric Dumazet's avatar
      packet: fix races in fanout_add() · cb2344c6
      Eric Dumazet authored
      BugLink: http://bugs.launchpad.net/bugs/1669016
      
      [ Upstream commit d199fab6 ]
      
      Multiple threads can call fanout_add() at the same time.
      
      We need to grab fanout_mutex earlier to avoid races that could
      lead to one thread freeing po->rollover that was set by another thread.
      
      Do the same in fanout_release(), for peace of mind, and to help us
      finding lockdep issues earlier.
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Fixes: 0648ab70
      
       ("packet: rollover prepare: per-socket state")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      cb2344c6
    • Willem de Bruijn's avatar
      packet: round up linear to header len · a0ca27c8
      Willem de Bruijn authored
      BugLink: http://bugs.launchpad.net/bugs/1666324
      
      [ Upstream commit 57031eb7
      
       ]
      
      Link layer protocols may unconditionally pull headers, as Ethernet
      does in eth_type_trans. Ensure that the entire link layer header
      always lies in the skb linear segment. tpacket_snd has such a check.
      Extend this to packet_snd.
      
      Variable length link layer headers complicate the computation
      somewhat. Here skb->len may be smaller than dev->hard_header_len.
      
      Round up the linear length to be at least as long as the smallest of
      the two.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      a0ca27c8
  12. 31 Mar, 2017 3 commits
  13. 06 Dec, 2016 2 commits
  14. 01 Dec, 2016 1 commit
  15. 09 Aug, 2016 1 commit
  16. 10 Jun, 2016 1 commit
  17. 21 Apr, 2016 1 commit
  18. 30 Nov, 2015 1 commit
  19. 17 Nov, 2015 2 commits
  20. 15 Nov, 2015 5 commits
  21. 05 Nov, 2015 1 commit
    • Francesco Ruggeri's avatar
      packet: race condition in packet_bind · 30f7ea1c
      Francesco Ruggeri authored
      
      There is a race conditions between packet_notifier and packet_bind{_spkt}.
      
      It happens if packet_notifier(NETDEV_UNREGISTER) executes between the
      time packet_bind{_spkt} takes a reference on the new netdevice and the
      time packet_do_bind sets po->ifindex.
      In this case the notification can be missed.
      If this happens during a dev_change_net_namespace this can result in the
      netdevice to be moved to the new namespace while the packet_sock in the
      old namespace still holds a reference on it. When the netdevice is later
      deleted in the new namespace the deletion hangs since the packet_sock
      is not found in the new namespace' &net->packet.sklist.
      It can be reproduced with the script below.
      
      This patch makes packet_do_bind check again for the presence of the
      netdevice in the packet_sock's namespace after the synchronize_net
      in unregister_prot_hook.
      More in general it also uses the rcu lock for the duration of the bind
      to stop dev_change_net_namespace/rollback_registered_many from
      going past the synchronize_net following unlist_netdevice, so that
      no NETDEV_UNREGISTER notifications can happen on the new netdevice
      while the bind is executing. In order to do this some code from
      packet_bind{_spkt} is consolidated into packet_do_dev.
      
      import socket, os, time, sys
      proto=7
      realDev='em1'
      vlanId=400
      if len(sys.argv) > 1:
         vlanId=int(sys.argv[1])
      dev='vlan%d' % vlanId
      
      os.system('taskset -p 0x10 %d' % os.getpid())
      
      s = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, proto)
      os.system('ip link add link %s name %s type vlan id %d' %
                (realDev, dev, vlanId))
      os.system('ip netns add dummy')
      
      pid=os.fork()
      
      if pid == 0:
         # dev should be moved while packet_do_bind is in synchronize net
         os.system('taskset -p 0x20000 %d' % os.getpid())
         os.system('ip link set %s netns dummy' % dev)
         os.system('ip netns exec dummy ip link del %s' % dev)
         s.close()
         sys.exit(0)
      
      time.sleep(.004)
      try:
         s.bind(('%s' % dev, proto+1))
      except:
         print 'Could not bind socket'
         s.close()
         os.system('ip netns del dummy')
         sys.exit(0)
      
      os.waitpid(pid, 0)
      s.close()
      os.system('ip netns del dummy')
      sys.exit(0)
      Signed-off-by: default avatarFrancesco Ruggeri <fruggeri@arista.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      30f7ea1c