1. 17 May, 2018 15 commits
    • Merge branch 'ip6_gre-Fixes-in-headroom-handling' · 374edea4
      David S. Miller authored
      Petr Machata says:
      
      ====================
      net: ip6_gre: Fixes in headroom handling
      
      This series mends some problems in headroom management in the ip6_gre
      module. The current code base has the following three closely-related
      problems:
      
      - ip6gretap tunnels neglect to ensure there's enough writable headroom
        before pushing GRE headers.
      
      - ip6erspan does this, but assumes that dev->needed_headroom is primed.
        But that doesn't happen until ip6_tnl_xmit() is called later. Thus for
        the first packet, ip6erspan actually behaves like ip6gretap above.
      
      - ip6erspan shares some of the code with ip6gretap, including
        calculations of needed header length. While there is custom
        ERSPAN-specific code for calculating the headroom, the computed
        values are overwritten by the ip6gretap code.
      
      The first two issues lead to a kernel panic in situations where a packet
      is mirrored from a veth device to the device in question. They are
      fixed, respectively, in patches #1 and #2, which include the full panic
      trace and a reproducer.
      
      The rest of the patchset deals with the last issue. In patches #3 to #6,
      several functions are split up into reusable parts. Finally in patch #7
      these blocks are used to compose ERSPAN-specific callbacks where
      necessary to fix the hlen calculation.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: Fix ip6erspan hlen calculation · 2d665034
      Petr Machata authored
      Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
      what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
      overwrites these settings with GRE-specific ones.
      
      The same happens for the changelink callbacks: they are handled by
      ip6gre_changelink(), which calls ip6gre_tnl_change(), which in turn
      calls ip6gre_tnl_link_config() as well.
      
      The difference ends up being 12 vs. 20 bytes, and this is generally not
      a problem, because a 12-byte request likely ends up allocating more and
      the extra 8 bytes are thus available. It is nevertheless incorrect.
      
      So replace the newlink and changelink callbacks with ERSPAN-specific
      ones, reusing the newly-introduced _common() functions.
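
The overwrite described above can be modeled in user space. This is only a sketch, not the kernel code: `struct tnl` and every function name below are hypothetical, the MTU derivation is a simplification, and the constants 12 and 20 are the two header lengths quoted in the commit message.

```c
/* Simplified user-space model; all names here are hypothetical. */
#define GRE_HLEN    12  /* generic GRE header length quoted above */
#define ERSPAN_HLEN 20  /* ERSPAN header length quoted above */

struct tnl {
	int hlen;  /* header bytes the tunnel will push */
	int mtu;   /* in this model: link MTU minus hlen */
};

/* Shared part: takes the already-calculated header length as an argument. */
static void link_config_route(struct tnl *t, int link_mtu, int hlen)
{
	t->hlen = hlen;
	t->mtu = link_mtu - hlen;
}

/* Generic GRE configuration: always uses the GRE header length. */
static void link_config_generic(struct tnl *t, int link_mtu)
{
	link_config_route(t, link_mtu, GRE_HLEN);
}

/* Before the fix: ERSPAN sets hlen, then the generic helper overwrites it. */
static void erspan_init_buggy(struct tnl *t, int link_mtu)
{
	t->hlen = ERSPAN_HLEN;
	link_config_generic(t, link_mtu);  /* hlen silently becomes 12 */
}

/* After the fix: ERSPAN reuses the shared part with its own header length. */
static void erspan_init_fixed(struct tnl *t, int link_mtu)
{
	link_config_route(t, link_mtu, ERSPAN_HLEN);
}
```

With a link MTU of 1500, the buggy variant leaves `hlen` at 12 while the fixed variant keeps 20 and derives the MTU from it.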
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: Split up ip6gre_changelink() · c8632fc3
      Petr Machata authored
      Extract from ip6gre_changelink() a reusable function
      ip6gre_changelink_common(). This will allow the introduction of an
      ERSPAN-specific _changelink() function without much code
      duplication.
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: Split up ip6gre_newlink() · 7fa38a7c
      Petr Machata authored
      Extract from ip6gre_newlink() a reusable function
      ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
      made customizable for ERSPAN, thus reorder it with calls to
      ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
      caller, ip6gre_newlink(). This enables an ERSPAN-specific _newlink()
      function without a lot of duplication.
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: Split up ip6gre_tnl_change() · a6465350
      Petr Machata authored
      Split a reusable function ip6gre_tnl_copy_tnl_parm() from
      ip6gre_tnl_change(). This will allow ERSPAN-specific code to
      reuse the common parts while customizing the behavior for ERSPAN.
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: Split up ip6gre_tnl_link_config() · a483373e
      Petr Machata authored
      The function ip6gre_tnl_link_config() is used for setting up
      configuration of both ip6gretap and ip6erspan tunnels. Split the
      function into the common part and the route-lookup part. The latter then
      takes the calculated header length as an argument. This split will allow
      the patches down the line to sneak in a custom header length computation
      for the ERSPAN tunnel.
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit() · 5691484d
      Petr Machata authored
      dev->needed_headroom is not primed until ip6_tnl_xmit(), so it starts
      out zero. Thus the call to skb_cow_head() fails to actually make sure
      there's enough headroom to push the ERSPAN headers to. That can lead to
      the panic cited below. (Reproducer below that).
      
      Fix by requesting either needed_headroom if already primed, or just the
      bare minimum needed for the header otherwise.
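
The rule stated above can be written down as a small helper. This is a user-space sketch with hypothetical function names, not the patched kernel code; it only models which headroom value is requested before and after the fix.

```c
/* Hypothetical helpers modeling the headroom request described above. */

/* Before the fix: request whatever needed_headroom currently holds. */
static unsigned int headroom_request_old(unsigned int needed_headroom)
{
	/* This is zero until the generic tunnel transmit path has run once. */
	return needed_headroom;
}

/* After the fix: use needed_headroom if primed, else the header length. */
static unsigned int headroom_request_new(unsigned int needed_headroom,
					 unsigned int hlen)
{
	return needed_headroom ? needed_headroom : hlen;
}
```

For the first packet `needed_headroom` is still zero, so the old request asks for nothing, while the new one asks for at least the header that is about to be pushed.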
      
      [  190.703567] kernel BUG at net/core/skbuff.c:104!
      [  190.708384] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      [  190.714007] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
      [  190.728975] CPU: 1 PID: 959 Comm: kworker/1:2 Not tainted 4.17.0-rc4-net_master-custom-139 #10
      [  190.737647] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
      [  190.747006] Workqueue: ipv6_addrconf addrconf_dad_work
      [  190.752222] RIP: 0010:skb_panic+0xc3/0x100
      [  190.756358] RSP: 0018:ffff8801d54072f0 EFLAGS: 00010282
      [  190.761629] RAX: 0000000000000085 RBX: ffff8801c1a8ecc0 RCX: 0000000000000000
      [  190.768830] RDX: 0000000000000085 RSI: dffffc0000000000 RDI: ffffed003aa80e54
      [  190.776025] RBP: ffff8801bd1ec5a0 R08: ffffed003aabce19 R09: ffffed003aabce19
      [  190.783226] R10: 0000000000000001 R11: ffffed003aabce18 R12: ffff8801bf695dbe
      [  190.790418] R13: 0000000000000084 R14: 00000000000006c0 R15: ffff8801bf695dc8
      [  190.797621] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
      [  190.805786] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  190.811582] CR2: 000055fa929aced0 CR3: 0000000003228004 CR4: 00000000001606e0
      [  190.818790] Call Trace:
      [  190.821264]  <IRQ>
      [  190.823314]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
      [  190.828940]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
      [  190.834562]  skb_push+0x78/0x90
      [  190.837749]  ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
      [  190.843219]  ? ip6gre_tunnel_ioctl+0xd90/0xd90 [ip6_gre]
      [  190.848577]  ? debug_check_no_locks_freed+0x210/0x210
      [  190.853679]  ? debug_check_no_locks_freed+0x210/0x210
      [  190.858783]  ? print_irqtrace_events+0x120/0x120
      [  190.863451]  ? sched_clock_cpu+0x18/0x210
      [  190.867496]  ? cyc2ns_read_end+0x10/0x10
      [  190.871474]  ? skb_network_protocol+0x76/0x200
      [  190.875977]  dev_hard_start_xmit+0x137/0x770
      [  190.880317]  ? do_raw_spin_trylock+0x6d/0xa0
      [  190.884624]  sch_direct_xmit+0x2ef/0x5d0
      [  190.888589]  ? pfifo_fast_dequeue+0x3fa/0x670
      [  190.892994]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
      [  190.898455]  ? __lock_is_held+0xa0/0x160
      [  190.902422]  __qdisc_run+0x39e/0xfc0
      [  190.906041]  ? _raw_spin_unlock+0x29/0x40
      [  190.910090]  ? pfifo_fast_enqueue+0x24b/0x3e0
      [  190.914501]  ? sch_direct_xmit+0x5d0/0x5d0
      [  190.918658]  ? pfifo_fast_dequeue+0x670/0x670
      [  190.923047]  ? __dev_queue_xmit+0x172/0x1770
      [  190.927365]  ? preempt_count_sub+0xf/0xd0
      [  190.931421]  __dev_queue_xmit+0x410/0x1770
      [  190.935553]  ? ___slab_alloc+0x605/0x930
      [  190.939524]  ? print_irqtrace_events+0x120/0x120
      [  190.944186]  ? memcpy+0x34/0x50
      [  190.947364]  ? netdev_pick_tx+0x1c0/0x1c0
      [  190.951428]  ? __skb_clone+0x2fd/0x3d0
      [  190.955218]  ? __copy_skb_header+0x270/0x270
      [  190.959537]  ? rcu_read_lock_sched_held+0x93/0xa0
      [  190.964282]  ? kmem_cache_alloc+0x344/0x4d0
      [  190.968520]  ? cyc2ns_read_end+0x10/0x10
      [  190.972495]  ? skb_clone+0x123/0x230
      [  190.976112]  ? skb_split+0x820/0x820
      [  190.979747]  ? tcf_mirred+0x554/0x930 [act_mirred]
      [  190.984582]  tcf_mirred+0x554/0x930 [act_mirred]
      [  190.989252]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
      [  190.996109]  ? __lock_acquire+0x706/0x26e0
      [  191.000239]  ? sched_clock_cpu+0x18/0x210
      [  191.004294]  tcf_action_exec+0xcf/0x2a0
      [  191.008179]  tcf_classify+0xfa/0x340
      [  191.011794]  __netif_receive_skb_core+0x8e1/0x1c60
      [  191.016630]  ? debug_check_no_locks_freed+0x210/0x210
      [  191.021732]  ? nf_ingress+0x500/0x500
      [  191.025458]  ? process_backlog+0x347/0x4b0
      [  191.029619]  ? print_irqtrace_events+0x120/0x120
      [  191.034302]  ? lock_acquire+0xd8/0x320
      [  191.038089]  ? process_backlog+0x1b6/0x4b0
      [  191.042246]  ? process_backlog+0xc2/0x4b0
      [  191.046303]  process_backlog+0xc2/0x4b0
      [  191.050189]  net_rx_action+0x5cc/0x980
      [  191.053991]  ? napi_complete_done+0x2c0/0x2c0
      [  191.058386]  ? mark_lock+0x13d/0xb40
      [  191.062001]  ? clockevents_program_event+0x6b/0x1d0
      [  191.066922]  ? print_irqtrace_events+0x120/0x120
      [  191.071593]  ? __lock_is_held+0xa0/0x160
      [  191.075566]  __do_softirq+0x1d4/0x9d2
      [  191.079282]  ? ip6_finish_output2+0x524/0x1460
      [  191.083771]  do_softirq_own_stack+0x2a/0x40
      [  191.087994]  </IRQ>
      [  191.090130]  do_softirq.part.13+0x38/0x40
      [  191.094178]  __local_bh_enable_ip+0x135/0x190
      [  191.098591]  ip6_finish_output2+0x54d/0x1460
      [  191.102916]  ? ip6_forward_finish+0x2f0/0x2f0
      [  191.107314]  ? ip6_mtu+0x3c/0x2c0
      [  191.110674]  ? ip6_finish_output+0x2f8/0x650
      [  191.114992]  ? ip6_output+0x12a/0x500
      [  191.118696]  ip6_output+0x12a/0x500
      [  191.122223]  ? ip6_route_dev_notify+0x5b0/0x5b0
      [  191.126807]  ? ip6_finish_output+0x650/0x650
      [  191.131120]  ? ip6_fragment+0x1a60/0x1a60
      [  191.135182]  ? icmp6_dst_alloc+0x26e/0x470
      [  191.139317]  mld_sendpack+0x672/0x830
      [  191.143021]  ? igmp6_mcf_seq_next+0x2f0/0x2f0
      [  191.147429]  ? __local_bh_enable_ip+0x77/0x190
      [  191.151913]  ipv6_mc_dad_complete+0x47/0x90
      [  191.156144]  addrconf_dad_completed+0x561/0x720
      [  191.160731]  ? addrconf_rs_timer+0x3a0/0x3a0
      [  191.165036]  ? mark_held_locks+0xc9/0x140
      [  191.169095]  ? __local_bh_enable_ip+0x77/0x190
      [  191.173570]  ? addrconf_dad_work+0x50d/0xa20
      [  191.177886]  ? addrconf_dad_work+0x529/0xa20
      [  191.182194]  addrconf_dad_work+0x529/0xa20
      [  191.186342]  ? addrconf_dad_completed+0x720/0x720
      [  191.191088]  ? __lock_is_held+0xa0/0x160
      [  191.195059]  ? process_one_work+0x45d/0xe20
      [  191.199302]  ? process_one_work+0x51e/0xe20
      [  191.203531]  ? rcu_read_lock_sched_held+0x93/0xa0
      [  191.208279]  process_one_work+0x51e/0xe20
      [  191.212340]  ? pwq_dec_nr_in_flight+0x200/0x200
      [  191.216912]  ? get_lock_stats+0x4b/0xf0
      [  191.220788]  ? preempt_count_sub+0xf/0xd0
      [  191.224844]  ? worker_thread+0x219/0x860
      [  191.228823]  ? do_raw_spin_trylock+0x6d/0xa0
      [  191.233142]  worker_thread+0xeb/0x860
      [  191.236848]  ? process_one_work+0xe20/0xe20
      [  191.241095]  kthread+0x206/0x300
      [  191.244352]  ? process_one_work+0xe20/0xe20
      [  191.248587]  ? kthread_stop+0x570/0x570
      [  191.252459]  ret_from_fork+0x3a/0x50
      [  191.256082] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
      [  191.275327] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d54072f0
      [  191.281024] ---[ end trace 7ea51094e099e006 ]---
      [  191.285724] Kernel panic - not syncing: Fatal exception in interrupt
      [  191.292168] Kernel Offset: disabled
      [  191.295697] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      Reproducer:
      
      	ip link add h1 type veth peer name swp1
      	ip link add h3 type veth peer name swp3
      
      	ip link set dev h1 up
      	ip address add 192.0.2.1/28 dev h1
      
      	ip link add dev vh3 type vrf table 20
      	ip link set dev h3 master vh3
      	ip link set dev vh3 up
      	ip link set dev h3 up
      
      	ip link set dev swp3 up
      	ip address add dev swp3 2001:db8:2::1/64
      
      	ip link set dev swp1 up
      	tc qdisc add dev swp1 clsact
      
      	ip link add name gt6 type ip6erspan \
      		local 2001:db8:2::1 remote 2001:db8:2::2 oseq okey 123
      	ip link set dev gt6 up
      
      	sleep 1
      
      	tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
      		action mirred egress mirror dev gt6
      	ping -I h1 192.0.2.2
      
      Fixes: e41c7c68 ("ip6erspan: make sure enough headroom at xmit.")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip6_gre: Request headroom in __gre6_xmit() · 01b8d064
      Petr Machata authored
      __gre6_xmit() pushes GRE headers before handing over to ip6_tnl_xmit()
      for generic IP-in-IP processing. However it doesn't make sure that there
      is enough headroom to push the header to. That can lead to the panic
      cited below. (Reproducer below that).
      
      Fix by requesting either needed_headroom if already primed, or just the
      bare minimum needed for the header otherwise.
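
Why pushing a header without first requesting headroom is fatal can be seen in a toy packet buffer. The sketch below is a user-space model under stated assumptions: `struct pkt` and the `pkt_*()` functions are hypothetical stand-ins for the skb helpers named in the trace, and the model returns NULL where the kernel panics.

```c
#include <stdlib.h>
#include <string.h>

/* Toy buffer: [head, data) is headroom, [data, data + len) is payload. */
struct pkt {
	unsigned char *head;
	unsigned char *data;
	size_t len;
};

static size_t pkt_headroom(const struct pkt *p)
{
	return (size_t)(p->data - p->head);
}

/* Allocate a zeroed packet with the given headroom and payload length. */
static int pkt_alloc(struct pkt *p, size_t headroom, size_t len)
{
	size_t size = headroom + len;

	p->head = calloc(size ? size : 1, 1);
	if (!p->head)
		return -1;
	p->data = p->head + headroom;
	p->len = len;
	return 0;
}

/* Grow the headroom to at least `want` bytes, copying the payload over. */
static int pkt_ensure_headroom(struct pkt *p, size_t want)
{
	unsigned char *nhead;

	if (pkt_headroom(p) >= want)
		return 0;
	nhead = malloc(want + p->len);  /* want >= 1 here, so size > 0 */
	if (!nhead)
		return -1;
	memcpy(nhead + want, p->data, p->len);
	free(p->head);
	p->head = nhead;
	p->data = nhead + want;
	return 0;
}

/* Prepend hlen header bytes; refuses rather than underrun the buffer. */
static unsigned char *pkt_push(struct pkt *p, size_t hlen)
{
	if (pkt_headroom(p) < hlen)
		return NULL;  /* skb_push() ends in skb_panic() at this point */
	p->data -= hlen;
	p->len += hlen;
	return p->data;
}
```

A packet allocated with 2 bytes of headroom cannot take a 12-byte header until `pkt_ensure_headroom(&p, 12)` has been called, which is the step the fix adds before the push.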
      
      [  158.576725] kernel BUG at net/core/skbuff.c:104!
      [  158.581510] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      [  158.587174] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
      [  158.602268] CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 4.17.0-rc4-net_master-custom-139 #10
      [  158.610938] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
      [  158.620426] RIP: 0010:skb_panic+0xc3/0x100
      [  158.624586] RSP: 0018:ffff8801d3f27110 EFLAGS: 00010286
      [  158.629882] RAX: 0000000000000082 RBX: ffff8801c02cc040 RCX: 0000000000000000
      [  158.637127] RDX: 0000000000000082 RSI: dffffc0000000000 RDI: ffffed003a7e4e18
      [  158.644366] RBP: ffff8801bfec8020 R08: ffffed003aabce19 R09: ffffed003aabce19
      [  158.651574] R10: 000000000000000b R11: ffffed003aabce18 R12: ffff8801c364de66
      [  158.658786] R13: 000000000000002c R14: 00000000000000c0 R15: ffff8801c364de68
      [  158.666007] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
      [  158.674212] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  158.680036] CR2: 00007f4b3702dcd0 CR3: 0000000003228002 CR4: 00000000001606e0
      [  158.687228] Call Trace:
      [  158.689752]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
      [  158.694475]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
      [  158.699141]  skb_push+0x78/0x90
      [  158.702344]  __gre6_xmit+0x246/0xd80 [ip6_gre]
      [  158.706872]  ip6gre_tunnel_xmit+0x3bc/0x610 [ip6_gre]
      [  158.711992]  ? __gre6_xmit+0xd80/0xd80 [ip6_gre]
      [  158.716668]  ? debug_check_no_locks_freed+0x210/0x210
      [  158.721761]  ? print_irqtrace_events+0x120/0x120
      [  158.726461]  ? sched_clock_cpu+0x18/0x210
      [  158.730572]  ? sched_clock_cpu+0x18/0x210
      [  158.734692]  ? cyc2ns_read_end+0x10/0x10
      [  158.738705]  ? skb_network_protocol+0x76/0x200
      [  158.743216]  ? netif_skb_features+0x1b2/0x550
      [  158.747648]  dev_hard_start_xmit+0x137/0x770
      [  158.752010]  sch_direct_xmit+0x2ef/0x5d0
      [  158.755992]  ? pfifo_fast_dequeue+0x3fa/0x670
      [  158.760460]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
      [  158.765975]  ? __lock_is_held+0xa0/0x160
      [  158.770002]  __qdisc_run+0x39e/0xfc0
      [  158.773673]  ? _raw_spin_unlock+0x29/0x40
      [  158.777781]  ? pfifo_fast_enqueue+0x24b/0x3e0
      [  158.782191]  ? sch_direct_xmit+0x5d0/0x5d0
      [  158.786372]  ? pfifo_fast_dequeue+0x670/0x670
      [  158.790818]  ? __dev_queue_xmit+0x172/0x1770
      [  158.795195]  ? preempt_count_sub+0xf/0xd0
      [  158.799313]  __dev_queue_xmit+0x410/0x1770
      [  158.803512]  ? ___slab_alloc+0x605/0x930
      [  158.807525]  ? ___slab_alloc+0x605/0x930
      [  158.811540]  ? memcpy+0x34/0x50
      [  158.814768]  ? netdev_pick_tx+0x1c0/0x1c0
      [  158.818895]  ? __skb_clone+0x2fd/0x3d0
      [  158.822712]  ? __copy_skb_header+0x270/0x270
      [  158.827079]  ? rcu_read_lock_sched_held+0x93/0xa0
      [  158.831903]  ? kmem_cache_alloc+0x344/0x4d0
      [  158.836199]  ? skb_clone+0x123/0x230
      [  158.839869]  ? skb_split+0x820/0x820
      [  158.843521]  ? tcf_mirred+0x554/0x930 [act_mirred]
      [  158.848407]  tcf_mirred+0x554/0x930 [act_mirred]
      [  158.853104]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
      [  158.860005]  ? __lock_acquire+0x706/0x26e0
      [  158.864162]  ? mark_lock+0x13d/0xb40
      [  158.867832]  tcf_action_exec+0xcf/0x2a0
      [  158.871736]  tcf_classify+0xfa/0x340
      [  158.875402]  __netif_receive_skb_core+0x8e1/0x1c60
      [  158.880334]  ? nf_ingress+0x500/0x500
      [  158.884059]  ? process_backlog+0x347/0x4b0
      [  158.888241]  ? lock_acquire+0xd8/0x320
      [  158.892050]  ? process_backlog+0x1b6/0x4b0
      [  158.896228]  ? process_backlog+0xc2/0x4b0
      [  158.900291]  process_backlog+0xc2/0x4b0
      [  158.904210]  net_rx_action+0x5cc/0x980
      [  158.908047]  ? napi_complete_done+0x2c0/0x2c0
      [  158.912525]  ? rcu_read_unlock+0x80/0x80
      [  158.916534]  ? __lock_is_held+0x34/0x160
      [  158.920541]  __do_softirq+0x1d4/0x9d2
      [  158.924308]  ? trace_event_raw_event_irq_handler_exit+0x140/0x140
      [  158.930515]  run_ksoftirqd+0x1d/0x40
      [  158.934152]  smpboot_thread_fn+0x32b/0x690
      [  158.938299]  ? sort_range+0x20/0x20
      [  158.941842]  ? preempt_count_sub+0xf/0xd0
      [  158.945940]  ? schedule+0x5b/0x140
      [  158.949412]  kthread+0x206/0x300
      [  158.952689]  ? sort_range+0x20/0x20
      [  158.956249]  ? kthread_stop+0x570/0x570
      [  158.960164]  ret_from_fork+0x3a/0x50
      [  158.963823] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
      [  158.983235] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d3f27110
      [  158.988935] ---[ end trace 5af56ee845aa6cc8 ]---
      [  158.993641] Kernel panic - not syncing: Fatal exception in interrupt
      [  159.000176] Kernel Offset: disabled
      [  159.003767] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      Reproducer:
      
      	ip link add h1 type veth peer name swp1
      	ip link add h3 type veth peer name swp3
      
      	ip link set dev h1 up
      	ip address add 192.0.2.1/28 dev h1
      
      	ip link add dev vh3 type vrf table 20
      	ip link set dev h3 master vh3
      	ip link set dev vh3 up
      	ip link set dev h3 up
      
      	ip link set dev swp3 up
      	ip address add dev swp3 2001:db8:2::1/64
      
      	ip link set dev swp1 up
      	tc qdisc add dev swp1 clsact
      
      	ip link add name gt6 type ip6gretap \
      		local 2001:db8:2::1 remote 2001:db8:2::2
      	ip link set dev gt6 up
      
      	sleep 1
      
      	tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
      		action mirred egress mirror dev gt6
      	ping -I h1 192.0.2.2
      
      Fixes: c12b395a ("gre: Support GRE over IPv6")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • erspan: fix invalid erspan version. · 02f99df1
      William Tu authored
      ERSPAN only supports versions 1 and 2.  When packets are sent to an
      erspan device which does not have a proper version number set,
      drop the packet.  In practice, we observed multicast packets
      sent to the erspan pernet device, erspan0, which does not have
      an erspan version configured.
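
The check amounts to a version whitelist applied before transmit. A minimal sketch follows; the function names are hypothetical, and representing an unconfigured version as 0 is an assumption of this model.

```c
#include <stdbool.h>

/* The two versions the commit message lists as supported. */
static bool erspan_version_valid(int ver)
{
	return ver == 1 || ver == 2;
}

/* Hypothetical transmit decision: true means the packet is dropped. */
static bool erspan_should_drop(int configured_ver)
{
	/* Assumption: an unconfigured device reports version 0. */
	return !erspan_version_valid(configured_ver);
}
```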
      Reported-by: Greg Rose <gvrose8192@gmail.com>
      Signed-off-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ibmvnic-Fix-bugs-and-memory-leaks' · d13d170c
      David S. Miller authored
      Thomas Falcon says:
      
      ====================
      ibmvnic: Fix bugs and memory leaks
      
      This is a small patch series fixing up some bugs and memory leaks
      in the ibmvnic driver. The first fix frees up previously allocated
      memory that should be freed in case of an error. The second fixes
      a reset case that was failing due to TX/RX queue IRQs being
      erroneously disabled without being enabled again. The final patch
      fixes incorrect reallocation of statistics buffers during a device
      reset, resulting in loss of statistics information and a memory leak.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: Fix statistics buffers memory leak · 07184213
      Thomas Falcon authored
      Move initialization of statistics buffers from ibmvnic_init function
      into ibmvnic_probe. In the current state, ibmvnic_init will be called
      again during a device reset, resulting in the allocation of new
      buffers without freeing the old ones.
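
The leak pattern can be reproduced in a small user-space model. Everything below is hypothetical (`struct adapter`, the `*_old`/`*_new` functions, the buffer size); it only illustrates why allocating in a function that is re-run on reset loses the old buffer, and why allocating once at probe time does not.

```c
#include <stdlib.h>

#define NUM_STATS 4

/* Hypothetical adapter model; not the ibmvnic data structures. */
struct adapter {
	long *stats;  /* statistics buffer */
	int allocs;   /* how many buffers were allocated so far */
};

static int alloc_stats(struct adapter *a)
{
	long *buf = calloc(NUM_STATS, sizeof(*buf));

	if (!buf)
		return -1;
	a->stats = buf;  /* any previous buffer pointer is overwritten */
	a->allocs++;
	return 0;
}

/* Old layout: init allocates, and init also runs on every reset. */
static int init_old(struct adapter *a)
{
	return alloc_stats(a);
}

static int probe_old(struct adapter *a)
{
	return init_old(a);
}

/* New layout: init no longer touches the statistics buffer. */
static int init_new(struct adapter *a)
{
	(void)a;
	return 0;
}

/* Probe runs once per device, so the buffer is allocated exactly once. */
static int probe_new(struct adapter *a)
{
	if (alloc_stats(a))
		return -1;
	return init_new(a);
}
```

Calling `init_old()` a second time, as a reset would, yields a second zeroed buffer and drops the only pointer to the first one; with `probe_new()` followed by `init_new()` the counters survive.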
      Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: Fix non-fatal firmware error reset · 134bbe7f
      Thomas Falcon authored
      It is not necessary to disable interrupt lines here during a reset
      to handle a non-fatal firmware error. Move that call within the code
      block that handles the other cases that do require interrupts to be
      disabled and re-enabled.
      Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: Free coherent DMA memory if FW map failed · 4cf2ddf3
      Thomas Falcon authored
      If the firmware map fails for whatever reason, remember to free
      up the memory after.
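
The shape of such an error path, sketched in user space: `malloc()` stands in for the coherent DMA allocation, and `struct dma_buf_model`, `alloc_mapped_buffer()` and the two `fw_map_*()` stubs are hypothetical names invented for this illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a buffer that must be mapped by firmware. */
struct dma_buf_model {
	void *buff;
	size_t size;
};

/* Stand-in for the firmware map request; non-zero means failure. */
typedef int (*fw_map_fn)(void *buff, size_t size);

static int fw_map_ok(void *buff, size_t size)
{
	(void)buff;
	(void)size;
	return 0;
}

static int fw_map_fail(void *buff, size_t size)
{
	(void)buff;
	(void)size;
	return -1;
}

/* Allocate, then map; if the map fails, release the allocation again. */
static int alloc_mapped_buffer(struct dma_buf_model *b, size_t size,
			       fw_map_fn map)
{
	b->buff = malloc(size);
	if (!b->buff)
		return -1;
	b->size = size;

	if (map(b->buff, size)) {
		free(b->buff);  /* the step that was missing */
		b->buff = NULL;
		b->size = 0;
		return -1;
	}
	return 0;
}
```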
      Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv4: Initialize proto and ports in flow struct · 5a847a6e
      David Ahern authored
      Updating the FIB tracepoint for the recent change to allow rules using
      the protocol and ports exposed a few places where the entries in the flow
      struct are not initialized.
      
      For __fib_validate_source, add the call to fib4_rules_early_flow_dissect
      since it is invoked for the input path. For netfilter, add a memset of
      the flow struct to avoid future problems like this. In ip_route_input_slow,
      the fields need to be set if the skb dissection does not happen.
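
The memset approach mentioned for netfilter can be illustrated with a cut-down flow key. `struct flow_key` and `flow_key_init()` are hypothetical; the kernel's flow structures carry more fields than this model.

```c
#include <string.h>

/* Hypothetical, cut-down flow key used only for this illustration. */
struct flow_key {
	unsigned int daddr;
	unsigned int saddr;
	unsigned char proto;
	unsigned short sport;
	unsigned short dport;
};

/* Zero the whole key first so fields the caller never sets are defined. */
static void flow_key_init(struct flow_key *fl, unsigned int saddr,
			  unsigned int daddr)
{
	memset(fl, 0, sizeof(*fl));
	fl->saddr = saddr;
	fl->daddr = daddr;
}
```

Without the `memset()`, `proto`, `sport` and `dport` would hold whatever was on the stack, which is what a rule or tracepoint reading those fields would then see.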
      
      Fixes: bfff4862 ("net: fib_rules: support for match on ip_proto, sport and dport")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: don't use stack memory in a scatterlist · 8ab6ffba
      Matt Mullins authored
      scatterlist code expects virt_to_page() to work, which fails with
      CONFIG_VMAP_STACK=y.
      
      Fixes: c46234eb ("tls: RX path for ktls")
      Signed-off-by: Matt Mullins <mmullins@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 16 May, 2018 15 commits
  3. 15 May, 2018 3 commits
  4. 14 May, 2018 7 commits
    • net/smc: check for missing nlattrs in SMC_PNETID messages · d49baa7e
      Eric Biggers authored
      It's possible to crash the kernel in several different ways by sending
      messages to the SMC_PNETID generic netlink family that are missing the
      expected attributes:
      
      - Missing SMC_PNETID_NAME => null pointer dereference when comparing
        names.
      - Missing SMC_PNETID_ETHNAME => null pointer dereference accessing
        smc_pnetentry::ndev.
      - Missing SMC_PNETID_IBNAME => null pointer dereference accessing
        smc_pnetentry::smcibdev.
      - Missing SMC_PNETID_IBPORT => out of bounds array access to
        smc_ib_device::pattr[-1].
      
      Fix it by validating that all expected attributes are present and that
      SMC_PNETID_IBPORT is nonzero.
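
The validation can be sketched as follows. The model is an assumption-laden simplification: `struct pnet_msg` represents the four parsed attributes as plain pointers that are NULL when the attribute is missing, and the function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical view of a parsed message; NULL means "attribute missing". */
struct pnet_msg {
	const char *name;
	const char *ethname;
	const char *ibname;
	const unsigned char *ibport;
};

/* Returns 0 if every attribute is present and the port is nonzero. */
static int pnet_validate(const struct pnet_msg *m)
{
	if (!m->name || !m->ethname || !m->ibname || !m->ibport)
		return -1;
	if (*m->ibport == 0)
		return -1;  /* port 0 would turn into array index -1 */
	return 0;
}

/* Ports are 1-based; only call this after pnet_validate() succeeded. */
static int pnet_port_index(const struct pnet_msg *m)
{
	return *m->ibport - 1;
}
```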
      
      Reported-by: syzbot+5cd61039dc9b8bfa6e47@syzkaller.appspotmail.com
      Fixes: 6812baab ("smc: establish pnet table management")
      Cc: <stable@vger.kernel.org> # v4.11+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Fix error handling in mlx4_init_port_info. · 57f6f99f
      Tarick Bedeir authored
      Avoid exiting the function with a lingering sysfs file (if the first
      call to device_create_file() fails while the second succeeds), and avoid
      calling devlink_port_unregister() twice.
      
      In other words, either mlx4_init_port_info() succeeds and returns zero, or
      it fails, returns non-zero, and requires no cleanup.
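
The "succeed fully or leave nothing behind" contract is the usual goto-unwind pattern. The sketch below is a user-space model with hypothetical names; the three flags stand in for the port registration and the two sysfs files, and `fail_at` is a test hook that injects a failure at a given step.

```c
/* Hypothetical model: each flag records one resource that was set up. */
struct port_model {
	int registered;
	int file_a;
	int file_b;
};

/* fail_at: 0 = no failure, 1..3 = the step that fails (test hook). */
static int port_init(struct port_model *p, int fail_at)
{
	if (fail_at == 1)
		return -1;
	p->registered = 1;

	if (fail_at == 2)
		goto err_unregister;
	p->file_a = 1;

	if (fail_at == 3)
		goto err_remove_a;
	p->file_b = 1;
	return 0;

err_remove_a:
	p->file_a = 0;      /* undo the first file */
err_unregister:
	p->registered = 0;  /* undo the registration exactly once */
	return -1;
}
```

Whichever step fails, the caller sees a non-zero return and a structure with nothing left to clean up.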
      
      Fixes: 096335b3 ("mlx4_core: Allow dynamic MTU configuration for IB ports")
      Signed-off-by: Tarick Bedeir <tarick@google.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: fix use after free for ptr_ring · b196d88a
      Jason Wang authored
      We used to initialize the ptr_ring during TUNSETIFF because its
      size depends on the tx_queue_len of the netdevice, and we tried to clean
      it up when the socket was detached from the netdevice. A race was spotted
      when the uninit ran during a read, which leads to a use-after-free of the
      pointer ring. Solve this by always initializing a zero-size ptr_ring
      in open() and resizing it during TUNSETIFF, so that cleanup can safely
      be done during close(). With this, there's no need for the workaround
      that was introduced by commit 4df0bfc7 ("tun: fix a memory leak
      for tfile->tx_array").
      
      Reported-by: syzbot+e8b902c3c3fadf0a9dba@syzkaller.appspotmail.com
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Fixes: 1576d986 ("tun: switch to use skb array for tx")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 9d6b4bfb
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-05-14
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix nfp to allow zero-length BPF capabilities, meaning the nfp
         capability parsing loop will otherwise exit early if the last
         capability is zero length and therefore driver will fail to probe
         with an error such as:
      
           nfp: BPF capabilities left after parsing, parsed:92 total length:100
           nfp: invalid BPF capabilities at offset:92
      
         Fix from Jakub.
      
      2) libbpf's bpf_object__open() may return IS_ERR_OR_NULL() and not
         just an error. Fix libbpf's bpf_prog_load_xattr() to handle that
         case as well, also from Jakub.
      ====================
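
Item 2 above hinges on the difference between "error pointer" and "error pointer or NULL". The sketch below re-implements that convention in user space for illustration; the lower-case helper names are hypothetical, and the 4095 bound is the customary maximum errno value used for encoding errors in pointers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the top MAX_ERRNO values of the address space. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* True for encoded errors and for NULL. */
static inline bool is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}
```

A caller that tests only `is_err()` treats a NULL return as success and dereferences it later; testing `is_err_or_null()` covers both failure shapes.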
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 4f6b15c3
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter/IPVS fixes for your net tree,
      they are:
      
      1) Fix handling of simultaneous open TCP connection in conntrack,
         from Jozsef Kadlecsik.
      
      2) Insufficient sanity check of xtables extension names, from
         Florian Westphal.
      
      3) Skip unnecessary synchronize_rcu() call when transaction log
         is already empty, from Florian Westphal.
      
      4) Incorrect destination mac validation in ebt_stp, from Stephen
         Hemminger.
      
      5) xtables module reference counter leak in nft_compat, from
         Florian Westphal.
      
      6) Incorrect connection reference counting logic in IPVS
         one-packet scheduler, from Julian Anastasov.
      
      7) Wrong stats for 32-bit CPUs in IPVS, also from Julian.
      
      8) Calm down sparse error in netfilter core, also from Florian.
      
      9) Use nla_strlcpy to fix compilation warning in nfnetlink_acct
         and nfnetlink_cthelper, again from Florian.
      
      10) Missing module alias in icmp and icmp6 xtables extensions,
          from Florian Westphal.
      
      11) Base chain statistics in nf_tables may be unset/null, from Florian.
      
      12) Fix handling of large matchinfo size in nft_compat, this includes
          one preparatory patch for this fix. From Florian.
      
      13) Fix bogus EBUSY error when deleting chains due to incorrect reference
          counting from the preparation phase of the two-phase commit protocol.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Fix ref-cnt usage count · 91dfd02b
      Michal Kalderon authored
      Rebooting while qedr is loaded with a VLAN interface present
      results in unregister_netdevice waiting for the usage count
      to drop to zero.
      The fix is to remove the rdma devices before unregistering
      the netdevice, to ensure all references to ndev are released.
      
      Fixes: cee9fbd8 ("qede: Add qedr framework")
      Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 3c59x: convert to generic DMA API · 55c82617
      Christoph Hellwig authored
      This driver supports EISA devices in addition to PCI devices, and relied
      on the legacy behavior of the pci_dma* shims to pass on a NULL pointer
      to the DMA API, and the DMA API being able to handle that.  When the
      NULL forwarding was removed, EISA support broke.  Fix this by converting
      to the DMA API instead of the legacy PCI shims.
      
      Fixes: 4167b2ad ("PCI: Remove NULL device handling from PCI DMA API")
      Reported-by: tedheadster <tedheadster@gmail.com>
      Tested-by: tedheadster <tedheadster@gmail.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>