1. 17 Dec, 2018 4 commits
    • Christoph Paasch's avatar
      net: Prevent invalid access to skb->prev in __qdisc_drop_all · d7519c01
      Christoph Paasch authored
      [ Upstream commit 9410d386 ]
      
      __qdisc_drop_all() accesses skb->prev to get to the tail of the
      segment-list.
      
      With commit 68d2f84a ("net: gro: properly remove skb from list")
      the skb-list handling has been changed to set skb->next to NULL and set
      the list-poison on skb->prev.
      
      With that change, __qdisc_drop_all() will panic when it tries to
      dereference skb->prev.
      
      Since commit 992cba7e ("net: Add and use skb_list_del_init().")
      __list_del_entry is used, leaving skb->prev unchanged (thus,
      pointing to the list-head if it's the first skb of the list).
      This will make __qdisc_drop_all modify the next-pointer of the list-head
      and result in a panic later on:
      
      [   34.501053] general protection fault: 0000 [#1] SMP KASAN PTI
      [   34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108
      [   34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
      [   34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90
      [   34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04
      [   34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202
      [   34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6
      [   34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038
      [   34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062
      [   34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000
      [   34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008
      [   34.512082] FS:  0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000
      [   34.513036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0
      [   34.514593] Call Trace:
      [   34.514893]  <IRQ>
      [   34.515157]  napi_gro_receive+0x93/0x150
      [   34.515632]  receive_buf+0x893/0x3700
      [   34.516094]  ? __netif_receive_skb+0x1f/0x1a0
      [   34.516629]  ? virtnet_probe+0x1b40/0x1b40
      [   34.517153]  ? __stable_node_chain+0x4d0/0x850
      [   34.517684]  ? kfree+0x9a/0x180
      [   34.518067]  ? __kasan_slab_free+0x171/0x190
      [   34.518582]  ? detach_buf+0x1df/0x650
      [   34.519061]  ? lapic_next_event+0x5a/0x90
      [   34.519539]  ? virtqueue_get_buf_ctx+0x280/0x7f0
      [   34.520093]  virtnet_poll+0x2df/0xd60
      [   34.520533]  ? receive_buf+0x3700/0x3700
      [   34.521027]  ? qdisc_watchdog_schedule_ns+0xd5/0x140
      [   34.521631]  ? htb_dequeue+0x1817/0x25f0
      [   34.522107]  ? sch_direct_xmit+0x142/0xf30
      [   34.522595]  ? virtqueue_napi_schedule+0x26/0x30
      [   34.523155]  net_rx_action+0x2f6/0xc50
      [   34.523601]  ? napi_complete_done+0x2f0/0x2f0
      [   34.524126]  ? kasan_check_read+0x11/0x20
      [   34.524608]  ? _raw_spin_lock+0x7d/0xd0
      [   34.525070]  ? _raw_spin_lock_bh+0xd0/0xd0
      [   34.525563]  ? kvm_guest_apic_eoi_write+0x6b/0x80
      [   34.526130]  ? apic_ack_irq+0x9e/0xe0
      [   34.526567]  __do_softirq+0x188/0x4b5
      [   34.527015]  irq_exit+0x151/0x180
      [   34.527417]  do_IRQ+0xdb/0x150
      [   34.527783]  common_interrupt+0xf/0xf
      [   34.528223]  </IRQ>
      
      This patch makes sure that skb->prev is set to NULL when entering
      netem_enqueue.
      
      Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 68d2f84a ("net: gro: properly remove skb from list")
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7519c01
    • Heiner Kallweit's avatar
      net: phy: don't allow __set_phy_supported to add unsupported modes · a1ec658a
      Heiner Kallweit authored
      [ Upstream commit d2a36971 ]
      
      Currently __set_phy_supported allows to add modes w/o checking whether
      the PHY supports them. This is wrong, it should never add modes but
      only remove modes we don't want to support.
      
      The commit marked as fixed didn't do anything wrong, it just copied
      existing functionality to the helper which is being fixed now.
      
      Fixes: f3a6bd39 ("phylib: Add phy_set_max_speed helper")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1ec658a
    • Su Yanjun's avatar
      net: 8139cp: fix a BUG triggered by changing mtu with network traffic · 8832e330
      Su Yanjun authored
      [ Upstream commit a5d4a892 ]
      
      When changing mtu many times with traffic, a bug is triggered:
      
      [ 1035.684037] kernel BUG at lib/dynamic_queue_limits.c:26!
      [ 1035.684042] invalid opcode: 0000 [#1] SMP
      [ 1035.684049] Modules linked in: loop binfmt_misc 8139cp(OE) macsec
      tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag tcp_lp
      fuse uinput xt_CHECKSUM iptable_mangle ipt_MASQUERADE
      nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
      nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
      bridge stp llc ebtable_filter ebtables ip6table_filter devlink
      ip6_tables iptable_filter sunrpc snd_hda_codec_generic snd_hda_intel
      snd_hda_codec snd_hda_core snd_hwdep ppdev snd_seq iosf_mbi crc32_pclmul
      parport_pc snd_seq_device ghash_clmulni_intel parport snd_pcm
      aesni_intel joydev lrw snd_timer virtio_balloon sg gf128mul glue_helper
      ablk_helper cryptd snd soundcore i2c_piix4 pcspkr ip_tables xfs
      libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic ata_generic
      [ 1035.684102]  pata_acpi virtio_console qxl drm_kms_helper syscopyarea
      sysfillrect sysimgblt floppy fb_sys_fops crct10dif_pclmul
      crct10dif_common ttm crc32c_intel serio_raw ata_piix drm libata 8139too
      virtio_pci drm_panel_orientation_quirks virtio_ring virtio mii dm_mirror
      dm_region_hash dm_log dm_mod [last unloaded: 8139cp]
      [ 1035.684132] CPU: 9 PID: 25140 Comm: if-mtu-change Kdump: loaded
      Tainted: G           OE  ------------ T 3.10.0-957.el7.x86_64 #1
      [ 1035.684134] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 1035.684136] task: ffff8f59b1f5a080 ti: ffff8f5a2e32c000 task.ti:
      ffff8f5a2e32c000
      [ 1035.684149] RIP: 0010:[<ffffffffba3a40d0>]  [<ffffffffba3a40d0>]
      dql_completed+0x180/0x190
      [ 1035.684162] RSP: 0000:ffff8f5a75483e50  EFLAGS: 00010093
      [ 1035.684162] RAX: 00000000000000c2 RBX: ffff8f5a6f91c000 RCX:
      0000000000000000
      [ 1035.684162] RDX: 0000000000000000 RSI: 0000000000000184 RDI:
      ffff8f599fea3ec0
      [ 1035.684162] RBP: ffff8f5a75483ea8 R08: 00000000000000c2 R09:
      0000000000000000
      [ 1035.684162] R10: 00000000000616ef R11: ffff8f5a75483b56 R12:
      ffff8f599fea3e00
      [ 1035.684162] R13: 0000000000000001 R14: 0000000000000000 R15:
      0000000000000184
      [ 1035.684162] FS:  00007fa8434de740(0000) GS:ffff8f5a75480000(0000)
      knlGS:0000000000000000
      [ 1035.684162] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1035.684162] CR2: 00000000004305d0 CR3: 000000024eb66000 CR4:
      00000000001406e0
      [ 1035.684162] Call Trace:
      [ 1035.684162]  <IRQ>
      [ 1035.684162]  [<ffffffffc08cbaf8>] ? cp_interrupt+0x478/0x580 [8139cp]
      [ 1035.684162]  [<ffffffffba14a294>]
      __handle_irq_event_percpu+0x44/0x1c0
      [ 1035.684162]  [<ffffffffba14a442>] handle_irq_event_percpu+0x32/0x80
      [ 1035.684162]  [<ffffffffba14a4cc>] handle_irq_event+0x3c/0x60
      [ 1035.684162]  [<ffffffffba14db29>] handle_fasteoi_irq+0x59/0x110
      [ 1035.684162]  [<ffffffffba02e554>] handle_irq+0xe4/0x1a0
      [ 1035.684162]  [<ffffffffba7795dd>] do_IRQ+0x4d/0xf0
      [ 1035.684162]  [<ffffffffba76b362>] common_interrupt+0x162/0x162
      [ 1035.684162]  <EOI>
      [ 1035.684162]  [<ffffffffba0c2ae4>] ? __wake_up_bit+0x24/0x70
      [ 1035.684162]  [<ffffffffba1e46f5>] ? do_set_pte+0xd5/0x120
      [ 1035.684162]  [<ffffffffba1b64fb>] unlock_page+0x2b/0x30
      [ 1035.684162]  [<ffffffffba1e4879>] do_read_fault.isra.61+0x139/0x1b0
      [ 1035.684162]  [<ffffffffba1e9134>] handle_pte_fault+0x2f4/0xd10
      [ 1035.684162]  [<ffffffffba1ebc6d>] handle_mm_fault+0x39d/0x9b0
      [ 1035.684162]  [<ffffffffba76f5e3>] __do_page_fault+0x203/0x500
      [ 1035.684162]  [<ffffffffba76f9c6>] trace_do_page_fault+0x56/0x150
      [ 1035.684162]  [<ffffffffba76ef42>] do_async_page_fault+0x22/0xf0
      [ 1035.684162]  [<ffffffffba76b788>] async_page_fault+0x28/0x30
      [ 1035.684162] Code: 54 c7 47 54 ff ff ff ff 44 0f 49 ce 48 8b 35 48 2f
      9c 00 48 89 77 58 e9 fe fe ff ff 0f 1f 80 00 00 00 00 41 89 d1 e9 ef fe
      ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 8d 42 ff 48
      [ 1035.684162] RIP  [<ffffffffba3a40d0>] dql_completed+0x180/0x190
      [ 1035.684162]  RSP <ffff8f5a75483e50>
      
      It's not the same as in 7fe0ee09 patch described.
      As 8139cp uses shared irq mode, other device irq will trigger
      cp_interrupt to execute.
      
      cp_change_mtu
       -> cp_close
       -> cp_open
      
      In cp_close routine  just before free_irq(), some interrupt may occur.
      In my environment, cp_interrupt exectutes and IntrStatus is 0x4,
      exactly TxOk. That will cause cp_tx to wake device queue.
      
      As device queue is started, cp_start_xmit and cp_open will run at same
      time which will cause kernel BUG.
      
      For example:
      [#] for tx descriptor
      
      At start:
      
      [#][#][#]
      num_queued=3
      
      After cp_init_hw->cp_start_hw->netdev_reset_queue:
      
      [#][#][#]
      num_queued=0
      
      When 8139cp starts to work then cp_tx will check
      num_queued mismatchs the complete_bytes.
      
      The patch will check IntrMask before check IntrStatus in cp_interrupt.
      When 8139cp interrupt is disabled, just return.
      Signed-off-by: default avatarSu Yanjun <suyj.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8832e330
    • Stefano Brivio's avatar
      ipv6: Check available headroom in ip6_xmit() even without options · 298e1144
      Stefano Brivio authored
      [ Upstream commit 66033f47 ]
      
      Even if we send an IPv6 packet without options, MAX_HEADER might not be
      enough to account for the additional headroom required by alignment of
      hardware headers.
      
      On a configuration without HYPERV_NET, WLAN, AX25, and with IPV6_TUNNEL,
      sending short SCTP packets over IPv4 over L2TP over IPv6, we start with
      100 bytes of allocated headroom in sctp_packet_transmit(), end up with 54
      bytes after l2tp_xmit_skb(), and 14 bytes in ip6_finish_output2().
      
      Those would be enough to append our 14 bytes header, but we're going to
      align that to 16 bytes, and write 2 bytes out of the allocated slab in
      neigh_hh_output().
      
      KASan says:
      
      [  264.967848] ==================================================================
      [  264.967861] BUG: KASAN: slab-out-of-bounds in ip6_finish_output2+0x1aec/0x1c70
      [  264.967866] Write of size 16 at addr 000000006af1c7fe by task netperf/6201
      [  264.967870]
      [  264.967876] CPU: 0 PID: 6201 Comm: netperf Not tainted 4.20.0-rc4+ #1
      [  264.967881] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
      [  264.967887] Call Trace:
      [  264.967896] ([<00000000001347d6>] show_stack+0x56/0xa0)
      [  264.967903]  [<00000000017e379c>] dump_stack+0x23c/0x290
      [  264.967912]  [<00000000007bc594>] print_address_description+0xf4/0x290
      [  264.967919]  [<00000000007bc8fc>] kasan_report+0x13c/0x240
      [  264.967927]  [<000000000162f5e4>] ip6_finish_output2+0x1aec/0x1c70
      [  264.967935]  [<000000000163f890>] ip6_finish_output+0x430/0x7f0
      [  264.967943]  [<000000000163fe44>] ip6_output+0x1f4/0x580
      [  264.967953]  [<000000000163882a>] ip6_xmit+0xfea/0x1ce8
      [  264.967963]  [<00000000017396e2>] inet6_csk_xmit+0x282/0x3f8
      [  264.968033]  [<000003ff805fb0ba>] l2tp_xmit_skb+0xe02/0x13e0 [l2tp_core]
      [  264.968037]  [<000003ff80631192>] l2tp_eth_dev_xmit+0xda/0x150 [l2tp_eth]
      [  264.968041]  [<0000000001220020>] dev_hard_start_xmit+0x268/0x928
      [  264.968069]  [<0000000001330e8e>] sch_direct_xmit+0x7ae/0x1350
      [  264.968071]  [<000000000122359c>] __dev_queue_xmit+0x2b7c/0x3478
      [  264.968075]  [<00000000013d2862>] ip_finish_output2+0xce2/0x11a0
      [  264.968078]  [<00000000013d9b14>] ip_finish_output+0x56c/0x8c8
      [  264.968081]  [<00000000013ddd1e>] ip_output+0x226/0x4c0
      [  264.968083]  [<00000000013dbd6c>] __ip_queue_xmit+0x894/0x1938
      [  264.968100]  [<000003ff80bc3a5c>] sctp_packet_transmit+0x29d4/0x3648 [sctp]
      [  264.968116]  [<000003ff80b7bf68>] sctp_outq_flush_ctrl.constprop.5+0x8d0/0xe50 [sctp]
      [  264.968131]  [<000003ff80b7c716>] sctp_outq_flush+0x22e/0x7d8 [sctp]
      [  264.968146]  [<000003ff80b35c68>] sctp_cmd_interpreter.isra.16+0x530/0x6800 [sctp]
      [  264.968161]  [<000003ff80b3410a>] sctp_do_sm+0x222/0x648 [sctp]
      [  264.968177]  [<000003ff80bbddac>] sctp_primitive_ASSOCIATE+0xbc/0xf8 [sctp]
      [  264.968192]  [<000003ff80b93328>] __sctp_connect+0x830/0xc20 [sctp]
      [  264.968208]  [<000003ff80bb11ce>] sctp_inet_connect+0x2e6/0x378 [sctp]
      [  264.968212]  [<0000000001197942>] __sys_connect+0x21a/0x450
      [  264.968215]  [<000000000119aff8>] sys_socketcall+0x3d0/0xb08
      [  264.968218]  [<000000000184ea7a>] system_call+0x2a2/0x2c0
      
      [...]
      
      Just like ip_finish_output2() does for IPv4, check that we have enough
      headroom in ip6_xmit(), and reallocate it if we don't.
      
      This issue is older than git history.
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      298e1144
  2. 13 Dec, 2018 36 commits