1. 20 Apr, 2016 40 commits
    • ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates · 26dd42eb
      Eric Dumazet authored
      [ Upstream commit 2d421226 ]
      
      IPv6 counters updates use a different macro than IPv4.
      
      Fixes: 36cbb245 ("udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicasts")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Rick Jones <rick.jones2@hp.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • qmi_wwan: add "D-Link DWM-221 B1" device id · 9603d0a5
      Bjørn Mork authored
      [ Upstream commit e84810c7 ]
      
      Thomas reports:
      "Windows:
      
      00 diagnostics
      01 modem
      02 at-port
      03 nmea
      04 nic
      
      Linux:
      
      T:  Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=2001 ProdID=7e19 Rev=02.32
      S:  Manufacturer=Mobile Connect
      S:  Product=Mobile Connect
      S:  SerialNumber=0123456789ABCDEF
      C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      I:  If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage"
      Reported-by: Thomas Schäfer <tschaefer@t-online.de>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm: Fix crash observed during device unregistration and decryption · 759e8f38
      subashab@codeaurora.org authored
      [ Upstream commit 071d36bf ]
      
      A crash is observed when a decrypted packet is processed in the receive
      path. get_rps_cpu() tries to dereference the skb->dev fields, but the
      poison pattern in the dump below suggests the device was already freed.
      
      [<ffffffc000af58ec>] get_rps_cpu+0x94/0x2f0
      [<ffffffc000af5f94>] netif_rx_internal+0x140/0x1cc
      [<ffffffc000af6094>] netif_rx+0x74/0x94
      [<ffffffc000bc0b6c>] xfrm_input+0x754/0x7d0
      [<ffffffc000bc0bf8>] xfrm_input_resume+0x10/0x1c
      [<ffffffc000ba6eb8>] esp_input_done+0x20/0x30
      [<ffffffc0000b64c8>] process_one_work+0x244/0x3fc
      [<ffffffc0000b7324>] worker_thread+0x2f8/0x418
      [<ffffffc0000bb40c>] kthread+0xe0/0xec
      
      -013|get_rps_cpu(
           |    dev = 0xFFFFFFC08B688000,
           |    skb = 0xFFFFFFC0C76AAC00 -> (
           |      dev = 0xFFFFFFC08B688000 -> (
           |        name =
      "......................................................
           |        name_hlist = (next = 0xAAAAAAAAAAAAAAAA, pprev =
      0xAAAAAAAAAAA
      
      The following sequence of events was observed:
      
      - Encrypted packet in the receive path from the netdevice is queued
      - Encrypted packet is queued for decryption (asynchronous)
      - Netdevice is brought down and freed
      - Packet is decrypted and returned through the callback in esp_input_done
      - Packet is queued again for processing in the network stack using netif_rx
      
      Since the device appears to have been freed, the dereference of
      skb->dev in get_rps_cpu() leads to an unhandled page fault
      exception.
      
      Fix this by holding a device reference when queueing packets
      asynchronously and releasing it when the callback returns.
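
      The hold/release pattern described above can be sketched as a small
      userspace model. This is only an illustration: every toy_* name is
      hypothetical, the counter is not atomic, and none of it is the kernel's
      actual xfrm code.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted device; not struct net_device. */
struct toy_dev {
    int refcnt;   /* number of outstanding references */
    int freed;    /* set once the last reference is dropped */
};

struct toy_pkt {
    struct toy_dev *dev;
};

static void toy_dev_hold(struct toy_dev *dev)
{
    dev->refcnt++;
}

static void toy_dev_put(struct toy_dev *dev)
{
    assert(dev->refcnt > 0);
    if (--dev->refcnt == 0)
        dev->freed = 1;   /* a real implementation would free here */
}

/* Queue a packet for asynchronous work: pin the device first. */
static void toy_async_submit(struct toy_pkt *pkt)
{
    toy_dev_hold(pkt->dev);
}

/* Completion callback: use the device, then drop the pinned reference.
 * Returns 0 if the device was still alive when used, -1 otherwise. */
static int toy_async_done(struct toy_pkt *pkt)
{
    int ok = pkt->dev->freed ? -1 : 0;

    toy_dev_put(pkt->dev);
    return ok;
}
```

      In this model the owner can drop its own reference while the packet is
      in flight, and the device only counts as freed after the callback has
      released the pinned reference.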
      
      v2: Make the change generic to xfrm as mentioned by Steffen and
      update the title to xfrm
      Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Jerome Stanislaus <jeromes@codeaurora.org>
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ppp: take reference on channels netns · 046ea818
      Guillaume Nault authored
      [ Upstream commit 1f461dcd ]
      
      Let channels hold a reference on their network namespace.
      Some channel types, like ppp_async and ppp_synctty, can have their
      userspace controller running in a different namespace. Therefore they
      can't rely on that controller to keep their netns from being removed
      from under them.
      
      ==================================================================
      BUG: KASAN: use-after-free in ppp_unregister_channel+0x372/0x3a0 at
      addr ffff880064e217e0
      Read of size 8 by task syz-executor/11581
      =============================================================================
      BUG net_namespace (Not tainted): kasan: bad access detected
      -----------------------------------------------------------------------------
      
      Disabling lock debugging due to kernel taint
      INFO: Allocated in copy_net_ns+0x6b/0x1a0 age=92569 cpu=3 pid=6906
      [<      none      >] ___slab_alloc+0x4c7/0x500 kernel/mm/slub.c:2440
      [<      none      >] __slab_alloc+0x4c/0x90 kernel/mm/slub.c:2469
      [<     inline     >] slab_alloc_node kernel/mm/slub.c:2532
      [<     inline     >] slab_alloc kernel/mm/slub.c:2574
      [<      none      >] kmem_cache_alloc+0x23a/0x2b0 kernel/mm/slub.c:2579
      [<     inline     >] kmem_cache_zalloc kernel/include/linux/slab.h:597
      [<     inline     >] net_alloc kernel/net/core/net_namespace.c:325
      [<      none      >] copy_net_ns+0x6b/0x1a0 kernel/net/core/net_namespace.c:360
      [<      none      >] create_new_namespaces+0x2f6/0x610 kernel/kernel/nsproxy.c:95
      [<      none      >] copy_namespaces+0x297/0x320 kernel/kernel/nsproxy.c:150
      [<      none      >] copy_process.part.35+0x1bf4/0x5760 kernel/kernel/fork.c:1451
      [<     inline     >] copy_process kernel/kernel/fork.c:1274
      [<      none      >] _do_fork+0x1bc/0xcb0 kernel/kernel/fork.c:1723
      [<     inline     >] SYSC_clone kernel/kernel/fork.c:1832
      [<      none      >] SyS_clone+0x37/0x50 kernel/kernel/fork.c:1826
      [<      none      >] entry_SYSCALL_64_fastpath+0x16/0x7a kernel/arch/x86/entry/entry_64.S:185
      
      INFO: Freed in net_drop_ns+0x67/0x80 age=575 cpu=2 pid=2631
      [<      none      >] __slab_free+0x1fc/0x320 kernel/mm/slub.c:2650
      [<     inline     >] slab_free kernel/mm/slub.c:2805
      [<      none      >] kmem_cache_free+0x2a0/0x330 kernel/mm/slub.c:2814
      [<     inline     >] net_free kernel/net/core/net_namespace.c:341
      [<      none      >] net_drop_ns+0x67/0x80 kernel/net/core/net_namespace.c:348
      [<      none      >] cleanup_net+0x4e5/0x600 kernel/net/core/net_namespace.c:448
      [<      none      >] process_one_work+0x794/0x1440 kernel/kernel/workqueue.c:2036
      [<      none      >] worker_thread+0xdb/0xfc0 kernel/kernel/workqueue.c:2170
      [<      none      >] kthread+0x23f/0x2d0 kernel/drivers/block/aoe/aoecmd.c:1303
      [<      none      >] ret_from_fork+0x3f/0x70 kernel/arch/x86/entry/entry_64.S:468
      INFO: Slab 0xffffea0001938800 objects=3 used=0 fp=0xffff880064e20000
      flags=0x5fffc0000004080
      INFO: Object 0xffff880064e20000 @offset=0 fp=0xffff880064e24200
      
      CPU: 1 PID: 11581 Comm: syz-executor Tainted: G    B           4.4.0+
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
       00000000ffffffff ffff8800662c7790 ffffffff8292049d ffff88003e36a300
       ffff880064e20000 ffff880064e20000 ffff8800662c77c0 ffffffff816f2054
       ffff88003e36a300 ffffea0001938800 ffff880064e20000 0000000000000000
      Call Trace:
       [<     inline     >] __dump_stack kernel/lib/dump_stack.c:15
       [<ffffffff8292049d>] dump_stack+0x6f/0xa2 kernel/lib/dump_stack.c:50
       [<ffffffff816f2054>] print_trailer+0xf4/0x150 kernel/mm/slub.c:654
       [<ffffffff816f875f>] object_err+0x2f/0x40 kernel/mm/slub.c:661
       [<     inline     >] print_address_description kernel/mm/kasan/report.c:138
       [<ffffffff816fb0c5>] kasan_report_error+0x215/0x530 kernel/mm/kasan/report.c:236
       [<     inline     >] kasan_report kernel/mm/kasan/report.c:259
       [<ffffffff816fb4de>] __asan_report_load8_noabort+0x3e/0x40 kernel/mm/kasan/report.c:280
       [<     inline     >] ? ppp_pernet kernel/include/linux/compiler.h:218
       [<ffffffff83ad71b2>] ? ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392
       [<     inline     >] ppp_pernet kernel/include/linux/compiler.h:218
       [<ffffffff83ad71b2>] ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392
       [<     inline     >] ? ppp_pernet kernel/drivers/net/ppp/ppp_generic.c:293
       [<ffffffff83ad6f26>] ? ppp_unregister_channel+0xe6/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392
       [<ffffffff83ae18f3>] ppp_asynctty_close+0xa3/0x130 kernel/drivers/net/ppp/ppp_async.c:241
       [<ffffffff83ae1850>] ? async_lcp_peek+0x5b0/0x5b0 kernel/drivers/net/ppp/ppp_async.c:1000
       [<ffffffff82c33239>] tty_ldisc_close.isra.1+0x99/0xe0 kernel/drivers/tty/tty_ldisc.c:478
       [<ffffffff82c332c0>] tty_ldisc_kill+0x40/0x170 kernel/drivers/tty/tty_ldisc.c:744
       [<ffffffff82c34943>] tty_ldisc_release+0x1b3/0x260 kernel/drivers/tty/tty_ldisc.c:772
       [<ffffffff82c1ef21>] tty_release+0xac1/0x13e0 kernel/drivers/tty/tty_io.c:1901
       [<ffffffff82c1e460>] ? release_tty+0x320/0x320 kernel/drivers/tty/tty_io.c:1688
       [<ffffffff8174de36>] __fput+0x236/0x780 kernel/fs/file_table.c:208
       [<ffffffff8174e405>] ____fput+0x15/0x20 kernel/fs/file_table.c:244
       [<ffffffff813595ab>] task_work_run+0x16b/0x200 kernel/kernel/task_work.c:115
       [<     inline     >] exit_task_work kernel/include/linux/task_work.h:21
       [<ffffffff81307105>] do_exit+0x8b5/0x2c60 kernel/kernel/exit.c:750
       [<ffffffff813fdd20>] ? debug_check_no_locks_freed+0x290/0x290 kernel/kernel/locking/lockdep.c:4123
       [<ffffffff81306850>] ? mm_update_next_owner+0x6f0/0x6f0 kernel/kernel/exit.c:357
       [<ffffffff813215e6>] ? __dequeue_signal+0x136/0x470 kernel/kernel/signal.c:550
       [<ffffffff8132067b>] ? recalc_sigpending_tsk+0x13b/0x180 kernel/kernel/signal.c:145
       [<ffffffff81309628>] do_group_exit+0x108/0x330 kernel/kernel/exit.c:880
       [<ffffffff8132b9d4>] get_signal+0x5e4/0x14f0 kernel/kernel/signal.c:2307
       [<     inline     >] ? kretprobe_table_lock kernel/kernel/kprobes.c:1113
       [<ffffffff8151d355>] ? kprobe_flush_task+0xb5/0x450 kernel/kernel/kprobes.c:1158
       [<ffffffff8115f7d3>] do_signal+0x83/0x1c90 kernel/arch/x86/kernel/signal.c:712
       [<ffffffff8151d2a0>] ? recycle_rp_inst+0x310/0x310 kernel/include/linux/list.h:655
       [<ffffffff8115f750>] ? setup_sigcontext+0x780/0x780 kernel/arch/x86/kernel/signal.c:165
       [<ffffffff81380864>] ? finish_task_switch+0x424/0x5f0 kernel/kernel/sched/core.c:2692
       [<     inline     >] ? finish_lock_switch kernel/kernel/sched/sched.h:1099
       [<ffffffff81380560>] ? finish_task_switch+0x120/0x5f0 kernel/kernel/sched/core.c:2678
       [<     inline     >] ? context_switch kernel/kernel/sched/core.c:2807
       [<ffffffff85d794e9>] ? __schedule+0x919/0x1bd0 kernel/kernel/sched/core.c:3283
       [<ffffffff81003901>] exit_to_usermode_loop+0xf1/0x1a0 kernel/arch/x86/entry/common.c:247
       [<     inline     >] prepare_exit_to_usermode kernel/arch/x86/entry/common.c:282
       [<ffffffff810062ef>] syscall_return_slowpath+0x19f/0x210 kernel/arch/x86/entry/common.c:344
       [<ffffffff85d88022>] int_ret_from_sys_call+0x25/0x9f kernel/arch/x86/entry/entry_64.S:281
      Memory state around the buggy address:
       ffff880064e21680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff880064e21700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff880064e21780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                             ^
       ffff880064e21800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff880064e21880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      
      Fixes: 273ec51d ("net: ppp_generic - introduce net-namespace functionality v2")
      Reported-by: Baozeng Ding <sploving1@gmail.com>
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv4: initialize flowi4_flags before calling fib_lookup() · 80de2e41
      Lance Richardson authored
      [ Upstream commit 4cfc86f3 ]
      
      Field fl4.flowi4_flags is not initialized in fib_compute_spec_dst()
      before calling fib_lookup(), which means fib_table_lookup() is
      using non-deterministic data at this line:
      
      	if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {
      
      Fix by initializing the entire fl4 structure, which will prevent
      similar issues as fields are added in the future by ensuring that
      all fields are initialized to zero unless explicitly initialized
      to another value.
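
      As a rough illustration of the "zero everything, then set what you know"
      approach, here is a userspace sketch. struct toy_flow and toy_flow_init()
      are hypothetical and only stand in for the lookup key; the real patch may
      well use a designated initializer instead of memset().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup key; not the kernel's struct flowi4. */
struct toy_flow {
    int oif;
    unsigned int daddr;
    unsigned char flags;   /* a later-added field callers may forget */
};

/* Zero the whole key first, then set only the fields this caller knows. */
static void toy_flow_init(struct toy_flow *fl, int oif, unsigned int daddr)
{
    assert(fl != NULL);
    memset(fl, 0, sizeof(*fl));
    fl->oif = oif;
    fl->daddr = daddr;
}
```

      Any field added to the structure later starts out as zero without
      touching this caller.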
      
      Fixes: 58189ca7 ("net: Fix vti use case with oif in dst lookups")
      Suggested-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: Lance Richardson <lrichard@redhat.com>
      Acked-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv4: fix broadcast packets reception · 2ddb1813
      Paolo Abeni authored
      [ Upstream commit ad0ea198 ]
      
      Currently, ingress ipv4 broadcast datagrams are dropped since,
      in udp_v4_early_demux(), ip_check_mc_rcu() is invoked even on
      bcast packets.
      
      This patch addresses the issue, invoking ip_check_mc_rcu()
      only for mcast packets.
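
      The distinction can be sketched in userspace as follows. Addresses are
      in host byte order, and toy_group_joined() is a hypothetical stand-in
      for the membership check; this is not the code in udp_v4_early_demux().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 224.0.0.0/4 is the IPv4 multicast range (address in host byte order). */
static bool toy_is_multicast(uint32_t addr)
{
    return (addr & 0xf0000000u) == 0xe0000000u;
}

/* Hypothetical membership check: pretend only 224.0.0.1 is joined. */
static bool toy_group_joined(uint32_t group)
{
    assert(toy_is_multicast(group));
    return group == 0xe0000001u;
}

/* Apply the membership check to multicast destinations only. */
static bool toy_early_demux_accept(uint32_t daddr)
{
    if (toy_is_multicast(daddr))
        return toy_group_joined(daddr);
    return true;   /* broadcast: no group membership applies */
}
```

      With this shape, 255.255.255.255 is never handed to the multicast
      membership check, so it cannot be dropped by it.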
      
      Fixes: 6e540309 ("ipv4/udp: Verify multicast group is ours in upd_v4_early_demux()")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bonding: fix bond_get_stats() · 8178211e
      Eric Dumazet authored
      [ Upstream commit fe30937b ]
      
      bond_get_stats() can be called from rtnetlink (with RTNL held)
      or from the /proc/net/dev seq handler (with RCU held).
      
      The logic added in commit 5f0c5f73 ("bonding: make global bonding
      stats more reliable") kind of assumed only one cpu could run there.
      
      If multiple threads are reading /proc/net/dev, stats can be really
      messed up after a while.
      
      A second problem is that some fields are 32bit, so we need to properly
      handle the wrap around problem.
      
      Given that RTNL is not always held, we need to use
      bond_for_each_slave_rcu().
      
      Fixes: 5f0c5f73 ("bonding: make global bonding stats more reliable")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: bcmgenet: fix dma api length mismatch · 7a0e9a08
      Eric Dumazet authored
      [ Upstream commit eee57723 ]
      
      When un-mapping skb->data in __bcmgenet_tx_reclaim(),
      we must use the length that was used in the original dma_map_single(),
      instead of skb->len, which might be bigger (it includes the frags).
      
      We can simply store skb_len into tx_cb_ptr->dma_len and use it
      at unmap time.
      
      Fixes: 1c1008c7 ("net: bcmgenet: add main driver file")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • qlge: Fix receive packets drop. · a5ce25f6
      Manish Chopra authored
      [ Upstream commit 2c9a266a ]
      
      When running small-packet traffic [length < 256 bytes], packets were
      being dropped because the driver delivered packets containing invalid
      data up to the stack. Using pci_dma_sync_single_for_cpu ensures that the
      latest data is copied from the receive buffer into the skb.
      Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
      Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp/dccp: remove obsolete WARN_ON() in icmp handlers · bd33d14a
      Eric Dumazet authored
      [ Upstream commit e316ea62 ]
      
      Now that SYN_RECV request sockets are installed in ehash table, an ICMP
      handler can find a request socket while another cpu handles an incoming
      packet transforming this SYN_RECV request socket into an ESTABLISHED
      socket.
      
      We need to remove the now obsolete WARN_ON(req->sk), since req->sk
      is set when a new child is created and added into listener accept queue.
      
      If this race happens, the ICMP will do nothing special.
      
      Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Ben Lazarus <blazarus@google.com>
      Reported-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ppp: ensure file->private_data can't be overridden · 029464a3
      Guillaume Nault authored
      [ Upstream commit e8e56ffd ]
      
      Locking ppp_mutex must be done before dereferencing file->private_data,
      otherwise it could be modified before ppp_unattached_ioctl() takes the
      lock. This could lead ppp_unattached_ioctl() to override ->private_data,
      thus leaking reference to the ppp_file previously pointed to.
      
      v2: lock all ppp_ioctl() instead of just checking private_data in
          ppp_unattached_ioctl(), to avoid ambiguous behaviour.
      
      Fixes: f3ff8a4d ("ppp: push BKL down into the driver")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ath9k: fix buffer overrun for ar9287 · a317579b
      Arnd Bergmann authored
      [ Upstream commit 83d6f1f1 ]
      
      Code that was added back in 2.6.38 has an obvious overflow
      when accessing a static array, and at the time it was added
      only a code comment was put in front of it as a reminder
      to have it reviewed properly.
      
      This has not happened, but gcc-6 now points to the specific
      overflow:
      
      drivers/net/wireless/ath/ath9k/eeprom.c: In function 'ath9k_hw_get_gain_boundaries_pdadcs':
      drivers/net/wireless/ath/ath9k/eeprom.c:483:44: error: array subscript is above array bounds [-Werror=array-bounds]
           maxPwrT4[i] = data_9287[idxL].pwrPdg[i][4];
                         ~~~~~~~~~~~~~~~~~~~~~~~~~^~~
      
      It turns out that the correct array length exists in the local
      'intercepts' variable of this function, so we can just use that
      instead of hardcoding '4', so this patch changes all three
      instances to use that variable. The other two instances were
      already correct, but it's more consistent this way.
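
      The general idea of deriving the index from the array length rather
      than hardcoding it can be sketched like this. The names are hypothetical
      and the table width is an assumption; this is not the ath9k code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INTERCEPTS 4   /* hypothetical number of entries per row */

/* Return the last valid entry of a row. The index comes from the row
 * length, so it stays in bounds if the length differs per chip. */
static uint8_t toy_last_entry(const uint8_t *row, size_t intercepts)
{
    assert(row != NULL && intercepts > 0);
    return row[intercepts - 1];
}
```

      A hardcoded index of 4 would read one element past a four-entry row,
      which is the kind of access the compiler warning above points at.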
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 940cd2c1 ("ath9k_hw: merge the ar9287 version of ath9k_hw_get_gain_boundaries_pdadcs")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • farsync: fix off-by-one bug in fst_add_one · 6e6ede49
      Arnd Bergmann authored
      [ Upstream commit e725a66c ]
      
      gcc-6 finds an out of bounds access in the fst_add_one function
      when calculating the end of the mmio area:
      
      drivers/net/wan/farsync.c: In function 'fst_add_one':
      drivers/net/wan/farsync.c:418:53: error: index 2 denotes an offset greater than size of 'u8[2][8192] {aka unsigned char[2][8192]}' [-Werror=array-bounds]
       #define BUF_OFFSET(X)   (BFM_BASE + offsetof(struct buf_window, X))
                                                           ^
      include/linux/compiler-gcc.h:158:21: note: in definition of macro '__compiler_offsetof'
        __builtin_offsetof(a, b)
                           ^
      drivers/net/wan/farsync.c:418:37: note: in expansion of macro 'offsetof'
       #define BUF_OFFSET(X)   (BFM_BASE + offsetof(struct buf_window, X))
                                           ^~~~~~~~
      drivers/net/wan/farsync.c:2519:36: note: in expansion of macro 'BUF_OFFSET'
                                        + BUF_OFFSET ( txBuffer[i][NUM_TX_BUFFER][0]);
                                          ^~~~~~~~~~
      
      The warning is correct, but not critical because this appears
      to be a write-only variable that is set by each WAN driver but
      never accessed afterwards.
      
      I'm taking the minimal fix here, using the correct pointer by
      pointing 'mem_end' to the last byte inside of the register area
      as all other WAN drivers do, rather than the first byte outside of
      it. An alternative would be to just remove the mem_end member
      entirely.
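
      The difference between "last byte inside" and "first byte outside" can
      be shown with a toy layout. struct toy_window and its sizes are
      hypothetical and much smaller than the real buffer window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_BUF 2
#define TOY_BUF_LEN 64

/* Hypothetical memory window; not the driver's struct buf_window. */
struct toy_window {
    uint8_t rx[TOY_NUM_BUF][TOY_BUF_LEN];
    uint8_t tx[TOY_NUM_BUF][TOY_BUF_LEN];
};

/* Inclusive end: offset of the last byte that is still inside the window.
 * Every index used here is a valid element of the array. */
static size_t toy_mem_end(void)
{
    size_t end = offsetof(struct toy_window,
                          tx[TOY_NUM_BUF - 1][TOY_BUF_LEN - 1]);

    assert(end < sizeof(struct toy_window));
    return end;
}
```

      Writing tx[TOY_NUM_BUF][0] instead would name the first byte past the
      window, using an index outside the array, which is what gcc-6 reports.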
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mlx4: add missing braces in verify_qp_parameters · 13684fe9
      Arnd Bergmann authored
      [ Upstream commit baefd701 ]
      
      The implementation of QP paravirtualization back in linux-3.7 included
      some code that looks very dubious, and gcc-6 has grown smart enough
      to warn about it:
      
      drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'verify_qp_parameters':
      drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3154:5: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]
           if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
           ^~
      drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3144:4: note: ...this 'if' clause, but it is not
          if (slave != mlx4_master_func_num(dev))
      
      From looking at the context, I'm reasonably sure that the indentation
      is correct but that it should have contained curly braces from the
      start, as the update_gid() function in the same patch correctly does.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 54679e14 ("mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: Fix use after free in the recvmmsg exit path · 405f10a3
      Arnaldo Carvalho de Melo authored
      [ Upstream commit 34b88a68 ]
      
      The syzkaller fuzzer hit the following use-after-free:
      
        Call Trace:
         [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295
         [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
         [<     inline     >] SYSC_recvmmsg net/socket.c:2281
         [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
         [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
        arch/x86/entry/entry_64.S:185
      
      And, as Dmitry rightly assessed, that is because we can drop the
      reference and then touch it when the underlying recvmsg calls return
      some packets and then hit an error, which makes recvmmsg set
      sock->sk->sk_err. Fix it.
      Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Fixes: a2e27255 ("net: Introduce recvmmsg socket syscall")
      http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv4: Don't do expensive useless work during inetdev destroy. · 54789759
      David S. Miller authored
      [ Upstream commit fbd40ea0 ]
      
      When an inetdev is destroyed, every address assigned to the interface
      is removed.  And in this scenario we do two pointless things which can
      be very expensive if the number of assigned interfaces is large:
      
      1) Address promotion.  We are deleting all addresses, so there is no
         point in doing this.
      
      2) A full nf conntrack table purge for every address.  We only need to
         do this once, as is already caught by the existing
         masq_dev_notifier so masq_inet_event() can skip this.
      Reported-by: Solar Designer <solar@openwall.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bridge: allow zero ageing time · acbea202
      Stephen Hemminger authored
      [ Upstream commit 4c656c13 ]
      
      This fixes a regression in the bridge ageing time caused by:
      commit c62987bb ("bridge: push bridge setting ageing_time down to switchdev")
      
      There are users of Linux bridge which use the feature that if ageing time
      is set to 0 it causes entries to never expire. See:
        https://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
      
      For a pure software bridge, it is unnecessary for the code to have
      arbitrary restrictions on what values are allowable.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rocker: set FDB cleanup timer according to lowest ageing time · c3d8f507
      Ido Schimmel authored
      [ Upstream commit 88de1cd4 ]
      
      In rocker, ageing time is a per-port attribute, so the next time the FDB
      cleanup timer fires should be set according to the lowest ageing time.
      
      This will later allow us to delete the BR_MIN_AGEING_TIME macro, which was
      added to guarantee minimum ageing time in the bridge layer, thereby breaking
      existing behavior.
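
      Choosing the timer interval from the lowest per-port ageing time could
      look roughly like the sketch below. toy_lowest_ageing() and its
      fallback parameter are hypothetical; the driver's own bookkeeping is
      not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Return the smallest per-port ageing time, or fallback if there are no
 * ports. The result bounds how long the cleanup timer may sleep. */
static unsigned long toy_lowest_ageing(const unsigned long *ageing,
                                       size_t nports,
                                       unsigned long fallback)
{
    unsigned long lowest = fallback;
    size_t i;

    assert(ageing != NULL || nports == 0);
    for (i = 0; i < nports; i++) {
        if (i == 0 || ageing[i] < lowest)
            lowest = ageing[i];
    }
    return lowest;
}
```

      Re-arming the timer with this value means the port with the shortest
      ageing time has its entries examined in time.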
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mlxsw: spectrum: Check requested ageing time is valid · 7d870cff
      Ido Schimmel authored
      [ Upstream commit 869f63a4 ]
      
      Commit c62987bb ("bridge: push bridge setting ageing_time down to
      switchdev") added a check for minimum and maximum ageing time, but this
      breaks existing behaviour where one can set ageing time to 0 for a
      non-learning bridge.
      
      Push this check down to the driver and allow the check in the bridge
      layer to be removed. Currently ageing time 0 is refused by the driver,
      but we can later add support for this functionality.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • macvtap: always pass ethernet header in linear · a96f3553
      Willem de Bruijn authored
      [ Upstream commit 8e2ad411 ]
      
      The stack expects link layer headers in the skb linear section.
      Macvtap can create skbs with llheader in frags in edge cases:
      when (IFF_VNET_HDR is off or vnet_hdr.hdr_len < ETH_HLEN) and
      prepad + len > PAGE_SIZE and vnet_hdr.flags has no or bad csum.
      
      Add checks to ensure linear is always at least ETH_HLEN.
      At this point, len is already ensured to be >= ETH_HLEN.
      
      For backwards compatibility, round up a short vnet_hdr.hdr_len.
      This differs from tap and packet, which return an error.
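
      The rounding rule can be sketched as a clamp on the linear length.
      toy_linear_len() is a hypothetical helper for illustration and assumes,
      as the message states, that len is already at least ETH_HLEN.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETH_HLEN 14   /* Ethernet header length in bytes */

/* Choose how many bytes go into the linear area: at least the Ethernet
 * header, at most the whole packet. */
static size_t toy_linear_len(size_t requested, size_t len)
{
    size_t linear = requested;

    assert(len >= TOY_ETH_HLEN);
    if (linear < TOY_ETH_HLEN)
        linear = TOY_ETH_HLEN;   /* round a short hint up */
    if (linear > len)
        linear = len;
    return linear;
}
```

      A short or missing header-length hint is raised to the header size
      instead of being rejected with an error.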
      
      Fixes: b9fb9ee0 ("macvtap: add GSO/csum offload support")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • qlcnic: Fix mailbox completion handling during spurious interrupt · b39af5aa
      Rajesh Borundia authored
      [ Upstream commit 819bfe76 ]
      
      o While the driver is in the middle of a MB completion processing
      and it receives a spurious MB interrupt, it is mistaken as a good MB
      completion interrupt leading to premature completion of the next MB
      request. Fix the driver to guard against this by checking the current
      state of MB processing and ignore the spurious interrupt.
      Also added a stats counter to record this condition.
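
A state guard of the kind described above might look like the following sketch. The state names, the struct layout and `mbx_interrupt` are hypothetical and only model the idea; they are not taken from the qlcnic driver.

```c
#include <stdbool.h>

/* Hypothetical mailbox states: no request pending, or waiting for a reply. */
enum mbx_state { MBX_IDLE, MBX_WAIT_RSP };

struct mbx {
	enum mbx_state state;
	unsigned long spurious_intr; /* stats counter for ignored interrupts */
	unsigned long completed;     /* requests completed by a real interrupt */
};

/* Returns true only if the interrupt completed a pending request. */
static bool mbx_interrupt(struct mbx *m)
{
	if (m->state != MBX_WAIT_RSP) {
		/* Nothing is waiting: record and ignore the spurious interrupt. */
		m->spurious_intr++;
		return false;
	}
	m->state = MBX_IDLE;
	m->completed++;
	return true;
}
```

A second interrupt that arrives while completion processing has already moved the state on is counted and dropped, so it cannot complete the next request early.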
      Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b39af5aa
    • Rajesh Borundia's avatar
      qlcnic: Remove unnecessary usage of atomic_t · 12dd6d86
      Rajesh Borundia authored
      [ Upstream commit 5bf93251 ]
      
      o atomic_t usage is incorrect as we are not implementing
      any atomicity.
      Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      12dd6d86
    • Sergei Shtylyov's avatar
      sh_eth: advance 'rxdesc' later in sh_eth_ring_format() · 8352a292
      Sergei Shtylyov authored
      [ Upstream commit d0ba9134 ]
      
      Iff dma_map_single() fails, 'rxdesc' should point to the last filled RX
      descriptor, so that it can be marked as the last one, however the driver
      would have already advanced it by that time. In order to fix that, only
      fill an RX descriptor once all the data for it is ready.
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8352a292
    • Sergei Shtylyov's avatar
      sh_eth: fix NULL pointer dereference in sh_eth_ring_format() · a95fc0f7
      Sergei Shtylyov authored
      [ Upstream commit c1b7fca6 ]
      
      In a low memory situation, if netdev_alloc_skb() fails on the first RX ring
      loop iteration in sh_eth_ring_format(), 'rxdesc' is still NULL. Avoid a
      kernel oops by adding the 'rxdesc' check after the loop.
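
The loop shape behind this fix and the previous sh_eth fix can be sketched as follows. This is a simplified user-space model with hypothetical names (`ring_format`, `struct rx_desc`, the `alloc_ok` callback); it only shows why the pointer is advanced late and checked for NULL after the loop.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4

struct rx_desc {
	bool filled;
	bool last; /* marks the end of the usable ring */
};

/*
 * Fill descriptors until alloc_ok() reports a failure, then mark the last
 * filled descriptor as the ring end. Returns the number of filled entries.
 */
static int ring_format(struct rx_desc ring[RING_SIZE], bool (*alloc_ok)(int))
{
	struct rx_desc *rxdesc = NULL;
	int i;

	for (i = 0; i < RING_SIZE; i++) {
		ring[i].filled = false;
		ring[i].last = false;
	}
	for (i = 0; i < RING_SIZE; i++) {
		if (!alloc_ok(i))
			break;
		/* Advance 'rxdesc' only once everything for this entry is ready. */
		rxdesc = &ring[i];
		rxdesc->filled = true;
	}
	/* 'rxdesc' is still NULL if the very first iteration failed. */
	if (rxdesc)
		rxdesc->last = true;
	return i;
}

/* Simulated allocators for the two interesting cases. */
static bool always_fail(int i) { (void)i; return false; }
static bool fail_at_two(int i) { return i < 2; }
```

With `always_fail` nothing is dereferenced and no entry is marked; with `fail_at_two` entry 1 becomes the last one.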
      Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a95fc0f7
    • Alexei Starovoitov's avatar
      bpf: avoid copying junk bytes in bpf_get_current_comm() · e8e43232
      Alexei Starovoitov authored
      [ Upstream commit cdc4e47d ]
      
      Lots of places in the kernel use memcpy(buf, comm, TASK_COMM_LEN); but
      the result is typically passed to print("%s", buf) and extra bytes
      after zero don't cause any harm.
      In bpf the result of bpf_get_current_comm() is used as the part of
      map key and was causing spurious hash map mismatches.
      Use strlcpy() to guarantee zero-terminated string.
      bpf verifier checks that output buffer is zero-initialized,
      so even for short task names the output buffer doesn't have junk bytes.
      Note it's not a security concern, since kprobe+bpf is root only.
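
The mismatch described above can be reproduced in user space. In this sketch `copy_comm` is a hand-written stand-in for strlcpy() (which not every libc provides), and `keys_match` is a hypothetical helper that builds two map keys from the same task name with different bytes after the terminator.

```c
#include <string.h>

#define TASK_COMM_LEN 16

/* Minimal strlcpy()-like copy: always NUL-terminates, returns strlen(src). */
static size_t copy_comm(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Returns 1 if two keys built from the same name are bytewise identical. */
static int keys_match(int use_memcpy)
{
	/* Same name "sh", but different leftover bytes after the NUL. */
	char comm_a[TASK_COMM_LEN] = "sh\0junkA";
	char comm_b[TASK_COMM_LEN] = "sh\0junkB";
	char key_a[TASK_COMM_LEN] = { 0 };
	char key_b[TASK_COMM_LEN] = { 0 };

	if (use_memcpy) {
		memcpy(key_a, comm_a, TASK_COMM_LEN);
		memcpy(key_b, comm_b, TASK_COMM_LEN);
	} else {
		copy_comm(key_a, comm_a, sizeof(key_a));
		copy_comm(key_b, comm_b, sizeof(key_b));
	}
	return memcmp(key_a, key_b, TASK_COMM_LEN) == 0;
}
```

Copying the full buffer carries the junk into the key, so the keys differ; the string copy into a zeroed buffer gives identical keys.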
      
      Fixes: ffeedafb ("bpf: introduce current->pid, tgid, uid, gid, comm accessors")
      Reported-by: Tobias Waldekranz <tobias@waldekranz.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8e43232
    • Willem de Bruijn's avatar
      packet: validate variable length ll headers · edb60bc7
      Willem de Bruijn authored
      [ Upstream commit 9ed988cd ]
      
      Replace link layer header validation check ll_header_truncate with
      more generic dev_validate_header.
      
      Validation based on hard_header_len incorrectly drops valid packets
      in variable length protocols, such as AX25. dev_validate_header
      calls header_ops.validate for such protocols to ensure correctness
      below hard_header_len.
      
      See also http://comments.gmane.org/gmane.linux.network/401064
      
      Fixes: 9c707762 ("packet: make packet_snd fail on len smaller than l2 header")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      edb60bc7
    • Willem de Bruijn's avatar
      ax25: add link layer header validation function · abd42587
      Willem de Bruijn authored
      [ Upstream commit ea47781c ]
      
      As a variable length protocol, AX25 fails link layer header validation
      tests based on a minimum length. header_ops.validate allows protocols
      to validate headers that are shorter than hard_header_len. Implement
      this callback for AX25.
      
      See also http://comments.gmane.org/gmane.linux.network/401064
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      abd42587
    • Willem de Bruijn's avatar
      net: validate variable length ll headers · 8b8d278a
      Willem de Bruijn authored
      [ Upstream commit 2793a23a ]
      
      Netdevice parameter hard_header_len is variously interpreted both as
      an upper and lower bound on link layer header length. The field is
      used as upper bound when reserving room at allocation, as lower bound
      when validating user input in PF_PACKET.
      
      Clarify the definition to be maximum header length. For validation
      of untrusted headers, add an optional validate member to header_ops.
      
      Allow bypassing of validation by passing CAP_SYS_RAWIO, for instance
      for deliberate testing of corrupt input. In this case, pad trailing
      bytes, as some device drivers expect completely initialized headers.
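
The validation order described above can be modelled in user space roughly as follows. All type and function names here are hypothetical stand-ins, and the `has_rawio` flag replaces the kernel's CAP_SYS_RAWIO capability check; treat it as a sketch of the described behaviour rather than the kernel helper itself.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for header_ops with the optional validate member. */
struct header_ops_sketch {
	bool (*validate)(const char *ll_header, unsigned int len);
};

struct netdev_sketch {
	unsigned int hard_header_len;        /* maximum header length */
	const struct header_ops_sketch *ops; /* may be NULL */
};

static bool validate_header(const struct netdev_sketch *dev, char *ll_header,
			    unsigned int len, bool has_rawio)
{
	/* A header of maximum length is always acceptable. */
	if (len >= dev->hard_header_len)
		return true;
	/* Privileged bypass: accept, but zero the missing trailing bytes. */
	if (has_rawio) {
		memset(ll_header + len, 0, dev->hard_header_len - len);
		return true;
	}
	/* Variable length protocols decide for themselves. */
	if (dev->ops && dev->ops->validate)
		return dev->ops->validate(ll_header, len);
	return false;
}

/* Toy protocol rule for illustration: at least 4 header bytes are needed. */
static bool toy_validate(const char *ll_header, unsigned int len)
{
	(void)ll_header;
	return len >= 4;
}
```

A device without a validate callback rejects any short header, while a device with one can accept headers shorter than hard_header_len.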
      
      See also http://comments.gmane.org/gmane.linux.network/401064
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b8d278a
    • Guillaume Nault's avatar
      ppp: release rtnl mutex when interface creation fails · cd8101d8
      Guillaume Nault authored
      [ Upstream commit 6faac63a ]
      
      Add missing rtnl_unlock() in the error path of ppp_create_interface().
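
The general pattern, releasing the lock on every exit path, can be sketched as below. The lock is modelled by a plain counter and `create_interface` is a hypothetical function, so this shows only the control flow, not the ppp code.

```c
#include <errno.h>

static int lock_depth; /* stand-in for the rtnl mutex: 1 while held */

static void fake_lock(void)   { lock_depth++; }
static void fake_unlock(void) { lock_depth--; }

/* Hypothetical creation routine; register_ok simulates device registration. */
static int create_interface(int register_ok)
{
	int ret = 0;

	fake_lock();
	if (!register_ok) {
		ret = -EEXIST;
		goto out_unlock; /* the error path must also drop the lock */
	}
	/* ... further setup under the lock would go here ... */
out_unlock:
	fake_unlock();
	return ret;
}
```

Routing both the success and the failure case through a single unlock label makes it hard to leave the lock held by accident.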
      
      Fixes: 58a89eca ("ppp: fix lockdep splat in ppp_dev_uninit()")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd8101d8
    • Eric Dumazet's avatar
      tcp: fix tcpi_segs_in after connection establishment · 36b9c7cc
      Eric Dumazet authored
      [ Upstream commit a9d99ce2 ]
      
      If the final packet (ACK) of the 3WHS is lost, it appears we do not properly
      account the following incoming segment into tcpi_segs_in.
      
      While we are at it, start segs_in at one, to count the SYN packet.
      
      We do not yet count the number of SYNs we received for a request sock; we
      might add this someday.
      
      packetdrill script showing proper behavior after fix :
      
      // Tests tcpi_segs_in when 3rd packet (ACK) of 3WHS is lost
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
         +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
         +0 bind(3, ..., ...) = 0
         +0 listen(3, 1) = 0
      
         +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop>
         +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
      +.020 < P. 1:1001(1000) ack 1 win 32792
      
         +0 accept(3, ..., ...) = 4
      
      +.000 %{ assert tcpi_segs_in == 2, 'tcpi_segs_in=%d' % tcpi_segs_in }%
      
      Fixes: 2efd055c ("tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      36b9c7cc
    • Bill Sommerfeld's avatar
      udp6: fix UDP/IPv6 encap resubmit path · 8a2226c1
      Bill Sommerfeld authored
      [ Upstream commit 59dca1d8 ]
      
      IPv4 interprets a negative return value from a protocol handler as a
      request to redispatch to a new protocol.  In contrast, IPv6 interprets a
      negative value as an error, and interprets a positive value as a request
      for redispatch.
      
      UDP for IPv6 was unaware of this difference.  Change __udp6_lib_rcv() to
      return a positive value for redispatch.  Note that the socket's
      encap_rcv hook still needs to return a negative value to request
      dispatch, and in the case of IPv6 packets, adjust IP6CB(skb)->nhoff to
      identify the byte containing the next protocol.
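
The two return-value conventions can be contrasted in a small sketch. The enum and the three functions are hypothetical models of the behaviour described above, not kernel code; `udp6_rcv_result` shows the sign flip that the IPv6 UDP path needs.

```c
/* What the IP layer does with a protocol handler's return value. */
enum action { DELIVERED, REDISPATCH, DROPPED_ERROR };

/* IPv4 convention: a negative value asks for redispatch. */
static enum action ipv4_interpret(int ret)
{
	return ret < 0 ? REDISPATCH : DELIVERED;
}

/* IPv6 convention: positive asks for redispatch, negative is an error. */
static enum action ipv6_interpret(int ret)
{
	if (ret > 0)
		return REDISPATCH;
	if (ret < 0)
		return DROPPED_ERROR;
	return DELIVERED;
}

/*
 * Hypothetical translation step: the socket's encap_rcv hook signals
 * redispatch with a negative value, so the IPv6 receive path flips the
 * sign before handing the result to the IPv6 layer.
 */
static int udp6_rcv_result(int encap_ret)
{
	return encap_ret < 0 ? -encap_ret : 0;
}
```

Passing the hook's negative value straight through would be read by IPv6 as an error; after the flip it is read as a redispatch request.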
      Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a2226c1
    • Oliver Neukum's avatar
      usbnet: cleanup after bind() in probe() · 2d11623b
      Oliver Neukum authored
      [ Upstream commit 1666984c ]
      
      If bind() succeeds but a later error forces probe() to bail out,
      work and a timer may already have been scheduled.
      They must be killed. This fixes an error case related to
      the double free reported in
      http://www.spinics.net/lists/netdev/msg367669.html
      and needs to go on top of Linus' fix to cdc-ncm.
      Signed-off-by: Oliver Neukum <ONeukum@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d11623b
    • Bjørn Mork's avatar
      cdc_ncm: toggle altsetting to force reset before setup · 3aaa64b6
      Bjørn Mork authored
      [ Upstream commit 48906f62 ]
      
      Some devices will silently fail setup unless they are reset first.
      This is necessary even if the data interface is already in
      altsetting 0, which it will be when the device is probed for the
      first time.  Briefly toggling the altsetting forces a function
      reset regardless of the initial state.
      
      This fixes a setup problem observed on a number of Huawei devices,
      appearing to operate in NTB-32 mode even if we explicitly set them
      to NTB-16 mode.
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3aaa64b6
    • Daniel Borkmann's avatar
      vxlan: fix missing options_len update on RX with collect metadata · 32cb6781
      Daniel Borkmann authored
      [ Upstream commit 4024fcf7 ]
      
      When signalling to metadata consumers that the metadata_dst entry
      carries additional GBP extension data for vxlan (TUNNEL_VXLAN_OPT),
      the dst's vxlan_metadata information is populated, but options_len
      is left to zero. F.e. in ovs, ovs_flow_key_extract() checks for
      options_len before extracting the data through ip_tunnel_info_opts_get().
      
      Geneve uses ip_tunnel_info_opts_set() helper in receive path, which
      sets options_len internally, vxlan however uses ip_tunnel_info_opts(),
      so when filling vxlan_metadata, we do need to update options_len.
      
      Fixes: 4c222798 ("ip-tunnel: Use API to access tunnel metadata options.")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      32cb6781
    • Florian Westphal's avatar
      ipv6: re-enable fragment header matching in ipv6_find_hdr · b80398d9
      Florian Westphal authored
      [ Upstream commit 5d150a98 ]
      
      When ipv6_find_hdr is used to find a fragment header
      (caller specifies target NEXTHDR_FRAGMENT) we erroneously return
      -ENOENT for all fragments with nonzero offset.
      
      Before commit 9195bb8e, when target was specified, we did not
      enter the exthdr walk loop as nexthdr == target so this used to work.
      
      Now we do (so we can skip empty route headers). When we then stumble upon
      a frag with nonzero frag_off we must return -ENOENT ("header not found")
      only if the caller did not specifically request NEXTHDR_FRAGMENT.
      
      This allows the nftables exthdr expression to match ipv6 fragments, e.g. via
      
      nft add rule ip6 filter input frag frag-off gt 0
      
      Fixes: 9195bb8e ("ipv6: improve ipv6_find_hdr() to skip empty routing headers")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b80398d9
    • Bjørn Mork's avatar
      qmi_wwan: add Sierra Wireless EM74xx device ID · 242fab14
      Bjørn Mork authored
      [ Upstream commit bf13c94c ]
      
      The MC74xx and EM74xx modules use different IDs by default, according
      to the Lenovo EM7455 driver for Windows.
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      242fab14
    • Parthasarathy Bhuvaragan's avatar
      tipc: Revert "tipc: use existing sk_write_queue for outgoing packet chain" · 7da899ce
      Parthasarathy Bhuvaragan authored
      [ Upstream commit f214fc40 ]
      
      Reverts commit 94153e36 ("tipc: use existing sk_write_queue for
      outgoing packet chain")
      
      In commit 94153e36, we assume that we fill and empty the socket's
      sk_write_queue within the same lock_sock() session.
      
      This is not true if the link is congested. During congestion, the
      socket lock is released while we wait for the congestion to cease.
      This implementation causes a nullptr exception, if the user space
      program has several threads accessing the same socket descriptor.
      
      Consider two threads of the same program performing the following:
           Thread1                                  Thread2
      --------------------                    ----------------------
      Enter tipc_sendmsg()                    Enter tipc_sendmsg()
      lock_sock()                             lock_sock()
      Enter tipc_link_xmit(), ret=ELINKCONG   spin on socket lock..
      sk_wait_event()                             :
      release_sock()                          grab socket lock
          :                                   Enter tipc_link_xmit(), ret=0
          :                                   release_sock()
      Wakeup after congestion
      lock_sock()
      skb = skb_peek(pktchain);
      !! TIPC_SKB_CB(skb)->wakeup_pending = tsk->link_cong;
      
      In this case, the second thread transmits the buffers belonging to
      both thread1 and thread2 successfully. When the first thread wakes up
      after the congestion, it assumes that the pktchain is intact and
      operates on the skbs in it, which leads to the following exception:
      
      [2102.439969] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
      [2102.440074] IP: [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
      [2102.440074] PGD 3fa3f067 PUD 3fa6b067 PMD 0
      [2102.440074] Oops: 0000 [#1] SMP
      [2102.440074] CPU: 2 PID: 244 Comm: sender Not tainted 3.12.28 #1
      [2102.440074] RIP: 0010:[<ffffffffa005f330>]  [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
      [...]
      [2102.440074] Call Trace:
      [2102.440074]  [<ffffffff8163f0b9>] ? schedule+0x29/0x70
      [2102.440074]  [<ffffffffa006a756>] ? tipc_node_unlock+0x46/0x170 [tipc]
      [2102.440074]  [<ffffffffa005f761>] tipc_link_xmit+0x51/0xf0 [tipc]
      [2102.440074]  [<ffffffffa006d8ae>] tipc_send_stream+0x11e/0x4f0 [tipc]
      [2102.440074]  [<ffffffff8106b150>] ? __wake_up_sync+0x20/0x20
      [2102.440074]  [<ffffffffa006dc9c>] tipc_send_packet+0x1c/0x20 [tipc]
      [2102.440074]  [<ffffffff81502478>] sock_sendmsg+0xa8/0xd0
      [2102.440074]  [<ffffffff81507895>] ? release_sock+0x145/0x170
      [2102.440074]  [<ffffffff815030d8>] ___sys_sendmsg+0x3d8/0x3e0
      [2102.440074]  [<ffffffff816426ae>] ? _raw_spin_unlock+0xe/0x10
      [2102.440074]  [<ffffffff81115c2a>] ? handle_mm_fault+0x6ca/0x9d0
      [2102.440074]  [<ffffffff8107dd65>] ? set_next_entity+0x85/0xa0
      [2102.440074]  [<ffffffff816426de>] ? _raw_spin_unlock_irq+0xe/0x20
      [2102.440074]  [<ffffffff8107463c>] ? finish_task_switch+0x5c/0xc0
      [2102.440074]  [<ffffffff8163ea8c>] ? __schedule+0x34c/0x950
      [2102.440074]  [<ffffffff81504e12>] __sys_sendmsg+0x42/0x80
      [2102.440074]  [<ffffffff81504e62>] SyS_sendmsg+0x12/0x20
      [2102.440074]  [<ffffffff8164aed2>] system_call_fastpath+0x16/0x1b
      
      In this commit, we always maintain the skb list on the stack.
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7da899ce
    • Benjamin Poirier's avatar
      mld, igmp: Fix reserved tailroom calculation · d9bbdcd8
      Benjamin Poirier authored
      [ Upstream commit 1837b2e2 ]
      
      The current reserved_tailroom calculation fails to take hlen and tlen into
      account.
      
      skb:
      [__hlen__|__data____________|__tlen___|__extra__]
      ^                                               ^
      head                                            skb_end_offset
      
      In this representation, hlen + data + tlen is the size passed to alloc_skb.
      "extra" is the extra space made available in __alloc_skb because of
      rounding up by kmalloc. We can reorder the representation like so:
      
      [__hlen__|__data____________|__extra__|__tlen___]
      ^                                               ^
      head                                            skb_end_offset
      
      The maximum space available for ip headers and payload without
      fragmentation is min(mtu, data + extra). Therefore,
      reserved_tailroom
      = data + extra + tlen - min(mtu, data + extra)
      = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
      = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)
      
      Compare the second line to the current expression:
      reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
      and we can see that hlen and tlen are not taken into account.
      
      The min() in the third line can be expanded into:
      if mtu < skb_tailroom - tlen:
      	reserved_tailroom = skb_tailroom - mtu
      else:
      	reserved_tailroom = tlen
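
The old and corrected expressions can be compared numerically with a small sketch. The function names are hypothetical; the arithmetic follows the derivation above, with the new form evaluated after skb_reserve(hlen).

```c
static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Old expression: ignores both hlen and tlen. */
static int reserved_tailroom_old(int skb_end_offset, int mtu)
{
	return skb_end_offset - min_int(mtu, skb_end_offset);
}

/* Corrected expression, where skb_tailroom = skb_end_offset - hlen. */
static int reserved_tailroom_new(int skb_tailroom, int mtu, int tlen)
{
	return skb_tailroom - min_int(mtu, skb_tailroom - tlen);
}
```

As an illustration with assumed values skb_end_offset = 1600, hlen = 16 and tlen = 8: for mtu = 9000 the old expression reserves 0 bytes, which is less than tlen, while the corrected one reserves exactly tlen = 8. For mtu = 1500 the old expression reserves 100 bytes and the corrected one 84, so the old code leaves 16 usable bytes unused.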
      
      Depending on hlen, tlen, mtu and the number of multicast address records,
      the current code may output skbs that have less tailroom than
      dev->needed_tailroom or it may output more skbs than needed because not all
      space available is used.
      
      Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
      Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9bbdcd8
    • Xin Long's avatar
      sctp: lack the check for ports in sctp_v6_cmp_addr · a87c6525
      Xin Long authored
      [ Upstream commit 40b4f0fd ]
      
      As the .cmp_addr member of sctp_af_inet6, sctp_v6_cmp_addr should also
      check the port of the addresses, just like sctp_v4_cmp_addr, because it
      is invoked by sctp_cmp_addr_exact().
      
      Currently sctp_v6_cmp_addr only checks the port when the two addresses
      belong to different families, and lacks the port check for two IPv6
      addresses. As a result, sctp_hash_cmp() does not work correctly.
      
      Fix it by adding a port comparison to sctp_v6_cmp_addr().
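
An exact comparison that covers both address and port might look like this sketch. `struct v6_endpoint` and `v6_cmp_addr` are simplified, hypothetical stand-ins and do not reproduce the SCTP address union or the mixed-family handling.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for an IPv6 transport address. */
struct v6_endpoint {
	uint8_t addr[16];
	uint16_t port;
};

/* Returns true only if both the port and the 128-bit address are equal. */
static bool v6_cmp_addr(const struct v6_endpoint *a,
			const struct v6_endpoint *b)
{
	if (a->port != b->port)
		return false;
	return memcmp(a->addr, b->addr, sizeof(a->addr)) == 0;
}
```

Without the port check, two endpoints on the same IPv6 address but different ports would compare equal, which is the situation that breaks hash lookups keyed on address and port.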
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a87c6525
    • Linus Lüssing's avatar
      net: fix bridge multicast packet checksum validation · 44bc7d1b
      Linus Lüssing authored
      [ Upstream commit 9b368814 ]
      
      We need to update the skb->csum after pulling the skb, otherwise
      an unnecessary checksum (re)computation can occur for IGMP/MLD packets
      in the bridge code. Additionally this fixes the following splats for
      network devices / bridge ports that support RX checksum offloading
      and have it enabled:
      
      [...]
      [   43.986968] eth0: hw csum failure
      [   43.990344] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.4.0 #2
      [   43.996193] Hardware name: BCM2709
      [   43.999647] [<800204e0>] (unwind_backtrace) from [<8001cf14>] (show_stack+0x10/0x14)
      [   44.007432] [<8001cf14>] (show_stack) from [<801ab614>] (dump_stack+0x80/0x90)
      [   44.014695] [<801ab614>] (dump_stack) from [<802e4548>] (__skb_checksum_complete+0x6c/0xac)
      [   44.023090] [<802e4548>] (__skb_checksum_complete) from [<803a055c>] (ipv6_mc_validate_checksum+0x104/0x178)
      [   44.032959] [<803a055c>] (ipv6_mc_validate_checksum) from [<802e111c>] (skb_checksum_trimmed+0x130/0x188)
      [   44.042565] [<802e111c>] (skb_checksum_trimmed) from [<803a06e8>] (ipv6_mc_check_mld+0x118/0x338)
      [   44.051501] [<803a06e8>] (ipv6_mc_check_mld) from [<803b2c98>] (br_multicast_rcv+0x5dc/0xd00)
      [   44.060077] [<803b2c98>] (br_multicast_rcv) from [<803aa510>] (br_handle_frame_finish+0xac/0x51c)
      [...]
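
The arithmetic behind updating a full-packet checksum after pulling a header can be sketched with 16-bit ones' complement sums. The helpers `csum16` and `csum16_sub` are hypothetical user-space functions, not the kernel's checksum API, and the sketch assumes even lengths and an even pull offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of big-endian 16-bit words; len must be even. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Remove the contribution of a pulled prefix from a running sum. */
static uint16_t csum16_sub(uint16_t total, uint16_t pulled)
{
	/* Subtraction in ones' complement is addition of the complement. */
	uint32_t sum = (uint32_t)total + (uint16_t)~pulled;

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

If the stored sum still covers the pulled bytes, it no longer describes the remaining data and a later verification reports a failure; subtracting the prefix sum restores the value that a fresh computation over the remaining bytes would give.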
      
      Fixes: 9afd85c9 ("net: Export IGMP/MLD message validation code")
      Reported-by: Álvaro Fernández Rojas <noltari@gmail.com>
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      44bc7d1b