1. 13 Sep, 2019 6 commits
    • Bjørn Mork's avatar
      cdc_ether: fix rndis support for Mediatek based smartphones · 4d7ffcf3
      Bjørn Mork authored
      A Mediatek based smartphone owner reports problems with USB
      tethering in Linux.  The verbose USB listing shows a rndis_host
      interface pair (e0/01/03 + 10/00/00), but the driver fails to
      bind with
      
      [  355.960428] usb 1-4: bad CDC descriptors
      
      The problem is a failsafe test intended to filter out ACM serial
      functions using the same 02/02/ff class/subclass/protocol as RNDIS.
      The serial functions are recognized by their non-zero bmCapabilities.
      
      No RNDIS function with non-zero bmCapabilities were known at the time
      this failsafe was added. But it turns out that some Wireless class
      RNDIS functions are using the bmCapabilities field. These functions
      are uniquely identified as RNDIS by their class/subclass/protocol, so
      the failing test can safely be disabled.  The same applies to the two
      types of Misc class RNDIS functions.
      
      Applying the failsafe to Communication class functions only retains
      the original functionality, and fixes the problem for the Mediatek based
      smartphone.
      
      Tow examples of CDC functional descriptors with non-zero bmCapabilities
      from Wireless class RNDIS functions are:
      
      0e8d:000a  Mediatek Crosscall Spider X5 3G Phone
      
            CDC Header:
              bcdCDC               1.10
            CDC ACM:
              bmCapabilities       0x0f
                connection notifications
                sends break
                line coding and serial state
                get/set/clear comm features
            CDC Union:
              bMasterInterface        0
              bSlaveInterface         1
            CDC Call Management:
              bmCapabilities       0x03
                call management
                use DataInterface
              bDataInterface          1
      
      and
      
      19d2:1023  ZTE K4201-z
      
            CDC Header:
              bcdCDC               1.10
            CDC ACM:
              bmCapabilities       0x02
                line coding and serial state
            CDC Call Management:
              bmCapabilities       0x03
                call management
                use DataInterface
              bDataInterface          1
            CDC Union:
              bMasterInterface        0
              bSlaveInterface         1
      
      The Mediatek example is believed to apply to most smartphones with
      Mediatek firmware.  The ZTE example is most likely also part of a larger
      family of devices/firmwares.
      Suggested-by: default avatarLars Melin <larsm17@gmail.com>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4d7ffcf3
    • David S. Miller's avatar
      Merge branch 'sctp_do_bind-leak' · ae3b06ed
      David S. Miller authored
      Mao Wenan says:
      
      ====================
      fix memory leak for sctp_do_bind
      
      First two patches are to do cleanup, remove redundant assignment,
      and change return type of sctp_get_port_local.
      Third patch is to fix memory leak for sctp_do_bind if failed
      to bind address.
      
      v2: add one patch to change return type of sctp_get_port_local.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ae3b06ed
    • Mao Wenan's avatar
      sctp: destroy bucket if failed to bind addr · 29b99f54
      Mao Wenan authored
      There is one memory leak bug report:
      BUG: memory leak
      unreferenced object 0xffff8881dc4c5ec0 (size 40):
        comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s)
        hex dump (first 32 bytes):
          02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00  ................
          f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00  .c=.............
        backtrace:
          [<0000000072006339>] sctp_get_port_local+0x2a1/0xa00 [sctp]
          [<00000000c7b379ec>] sctp_do_bind+0x176/0x2c0 [sctp]
          [<000000005be274a2>] sctp_bind+0x5a/0x80 [sctp]
          [<00000000b66b4044>] inet6_bind+0x59/0xd0 [ipv6]
          [<00000000c68c7f42>] __sys_bind+0x120/0x1f0 net/socket.c:1647
          [<000000004513635b>] __do_sys_bind net/socket.c:1658 [inline]
          [<000000004513635b>] __se_sys_bind net/socket.c:1656 [inline]
          [<000000004513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656
          [<0000000061f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
          [<0000000003d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      This is because in sctp_do_bind, if sctp_get_port_local is to
      create hash bucket successfully, and sctp_add_bind_addr failed
      to bind address, e.g return -ENOMEM, so memory leak found, it
      needs to destroy allocated bucket.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      29b99f54
    • Mao Wenan's avatar
      sctp: remove redundant assignment when call sctp_get_port_local · e0e4b8de
      Mao Wenan authored
      There are more parentheses in if clause when call sctp_get_port_local
      in sctp_do_bind, and redundant assignment to 'ret'. This patch is to
      do cleanup.
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e0e4b8de
    • Mao Wenan's avatar
      sctp: change return type of sctp_get_port_local · 8e2ef6ab
      Mao Wenan authored
      Currently sctp_get_port_local() returns a long
      which is either 0,1 or a pointer casted to long.
      It's neither of the callers use the return value since
      commit 62208f12 ("net: sctp: simplify sctp_get_port").
      Now two callers are sctp_get_port and sctp_do_bind,
      they actually assumend a casted to an int was the same as
      a pointer casted to a long, and they don't save the return
      value just check whether it is zero or non-zero, so
      it would better change return type from long to int for
      sctp_get_port_local.
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8e2ef6ab
    • Jeff Kirsher's avatar
      ixgbevf: Fix secpath usage for IPsec Tx offload · 8f6617ba
      Jeff Kirsher authored
      Port the same fix for ixgbe to ixgbevf.
      
      The ixgbevf driver currently does IPsec Tx offloading
      based on an existing secpath. However, the secpath
      can also come from the Rx side, in this case it is
      misinterpreted for Tx offload and the packets are
      dropped with a "bad sa_idx" error. Fix this by using
      the xfrm_offload() function to test for Tx offload.
      
      CC: Shannon Nelson <snelson@pensando.io>
      Fixes: 7f68d430 ("ixgbevf: enable VF IPsec offload operations")
      Reported-by: default avatarJonathan Tooker <jonathan@reliablehosting.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Acked-by: default avatarShannon Nelson <snelson@pensando.io>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8f6617ba
  2. 12 Sep, 2019 6 commits
    • Christophe JAILLET's avatar
      sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()' · b456d724
      Christophe JAILLET authored
      The '.exit' functions from 'pernet_operations' structure should be marked
      as __net_exit, not __net_init.
      
      Fixes: 8e2d61e0 ("sctp: fix race on protocol/netns initialization")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b456d724
    • Steffen Klassert's avatar
      ixgbe: Fix secpath usage for IPsec TX offload. · f39b683d
      Steffen Klassert authored
      The ixgbe driver currently does IPsec TX offloading
      based on an existing secpath. However, the secpath
      can also come from the RX side, in this case it is
      misinterpreted for TX offload and the packets are
      dropped with a "bad sa_idx" error. Fix this by using
      the xfrm_offload() function to test for TX offload.
      
      Fixes: 59259470 ("ixgbe: process the Tx ipsec offload")
      Reported-by: default avatarMichael Marley <michael@michaelmarley.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f39b683d
    • Navid Emamdoost's avatar
      net: qrtr: fix memort leak in qrtr_tun_write_iter · a21b7f0c
      Navid Emamdoost authored
      In qrtr_tun_write_iter the allocated kbuf should be release in case of
      error or success return.
      
      v2 Update: Thanks to David Miller for pointing out the release on success
      path as well.
      Signed-off-by: default avatarNavid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a21b7f0c
    • Subash Abhinov Kasiviswanathan's avatar
      net: Fix null de-reference of device refcount · 10cc514f
      Subash Abhinov Kasiviswanathan authored
      In event of failure during register_netdevice, free_netdev is
      invoked immediately. free_netdev assumes that all the netdevice
      refcounts have been dropped prior to it being called and as a
      result frees and clears out the refcount pointer.
      
      However, this is not necessarily true as some of the operations
      in the NETDEV_UNREGISTER notifier handlers queue RCU callbacks for
      invocation after a grace period. The IPv4 callback in_dev_rcu_put
      tries to access the refcount after free_netdev is called which
      leads to a null de-reference-
      
      44837.761523:   <6> Unable to handle kernel paging request at
                          virtual address 0000004a88287000
      44837.761651:   <2> pc : in_dev_finish_destroy+0x4c/0xc8
      44837.761654:   <2> lr : in_dev_finish_destroy+0x2c/0xc8
      44837.762393:   <2> Call trace:
      44837.762398:   <2>  in_dev_finish_destroy+0x4c/0xc8
      44837.762404:   <2>  in_dev_rcu_put+0x24/0x30
      44837.762412:   <2>  rcu_nocb_kthread+0x43c/0x468
      44837.762418:   <2>  kthread+0x118/0x128
      44837.762424:   <2>  ret_from_fork+0x10/0x1c
      
      Fix this by waiting for the completion of the call_rcu() in
      case of register_netdevice errors.
      
      Fixes: 93ee31f1 ("[NET]: Fix free_netdev on register_netdev failure.")
      Cc: Sean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: default avatarSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10cc514f
    • Christophe JAILLET's avatar
      ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()' · d23dbc47
      Christophe JAILLET authored
      The '.exit' functions from 'pernet_operations' structure should be marked
      as __net_exit, not __net_init.
      
      Fixes: d862e546 ("net: ipv6: Implement /proc/net/icmp6.")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d23dbc47
    • Yang Yingliang's avatar
      tun: fix use-after-free when register netdev failed · 77f22f92
      Yang Yingliang authored
      I got a UAF repport in tun driver when doing fuzzy test:
      
      [  466.269490] ==================================================================
      [  466.271792] BUG: KASAN: use-after-free in tun_chr_read_iter+0x2ca/0x2d0
      [  466.271806] Read of size 8 at addr ffff888372139250 by task tun-test/2699
      [  466.271810]
      [  466.271824] CPU: 1 PID: 2699 Comm: tun-test Not tainted 5.3.0-rc1-00001-g5a9433db2614-dirty #427
      [  466.271833] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [  466.271838] Call Trace:
      [  466.271858]  dump_stack+0xca/0x13e
      [  466.271871]  ? tun_chr_read_iter+0x2ca/0x2d0
      [  466.271890]  print_address_description+0x79/0x440
      [  466.271906]  ? vprintk_func+0x5e/0xf0
      [  466.271920]  ? tun_chr_read_iter+0x2ca/0x2d0
      [  466.271935]  __kasan_report+0x15c/0x1df
      [  466.271958]  ? tun_chr_read_iter+0x2ca/0x2d0
      [  466.271976]  kasan_report+0xe/0x20
      [  466.271987]  tun_chr_read_iter+0x2ca/0x2d0
      [  466.272013]  do_iter_readv_writev+0x4b7/0x740
      [  466.272032]  ? default_llseek+0x2d0/0x2d0
      [  466.272072]  do_iter_read+0x1c5/0x5e0
      [  466.272110]  vfs_readv+0x108/0x180
      [  466.299007]  ? compat_rw_copy_check_uvector+0x440/0x440
      [  466.299020]  ? fsnotify+0x888/0xd50
      [  466.299040]  ? __fsnotify_parent+0xd0/0x350
      [  466.299064]  ? fsnotify_first_mark+0x1e0/0x1e0
      [  466.304548]  ? vfs_write+0x264/0x510
      [  466.304569]  ? ksys_write+0x101/0x210
      [  466.304591]  ? do_preadv+0x116/0x1a0
      [  466.304609]  do_preadv+0x116/0x1a0
      [  466.309829]  do_syscall_64+0xc8/0x600
      [  466.309849]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  466.309861] RIP: 0033:0x4560f9
      [  466.309875] Code: 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      [  466.309889] RSP: 002b:00007ffffa5166e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000127
      [  466.322992] RAX: ffffffffffffffda RBX: 0000000000400460 RCX: 00000000004560f9
      [  466.322999] RDX: 0000000000000003 RSI: 00000000200008c0 RDI: 0000000000000003
      [  466.323007] RBP: 00007ffffa516700 R08: 0000000000000004 R09: 0000000000000000
      [  466.323014] R10: 0000000000000000 R11: 0000000000000206 R12: 000000000040cb10
      [  466.323021] R13: 0000000000000000 R14: 00000000006d7018 R15: 0000000000000000
      [  466.323057]
      [  466.323064] Allocated by task 2605:
      [  466.335165]  save_stack+0x19/0x80
      [  466.336240]  __kasan_kmalloc.constprop.8+0xa0/0xd0
      [  466.337755]  kmem_cache_alloc+0xe8/0x320
      [  466.339050]  getname_flags+0xca/0x560
      [  466.340229]  user_path_at_empty+0x2c/0x50
      [  466.341508]  vfs_statx+0xe6/0x190
      [  466.342619]  __do_sys_newstat+0x81/0x100
      [  466.343908]  do_syscall_64+0xc8/0x600
      [  466.345303]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  466.347034]
      [  466.347517] Freed by task 2605:
      [  466.348471]  save_stack+0x19/0x80
      [  466.349476]  __kasan_slab_free+0x12e/0x180
      [  466.350726]  kmem_cache_free+0xc8/0x430
      [  466.351874]  putname+0xe2/0x120
      [  466.352921]  filename_lookup+0x257/0x3e0
      [  466.354319]  vfs_statx+0xe6/0x190
      [  466.355498]  __do_sys_newstat+0x81/0x100
      [  466.356889]  do_syscall_64+0xc8/0x600
      [  466.358037]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  466.359567]
      [  466.360050] The buggy address belongs to the object at ffff888372139100
      [  466.360050]  which belongs to the cache names_cache of size 4096
      [  466.363735] The buggy address is located 336 bytes inside of
      [  466.363735]  4096-byte region [ffff888372139100, ffff88837213a100)
      [  466.367179] The buggy address belongs to the page:
      [  466.368604] page:ffffea000dc84e00 refcount:1 mapcount:0 mapping:ffff8883df1b4f00 index:0x0 compound_mapcount: 0
      [  466.371582] flags: 0x2fffff80010200(slab|head)
      [  466.372910] raw: 002fffff80010200 dead000000000100 dead000000000122 ffff8883df1b4f00
      [  466.375209] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
      [  466.377778] page dumped because: kasan: bad access detected
      [  466.379730]
      [  466.380288] Memory state around the buggy address:
      [  466.381844]  ffff888372139100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.384009]  ffff888372139180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.386131] >ffff888372139200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.388257]                                                  ^
      [  466.390234]  ffff888372139280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.392512]  ffff888372139300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.394667] ==================================================================
      
      tun_chr_read_iter() accessed the memory which freed by free_netdev()
      called by tun_set_iff():
      
              CPUA                                           CPUB
        tun_set_iff()
          alloc_netdev_mqs()
          tun_attach()
                                                        tun_chr_read_iter()
                                                          tun_get()
                                                          tun_do_read()
                                                            tun_ring_recv()
          register_netdevice() <-- inject error
          goto err_detach
          tun_detach_all() <-- set RCV_SHUTDOWN
          free_netdev() <-- called from
                           err_free_dev path
            netdev_freemem() <-- free the memory
                              without check refcount
            (In this path, the refcount cannot prevent
             freeing the memory of dev, and the memory
             will be used by dev_put() called by
             tun_chr_read_iter() on CPUB.)
                                                           (Break from tun_ring_recv(),
                                                           because RCV_SHUTDOWN is set)
                                                         tun_put()
                                                           dev_put() <-- use the memory
                                                                         freed by netdev_freemem()
      
      Put the publishing of tfile->tun after register_netdevice(),
      so tun_get() won't get the tun pointer that freed by
      err_detach path if register_netdevice() failed.
      
      Fixes: eb0fb363 ("tuntap: attach queue 0 before registering netdevice")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Suggested-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      77f22f92
  3. 11 Sep, 2019 13 commits
  4. 10 Sep, 2019 5 commits
  5. 07 Sep, 2019 8 commits
    • Fred Lotter's avatar
      nfp: flower: cmsg rtnl locks can timeout reify messages · 28abe579
      Fred Lotter authored
      Flower control message replies are handled in different locations. The truly
      high priority replies are handled in the BH (tasklet) context, while the
      remaining replies are handled in a predefined Linux work queue. The work
      queue handler orders replies into high and low priority groups, and always
      start servicing the high priority replies within the received batch first.
      
      Reply Type:			Rtnl Lock:	Handler:
      
      CMSG_TYPE_PORT_MOD		no		BH tasklet (mtu)
      CMSG_TYPE_TUN_NEIGH		no		BH tasklet
      CMSG_TYPE_FLOW_STATS		no		BH tasklet
      CMSG_TYPE_PORT_REIFY		no		WQ high
      CMSG_TYPE_PORT_MOD		yes		WQ high (link/mtu)
      CMSG_TYPE_MERGE_HINT		yes		WQ low
      CMSG_TYPE_NO_NEIGH		no		WQ low
      CMSG_TYPE_ACTIVE_TUNS		no		WQ low
      CMSG_TYPE_QOS_STATS		no		WQ low
      CMSG_TYPE_LAG_CONFIG		no		WQ low
      
      A subset of control messages can block waiting for an rtnl lock (from both
      work queue priority groups). The rtnl lock is heavily contended for by
      external processes such as systemd-udevd, systemd-network and libvirtd,
      especially during netdev creation, such as when flower VFs and representors
      are instantiated.
      
      Kernel netlink instrumentation shows that external processes (such as
      systemd-udevd) often use successive rtnl_trylock() sequences, which can result
      in an rtnl_lock() blocked control message to starve for longer periods of time
      during rtnl lock contention, i.e. netdev creation.
      
      In the current design a single blocked control message will block the entire
      work queue (both priorities), and introduce a latency which is
      nondeterministic and dependent on system wide rtnl lock usage.
      
      In some extreme cases, one blocked control message at exactly the wrong time,
      just before the maximum number of VFs are instantiated, can block the work
      queue for long enough to prevent VF representor REIFY replies from getting
      handled in time for the 40ms timeout.
      
      The firmware will deliver the total maximum number of REIFY message replies in
      around 300us.
      
      Only REIFY and MTU update messages require replies within a timeout period (of
      40ms). The MTU-only updates are already done directly in the BH (tasklet)
      handler.
      
      Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
      caused by a blocked work queue waiting on rtnl locks.
      Signed-off-by: default avatarFred Lotter <frederik.lotter@netronome.com>
      Signed-off-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      28abe579
    • Shmulik Ladkani's avatar
      net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list · 3dcbdb13
      Shmulik Ladkani authored
      Historically, support for frag_list packets entering skb_segment() was
      limited to frag_list members terminating on exact same gso_size
      boundaries. This is verified with a BUG_ON since commit 89319d38
      ("net: Add frag_list support to skb_segment"), quote:
      
          As such we require all frag_list members terminate on exact MSS
          boundaries.  This is checked using BUG_ON.
          As there should only be one producer in the kernel of such packets,
          namely GRO, this requirement should not be difficult to maintain.
      
      However, since commit 6578171a ("bpf: add bpf_skb_change_proto helper"),
      the "exact MSS boundaries" assumption no longer holds:
      An eBPF program using bpf_skb_change_proto() DOES modify 'gso_size', but
      leaves the frag_list members as originally merged by GRO with the
      original 'gso_size'. Example of such programs are bpf-based NAT46 or
      NAT64.
      
      This lead to a kernel BUG_ON for flows involving:
       - GRO generating a frag_list skb
       - bpf program performing bpf_skb_change_proto() or bpf_skb_adjust_room()
       - skb_segment() of the skb
      
      See example BUG_ON reports in [0].
      
      In commit 13acc94e ("net: permit skb_segment on head_frag frag_list skb"),
      skb_segment() was modified to support the "gso_size mangling" case of
      a frag_list GRO'ed skb, but *only* for frag_list members having
      head_frag==true (having a page-fragment head).
      
      Alas, GRO packets having frag_list members with a linear kmalloced head
      (head_frag==false) still hit the BUG_ON.
      
      This commit adds support to skb_segment() for a 'head_skb' packet having
      a frag_list whose members are *non* head_frag, with gso_size mangled, by
      disabling SG and thus falling-back to copying the data from the given
      'head_skb' into the generated segmented skbs - as suggested by Willem de
      Bruijn [1].
      
      Since this approach involves the penalty of skb_copy_and_csum_bits()
      when building the segments, care was taken in order to enable this
      solution only when required:
       - untrusted gso_size, by testing SKB_GSO_DODGY is set
         (SKB_GSO_DODGY is set by any gso_size mangling functions in
          net/core/filter.c)
       - the frag_list is non empty, its item is a non head_frag, *and* the
         headlen of the given 'head_skb' does not match the gso_size.
      
      [0]
      https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/
      https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/
      
      [1]
      https://lore.kernel.org/netdev/CA+FuTSfVsgNDi7c=GUU8nMg2hWxF2SjCNLXetHeVPdnxAW5K-w@mail.gmail.com/
      
      Fixes: 6578171a ("bpf: add bpf_skb_change_proto helper")
      Suggested-by: default avatarWillem de Bruijn <willemdebruijn.kernel@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: default avatarShmulik Ladkani <shmulik.ladkani@gmail.com>
      Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3dcbdb13
    • Maciej Żenczykowski's avatar
      ipv6: addrconf_f6i_alloc - fix non-null pointer check to !IS_ERR() · 8652f17c
      Maciej Żenczykowski authored
      Fixes a stupid bug I recently introduced...
      ip6_route_info_create() returns an ERR_PTR(err) and not a NULL on error.
      
      Fixes: d55a2e37 ("net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)'")
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarMaciej Żenczykowski <maze@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8652f17c
    • Eric Biggers's avatar
      isdn/capi: check message length in capi_write() · fe163e53
      Eric Biggers authored
      syzbot reported:
      
          BUG: KMSAN: uninit-value in capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700
          CPU: 0 PID: 10025 Comm: syz-executor379 Not tainted 4.20.0-rc7+ #2
          Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
          Call Trace:
            __dump_stack lib/dump_stack.c:77 [inline]
            dump_stack+0x173/0x1d0 lib/dump_stack.c:113
            kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
            __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
            capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700
            do_loop_readv_writev fs/read_write.c:703 [inline]
            do_iter_write+0x83e/0xd80 fs/read_write.c:961
            vfs_writev fs/read_write.c:1004 [inline]
            do_writev+0x397/0x840 fs/read_write.c:1039
            __do_sys_writev fs/read_write.c:1112 [inline]
            __se_sys_writev+0x9b/0xb0 fs/read_write.c:1109
            __x64_sys_writev+0x4a/0x70 fs/read_write.c:1109
            do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
            entry_SYSCALL_64_after_hwframe+0x63/0xe7
          [...]
      
      The problem is that capi_write() is reading past the end of the message.
      Fix it by checking the message's length in the needed places.
      
      Reported-and-tested-by: syzbot+0849c524d9c634f5ae66@syzkaller.appspotmail.com
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fe163e53
    • Juliet Kim's avatar
      net/ibmvnic: free reset work of removed device from queue · 1c2977c0
      Juliet Kim authored
      Commit 36f1031c ("ibmvnic: Do not process reset during or after
       device removal") made the change to exit reset if the driver has been
      removed, but does not free reset work items of the adapter from queue.
      
      Ensure all reset work items are freed when breaking out of the loop early.
      
      Fixes: 36f1031c ("ibmnvic: Do not process reset during or after device removal”)
      Signed-off-by: default avatarJuliet Kim <julietk@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1c2977c0
    • Stefan Chulski's avatar
      net: phylink: Fix flow control resolution · 63b2ed4e
      Stefan Chulski authored
      Regarding to IEEE 802.3-2015 standard section 2
      28B.3 Priority resolution - Table 28-3 - Pause resolution
      
      In case of Local device Pause=1 AsymDir=0, Link partner
      Pause=1 AsymDir=1, Local device resolution should be enable PAUSE
      transmit, disable PAUSE receive.
      And in case of Local device Pause=1 AsymDir=1, Link partner
      Pause=1 AsymDir=0, Local device resolution should be enable PAUSE
      receive, disable PAUSE transmit.
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Signed-off-by: default avatarStefan Chulski <stefanc@marvell.com>
      Reported-by: default avatarShaul Ben-Mayor <shaulb@marvell.com>
      Acked-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      63b2ed4e
    • Christophe JAILLET's avatar
      net/hamradio/6pack: Fix the size of a sk_buff used in 'sp_bump()' · b82573fd
      Christophe JAILLET authored
      We 'allocate' 'count' bytes here. In fact, 'dev_alloc_skb' already add some
      extra space for padding, so a bit more is allocated.
      
      However, we use 1 byte for the KISS command, then copy 'count' bytes, so
      count+1 bytes.
      
      Explicitly allocate and use 1 more byte to be safe.
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b82573fd
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 0c04eb72
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2019-09-06
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) verifier precision tracking fix, from Alexei.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0c04eb72
  6. 06 Sep, 2019 2 commits