1. 23 Nov, 2015 2 commits
    • Grant Grundler's avatar
      net: tulip: update MAINTAINER status to Orphan · cf869eb1
      Grant Grundler authored
      I haven't had any PCI tulip HW for the past ~5 years. I have
      been reviewing tulip patches and can continue doing that.
      Signed-off-by: default avatarGrant Grundler <grundler@parisc-linux.org>
      Acked-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cf869eb1
    • Daniel Borkmann's avatar
      net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds · 6900317f
      Daniel Borkmann authored
      David and HacKurx reported a following/similar size overflow triggered
      in a grsecurity kernel, thanks to PaX's gcc size overflow plugin:
      
      (Already fixed in later grsecurity versions by Brad and PaX Team.)
      
      [ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314
                     cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr;
      [ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7
      [ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, [...]
      [ 1002.296153]  ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8
      [ 1002.296162]  ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8
      [ 1002.296169]  ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60
      [ 1002.296176] Call Trace:
      [ 1002.296190]  [<ffffffff818129ba>] dump_stack+0x45/0x57
      [ 1002.296200]  [<ffffffff8121f838>] report_size_overflow+0x38/0x60
      [ 1002.296209]  [<ffffffff816a979e>] scm_detach_fds+0x2ce/0x300
      [ 1002.296220]  [<ffffffff81791899>] unix_stream_read_generic+0x609/0x930
      [ 1002.296228]  [<ffffffff81791c9f>] unix_stream_recvmsg+0x4f/0x60
      [ 1002.296236]  [<ffffffff8178dc00>] ? unix_set_peek_off+0x50/0x50
      [ 1002.296243]  [<ffffffff8168fac7>] sock_recvmsg+0x47/0x60
      [ 1002.296248]  [<ffffffff81691522>] ___sys_recvmsg+0xe2/0x1e0
      [ 1002.296257]  [<ffffffff81693496>] __sys_recvmsg+0x46/0x80
      [ 1002.296263]  [<ffffffff816934fc>] SyS_recvmsg+0x2c/0x40
      [ 1002.296271]  [<ffffffff8181a3ab>] entry_SYSCALL_64_fastpath+0x12/0x85
      
      Further investigation showed that this can happen when an *odd* number of
      fds are being passed over AF_UNIX sockets.
      
      In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)),
      where i is the number of successfully passed fds, differ by 4 bytes due
      to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary
      on 64 bit. The padding is used to align subsequent cmsg headers in the
      control buffer.
      
      When the control buffer passed in from the receiver side *lacks* these 4
      bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will
      overflow in scm_detach_fds():
      
        int cmlen = CMSG_LEN(i * sizeof(int));  <--- cmlen w/o tail-padding
        err = put_user(SOL_SOCKET, &cm->cmsg_level);
        if (!err)
          err = put_user(SCM_RIGHTS, &cm->cmsg_type);
        if (!err)
          err = put_user(cmlen, &cm->cmsg_len);
        if (!err) {
          cmlen = CMSG_SPACE(i * sizeof(int));  <--- cmlen w/ 4 byte extra tail-padding
          msg->msg_control += cmlen;
          msg->msg_controllen -= cmlen;         <--- iff no tail-padding space here ...
        }                                            ... wrap-around
      
      F.e. it will wrap to a length of 18446744073709551612 bytes in case the
      receiver passed in msg->msg_controllen of 20 bytes, and the sender
      properly transferred 1 fd to the receiver, so that its CMSG_LEN results
      in 20 bytes and CMSG_SPACE in 24 bytes.
      
      In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an
      issue in my tests as alignment seems always on 4 byte boundary. Same
      should be in case of native 32 bit, where we end up with 4 byte boundaries
      as well.
      
      In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving
      a single fd would mean that on successful return, msg->msg_controllen is
      being set by the kernel to 24 bytes instead, thus more than the input
      buffer advertised. It could f.e. become an issue if such application later
      on zeroes or copies the control buffer based on the returned msg->msg_controllen
      elsewhere.
      
      Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253).
      
      Going over the code, it seems like msg->msg_controllen is not being read
      after scm_detach_fds() in scm_recv() anymore by the kernel, good!
      
      Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg())
      and unix_stream_recvmsg(). Both return back to their recvmsg() caller,
      and ___sys_recvmsg() places the updated length, that is, new msg_control -
      old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen
      in the example).
      
      Long time ago, Wei Yongjun fixed something related in commit 1ac70e7a
      ("[NET]: Fix function put_cmsg() which may cause usr application memory
      overflow").
      
      RFC3542, section 20.2. says:
      
        The fields shown as "XX" are possible padding, between the cmsghdr
        structure and the data, and between the data and the next cmsghdr
        structure, if required by the implementation. While sending an
        application may or may not include padding at the end of last
        ancillary data in msg_controllen and implementations must accept both
        as valid. On receiving a portable application must provide space for
        padding at the end of the last ancillary data as implementations may
        copy out the padding at the end of the control message buffer and
        include it in the received msg_controllen. When recvmsg() is called
        if msg_controllen is too small for all the ancillary data items
        including any trailing padding after the last item an implementation
        may set MSG_CTRUNC.
      
      Since we didn't place MSG_CTRUNC for already quite a long time, just do
      the same as in 1ac70e7a to avoid an overflow.
      
      Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix
      error in SCM_RIGHTS code sample"). Some people must have copied this (?),
      thus it got triggered in the wild (reported several times during boot by
      David and HacKurx).
      
      No Fixes tag this time as pre 2002 (that is, pre history tree).
      Reported-by: default avatarDavid Sterba <dave@jikos.cz>
      Reported-by: default avatarHacKurx <hackurx@gmail.com>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: Eric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6900317f
  2. 20 Nov, 2015 8 commits
  3. 19 Nov, 2015 4 commits
    • Eric Dumazet's avatar
      tcp: md5: fix lockdep annotation · 1b8e6a01
      Eric Dumazet authored
      When a passive TCP is created, we eventually call tcp_md5_do_add()
      with sk pointing to the child. It is not owner by the user yet (we
      will add this socket into listener accept queue a bit later anyway)
      
      But we do own the spinlock, so amend the lockdep annotation to avoid
      following splat :
      
      [ 8451.090932] net/ipv4/tcp_ipv4.c:923 suspicious rcu_dereference_protected() usage!
      [ 8451.090932]
      [ 8451.090932] other info that might help us debug this:
      [ 8451.090932]
      [ 8451.090934]
      [ 8451.090934] rcu_scheduler_active = 1, debug_locks = 1
      [ 8451.090936] 3 locks held by socket_sockopt_/214795:
      [ 8451.090936]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff855c6ac1>] __netif_receive_skb_core+0x151/0xe90
      [ 8451.090947]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff85618143>] ip_local_deliver_finish+0x43/0x2b0
      [ 8451.090952]  #2:  (slock-AF_INET){+.-...}, at: [<ffffffff855acda5>] sk_clone_lock+0x1c5/0x500
      [ 8451.090958]
      [ 8451.090958] stack backtrace:
      [ 8451.090960] CPU: 7 PID: 214795 Comm: socket_sockopt_
      
      [ 8451.091215] Call Trace:
      [ 8451.091216]  <IRQ>  [<ffffffff856fb29c>] dump_stack+0x55/0x76
      [ 8451.091229]  [<ffffffff85123b5b>] lockdep_rcu_suspicious+0xeb/0x110
      [ 8451.091235]  [<ffffffff8564544f>] tcp_md5_do_add+0x1bf/0x1e0
      [ 8451.091239]  [<ffffffff85645751>] tcp_v4_syn_recv_sock+0x1f1/0x4c0
      [ 8451.091242]  [<ffffffff85642b27>] ? tcp_v4_md5_hash_skb+0x167/0x190
      [ 8451.091246]  [<ffffffff85647c78>] tcp_check_req+0x3c8/0x500
      [ 8451.091249]  [<ffffffff856451ae>] ? tcp_v4_inbound_md5_hash+0x11e/0x190
      [ 8451.091253]  [<ffffffff85647170>] tcp_v4_rcv+0x3c0/0x9f0
      [ 8451.091256]  [<ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0
      [ 8451.091260]  [<ffffffff856181b6>] ip_local_deliver_finish+0xb6/0x2b0
      [ 8451.091263]  [<ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0
      [ 8451.091267]  [<ffffffff85618d38>] ip_local_deliver+0x48/0x80
      [ 8451.091270]  [<ffffffff85618510>] ip_rcv_finish+0x160/0x700
      [ 8451.091273]  [<ffffffff8561900e>] ip_rcv+0x29e/0x3d0
      [ 8451.091277]  [<ffffffff855c74b7>] __netif_receive_skb_core+0xb47/0xe90
      
      Fixes: a8afca03 ("tcp: md5: protects md5sig_info with RCU")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1b8e6a01
    • Bjørn Mork's avatar
      net: qmi_wwan: add XS Stick W100-2 from 4G Systems · 68242a5a
      Bjørn Mork authored
      Thomas reports
      "
      4gsystems sells two total different LTE-surfsticks under the same name.
      ..
      The newer version of XS Stick W100 is from "omega"
      ..
      Under windows the driver switches to the same ID, and uses MI03\6 for
      network and MI01\6 for modem.
      ..
      echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
      echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id
      
      T:  Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=1c9e ProdID=9b01 Rev=02.32
      S:  Manufacturer=USB Modem
      S:  Product=USB Modem
      S:  SerialNumber=
      C:  #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      I:  If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
      
      Now all important things are there:
      
      wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at)
      
      There is also ttyUSB0, but it is not usable, at least not for at.
      
      The device works well with qmi and ModemManager-NetworkManager.
      "
      Reported-by: default avatarThomas Schäfer <tschaefer@t-online.de>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      68242a5a
    • Zi Shen Lim's avatar
      arm64: bpf: fix buffer pointer · f4b16fce
      Zi Shen Lim authored
      During code review, I noticed we were passing a bad buffer pointer
      to bpf_load_pointer helper function called by jitted code.
      
      Point to the buffer allocated by JIT, so we don't silently corrupt
      other parts of the stack.
      Signed-off-by: default avatarZi Shen Lim <zlim.lnx@gmail.com>
      Acked-by: default avatarYang Shi <yang.shi@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f4b16fce
    • Sergei Shtylyov's avatar
      ravb: fix WARNING in __free_irq() · 508dc064
      Sergei Shtylyov authored
      When the R8A7795 support was  added to the driver, little attention was paid
      to the ravb_open() error path: free_irq()  for the EMAC interrupt was called
      uncoditionally, unlike request_irq(), and in  a wrong order as well...
      As a result, on the R-Car gen2 SoCs I started getting the following in case
      of a device  opening error:
      
      WARNING: CPU: 0 PID: 1 at kernel/irq/manage.c:1448 __free_irq+0x8c/0x228()
      Trying to free already-free IRQ 0
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc1-dirty #1005
      Hardware name: Generic R8A7791 (Flattened Device Tree)
      Backtrace:
      [<c0013818>] (dump_backtrace) from [<c00139b4>] (show_stack+0x18/0x1c)
       r6:c063cdd6 r5:00000009 r4:00000000 r3:00204140
      [<c001399c>] (show_stack) from [<c022a578>] (dump_stack+0x74/0x90)
      [<c022a504>] (dump_stack) from [<c0025f04>] (warn_slowpath_common+0x8c/0xb8)
       r4:ef04fd38 r3:c0714770
      [<c0025e78>] (warn_slowpath_common) from [<c0025fd4>] (warn_slowpath_fmt+0x38/0x40)
       r8:ee8ad800 r7:ef0030a0 r6:00000000 r5:00000000 r4:ef003040
      [<c0025fa0>] (warn_slowpath_fmt) from [<c0064cc0>] (__free_irq+0x8c/0x228)
       r3:00000000 r2:c063ce9f
      [<c0064c34>] (__free_irq) from [<c0064ecc>] (free_irq+0x70/0xa4)
       r10:0000016b r8:00000000 r7:00000000 r6:ee8ad800 r5:00000000 r4:ef003040
      [<c0064e5c>] (free_irq) from [<c033472c>] (ravb_open+0x224/0x274)
       r7:fffffffe r6:00000000 r5:fffffffe r4:ee8ad800
      [<c0334508>] (ravb_open) from [<c041ac78>] (__dev_open+0x84/0x104)
       r7:ee8ad830 r6:c0566334 r5:00000000 r4:ee8ad800
      [<c041abf4>] (__dev_open) from [<c041af08>] (__dev_change_flags+0x94/0x13c)
       r7:00001002 r6:00000001 r5:00001003 r4:ee8ad800
      [<c041ae74>] (__dev_change_flags) from [<c041afe8>] (dev_change_flags+0x20/0x50)
       r7:c072e6e0 r6:00000138 r5:00001002 r4:ee8ad800
      [<c041afc8>] (dev_change_flags) from [<c06ec06c>] (ip_auto_config+0x174/0xfb8)
       r8:00001002 r7:c072e6e0 r6:c0703344 r5:00000001 r4:ee8ad800 r3:00000101
      [<c06ebef8>] (ip_auto_config) from [<c000a810>] (do_one_initcall+0x100/0x1cc)
       r10:c06fb83c r9:00000000 r8:c06ebef8 r7:c0736000 r6:c0710918 r5:c0710918
       r4:ef2f8f80
      [<c000a710>] (do_one_initcall) from [<c06ccddc>] (kernel_init_freeable+0x11c/0x1
      ec)
       r10:c06fb83c r9:00000000 r8:0000009a r7:c0736000 r6:c0706bf0 r5:c06fb834
       r4:00000007
      [<c06cccc0>] (kernel_init_freeable) from [<c0514c54>] (kernel_init+0x14/0xec)
       r10:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0514c40 r4:c0736000
      [<c0514c40>] (kernel_init) from [<c0010458>] (ret_from_fork+0x14/0x3c)
       r4:00000000 r3:ef04e000
      
      Fix up the free_irq() call order and add a new label on the error path.
      
      Fixes: 22d4df8f ("ravb: Add support for r8a7795 SoC")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Acked-by: default avatarSimon Horman <horms+renesas@verge.net.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      508dc064
  4. 18 Nov, 2015 9 commits
  5. 17 Nov, 2015 17 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7f151f1d
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix list tests in netfilter ingress support, from Florian Westphal.
      
       2) Fix reversal of input and output interfaces in ingress hook
          invocation, from Pablo Neira Ayuso.
      
       3) We have a use after free in r8169, caught by Dave Jones, fixed by
          Francois Romieu.
      
       4) Splice use-after-free fix in AF_UNIX frmo Hannes Frederic Sowa.
      
       5) Three ipv6 route handling bug fixes from Martin KaFai Lau:
          a) Don't create clone routes not managed by the fib6 tree
          b) Don't forget to check expiration of DST_NOCACHE routes.
          c) Handle rt->dst.from == NULL properly.
      
       6) Several AF_PACKET fixes wrt transport header setting and SKB
          protocol setting, from Daniel Borkmann.
      
       7) Fix thunder driver crash on shutdown, from Pavel Fedin.
      
       8) Several Mellanox driver fixes (max MTU calculations, use of correct
          DMA unmap in TX path, etc.) from Saeed Mahameed, Tariq Toukan, Doron
          Tsur, Achiad Shochat, Eran Ben Elisha, and Noa Osherovich.
      
       9) Several mv88e6060 DSA driver fixes (wrong bit definitions for
          certain registers, etc.) from Neil Armstrong.
      
      10) Make sure to disable preemption while updating per-cpu stats of ip
          tunnels, from Jason A.  Donenfeld.
      
      11) Various ARM64 bpf JIT fixes, from Yang Shi.
      
      12) Flush icache properly in ARM JITs, from Daniel Borkmann.
      
      13) Fix masking of RX and TX interrupts in ravb driver, from Masaru
          Nagai.
      
      14) Fix netdev feature propagation for devices not implementing
          ->ndo_set_features().  From Nikolay Aleksandrov.
      
      15) Big endian fix in vmxnet3 driver, from Shrikrishna Khare.
      
      16) RAW socket code increments incorrect SNMP counters, fix from Ben
          Cartwright-Cox.
      
      17) IPv6 multicast SNMP counters are bumped twice, fix from Neil Horman.
      
      18) Fix handling of VLAN headers on stacked devices when REORDER is
          disabled.  From Vlad Yasevich.
      
      19) Fix SKB leaks and use-after-free in ipvlan and macvlan drivers, from
          Sabrina Dubroca.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
        MAINTAINERS: Update Mellanox's Eth NIC driver entries
        net/core: revert "net: fix __netdev_update_features return.." and add comment
        af_unix: take receive queue lock while appending new skb
        rtnetlink: fix frame size warning in rtnl_fill_ifinfo
        net: use skb_clone to avoid alloc_pages failure.
        packet: Use PAGE_ALIGNED macro
        packet: Don't check frames_per_block against negative values
        net: phy: Use interrupts when available in NOLINK state
        phy: marvell: Add support for 88E1540 PHY
        arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS
        macvlan: fix leak in macvlan_handle_frame
        ipvlan: fix use after free of skb
        ipvlan: fix leak in ipvlan_rcv_frame
        vlan: Do not put vlan headers back on bridge and macvlan ports
        vlan: Fix untag operations of stacked vlans with REORDER_HEADER off
        via-velocity: unconditionally drop frames with bad l2 length
        ipg: Remove ipg driver
        dl2k: Add support for IP1000A-based cards
        snmp: Remove duplicate OUTMCAST stat increment
        net: thunder: Check for driver data in nicvf_remove()
        ...
      7f151f1d
    • Or Gerlitz's avatar
      MAINTAINERS: Update Mellanox's Eth NIC driver entries · e7523a49
      Or Gerlitz authored
      Eugenia (Jenny) Emantayev is replacing Amir Vadai as the
      mlx4 Ethernet driver maintainer.
      
      Saeed Mahameed is assigned to maintain mlx5 Eth functionality.
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e7523a49
    • Nikolay Aleksandrov's avatar
      net/core: revert "net: fix __netdev_update_features return.." and add comment · 17b85d29
      Nikolay Aleksandrov authored
      This reverts commit 00ee5927 ("net: fix __netdev_update_features return
      on ndo_set_features failure")
      and adds a comment explaining why it's okay to return a value other than
      0 upon error. Some drivers might actually change flags and return an
      error so it's better to fire a spurious notification rather than miss
      these.
      
      CC: Michał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      17b85d29
    • Hannes Frederic Sowa's avatar
      af_unix: take receive queue lock while appending new skb · a3a116e0
      Hannes Frederic Sowa authored
      While possibly in future we don't necessarily need to use
      sk_buff_head.lock this is a rather larger change, as it affects the
      af_unix fd garbage collector, diag and socket cleanups. This is too much
      for a stable patch.
      
      For the time being grab sk_buff_head.lock without disabling bh and irqs,
      so don't use locked skb_queue_tail.
      
      Fixes: 869e7c62 ("net: af_unix: implement stream sendpage support")
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a3a116e0
    • Hannes Frederic Sowa's avatar
      rtnetlink: fix frame size warning in rtnl_fill_ifinfo · b22b941b
      Hannes Frederic Sowa authored
      Fix the following warning:
      
        CC      net/core/rtnetlink.o
      net/core/rtnetlink.c: In function ‘rtnl_fill_ifinfo’:
      net/core/rtnetlink.c:1308:1: warning: the frame size of 2864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
       }
       ^
      by splitting up the huge rtnl_fill_ifinfo into some smaller ones, so we
      don't have the huge frame allocations at the same time.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b22b941b
    • Martin Zhang's avatar
      net: use skb_clone to avoid alloc_pages failure. · 19125c1a
      Martin Zhang authored
      1. new skb only need dst and ip address(v4 or v6).
      2. skb_copy may need high order pages, which is very rare on long running server.
      Signed-off-by: default avatarJunwei Zhang <linggao.zjw@alibaba-inc.com>
      Signed-off-by: default avatarMartin Zhang <martinbj2008@gmail.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      19125c1a
    • Tobias Klauser's avatar
      packet: Use PAGE_ALIGNED macro · 90836b67
      Tobias Klauser authored
      Use PAGE_ALIGNED(...) instead of open-coding it.
      Signed-off-by: default avatarTobias Klauser <tklauser@distanz.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      90836b67
    • Tobias Klauser's avatar
      packet: Don't check frames_per_block against negative values · 4194b491
      Tobias Klauser authored
      rb->frames_per_block is an unsigned int, thus can never be negative.
      
      Also fix spacing in the calculation of frames_per_block.
      Signed-off-by: default avatarTobias Klauser <tklauser@distanz.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4194b491
    • Andrew Lunn's avatar
      net: phy: Use interrupts when available in NOLINK state · 321beec5
      Andrew Lunn authored
      The NOLINK state will poll the phy once a second to see if the link
      has come up. If the phy has an interrupt line, this polling can be
      skipped, since the phy should interrupt when the link returns.
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      321beec5
    • Andrew Lunn's avatar
      phy: marvell: Add support for 88E1540 PHY · 819ec8e1
      Andrew Lunn authored
      The 88E1540 can be found embedded in the Marvell 88E6352 switch.  It
      is compatible with the 88E1510, so add support for it, using the
      88E1510 specific functions.
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      819ec8e1
    • Yang Shi's avatar
      arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS · ec0738db
      Yang Shi authored
      Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
      in prologue in order to get the correct stack backtrace.
      
      However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to
      change during function call so it may cause the BPF prog stack base address
      change too.
      
      Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
      saved register, so it will keep intact during function call.
      It is initialized in BPF prog prologue when BPF prog is started to run
      everytime. Save and restore x25/x26 in BPF prologue and epilogue to keep
      them intact for the outside of BPF. Actually, x26 is unnecessary, but SP
      requires 16 bytes alignment.
      
      So, the BPF stack layout looks like:
      
                                       high
               original A64_SP =>   0:+-----+ BPF prologue
                                      |FP/LR|
               current A64_FP =>  -16:+-----+
                                      | ... | callee saved registers
                                      +-----+
                                      |     | x25/x26
               BPF fp register => -80:+-----+
                                      |     |
                                      | ... | BPF prog stack
                                      |     |
                                      |     |
               current A64_SP =>      +-----+
                                      |     |
                                      | ... | Function call stack
                                      |     |
                                      +-----+
                                        low
      
      CC: Zi Shen Lim <zlim.lnx@gmail.com>
      CC: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      Acked-by: default avatarZi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ec0738db
    • Sabrina Dubroca's avatar
      macvlan: fix leak in macvlan_handle_frame · e639b8d8
      Sabrina Dubroca authored
      Reset pskb in macvlan_handle_frame in case skb_share_check returned a
      clone.
      
      Fixes: 8a4eb573 ("net: introduce rx_handler results and logic around that")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e639b8d8
    • Sabrina Dubroca's avatar
      ipvlan: fix use after free of skb · a534dc52
      Sabrina Dubroca authored
      ipvlan_handle_frame is a rx_handler, and when it returns a value other
      than RX_HANDLER_CONSUMED (here, NET_RX_DROP aka RX_HANDLER_ANOTHER),
      __netif_receive_skb_core expects that the skb still exists and will
      process it further, but we just freed it.
      
      Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a534dc52
    • Sabrina Dubroca's avatar
      ipvlan: fix leak in ipvlan_rcv_frame · cf554ada
      Sabrina Dubroca authored
      Pass a **skb to ipvlan_rcv_frame so that if skb_share_check returns a
      new skb, we actually use it during further processing.
      
      It's safe to ignore the new skb in the ipvlan_xmit_* functions, because
      they call ipvlan_rcv_frame with local == true, so that dev_forward_skb
      is called and always takes ownership of the skb.
      
      Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cf554ada
    • David S. Miller's avatar
      Merge branch 'vlan-reorder' · eb3f8b42
      David S. Miller authored
      Vladislav Yasevich says:
      
      ====================
      Fix issues with vlans without REORDER_HEADER
      
      A while ago Phil Sutter brought up an issue with vlans without
      REORDER_HEADER and bridges.  The problem was that if a vlan
      without REORDER_HEADER was a port in the bridge, the bridge ended
      up forwarding corrupted packets that still contained the vlan header.
      The same issue exists for bridge mode macvlan/macvtap devices.
      
      An additional issue with vlans without REORDER_HEADER is that stacking
      them also doesn't work.  The reason here is that skb_reorder_vlan_header()
      function assumes that it on ETH_HLEN bytes deep into the packet.  That
      is not the case, when you a vlan without REORRDER_HEADER flag set.
      
      This series attempts to correct these 2 issues.
      
      1) To solve the stacked vlans problem, the patch simply use
      skb->mac_len as an offset to start copying mac addresses that
      is part of header reordering.
      
      2) To fix the issue with bridge/macvlan/macvtap, the second patch
      simply doesn't write the vlan header back to the packet if the
      vlan device is either a bridge or a macvlan port.  This ends up
      being the simplest and least performance intrussive solution.
      
      I've considered extending patch 2 to all stacked devices (essentially
      checked for the presense of rx_handler), but that feels like a broader
      restriction and _may_ break existing uses.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb3f8b42
    • Vlad Yasevich's avatar
      vlan: Do not put vlan headers back on bridge and macvlan ports · 28f9ee22
      Vlad Yasevich authored
      When a vlan is configured with REORDER_HEADER set to 0, the vlan
      header is put back into the packet and makes it appear that
      the vlan header is still there even after it's been processed.
      This posses a problem for bridge and macvlan ports.  The packets
      passed to those device may be forwarded and at the time of the
      forward, vlan headers end up being unexpectedly present.
      
      With the patch, we make sure that we do not put the vlan header
      back (when REORDER_HEADER is 0) if a bridge or macvlan has
      been configured on top of the vlan device.
      Signed-off-by: default avatarVladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      28f9ee22
    • Vlad Yasevich's avatar
      vlan: Fix untag operations of stacked vlans with REORDER_HEADER off · a6e18ff1
      Vlad Yasevich authored
      When we have multiple stacked vlan devices all of which have
      turned off REORDER_HEADER flag, the untag operation does not
      locate the ethernet addresses correctly for nested vlans.
      The reason is that in case of REORDER_HEADER flag being off,
      the outer vlan headers are put back and the mac_len is adjusted
      to account for the presense of the header.  Then, the subsequent
      untag operation, for the next level vlan, always use VLAN_ETH_HLEN
      to locate the begining of the ethernet header and that ends up
      being a multiple of 4 bytes short of the actuall beginning
      of the mac header (the multiple depending on the how many vlan
      encapsulations ethere are).
      
      As a reslult, if there are multiple levles of vlan devices
      with REODER_HEADER being off, the recevied packets end up
      being dropped.
      
      To solve this, we use skb->mac_len as the offset.  The value
      is always set on receive path and starts out as a ETH_HLEN.
      The value is also updated when the vlan header manupations occur
      so we know it will be correct.
      Signed-off-by: default avatarVladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a6e18ff1