1. 24 Jan, 2018 10 commits
    • David S. Miller's avatar
      Merge branch 'kcm-fix-two-syzcaller-issues' · 88d1d76d
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      kcm: fix two syzcaller issues
      
      In this patch set:
      
      - Don't allow attaching non-TCP or listener sockets to a KCM mux.
      - In kcm_attach Check if sk_user_data is already set. This is
        under lock to avoid race conditions. More work is need to make
        all of the users of sk_user_data to use the same locking.
      
      - v2
        Remove unncessary check for not PF_KCM in kcm_attach (suggested by
        Guillaume Nault)
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      88d1d76d
    • Tom Herbert's avatar
      kcm: Check if sk_user_data already set in kcm_attach · e5571240
      Tom Herbert authored
      This is needed to prevent sk_user_data being overwritten.
      The check is done under the callback lock. This should prevent
      a socket from being attached twice to a KCM mux. It also prevents
      a socket from being attached for other use cases of sk_user_data
      as long as the other cases set sk_user_data under the lock.
      Followup work is needed to unify all the use cases of sk_user_data
      to use the same locking.
      
      Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Signed-off-by: default avatarTom Herbert <tom@quantonium.net>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e5571240
    • Tom Herbert's avatar
      kcm: Only allow TCP sockets to be attached to a KCM mux · 581e7226
      Tom Herbert authored
      TCP sockets for IPv4 and IPv6 that are not listeners or in closed
      stated are allowed to be attached to a KCM mux.
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com
      Signed-off-by: default avatarTom Herbert <tom@quantonium.net>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      581e7226
    • Wolfgang Bumiller's avatar
      net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr · d3303a65
      Wolfgang Bumiller authored
      TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as
      skb->data points to the network header.
      Use skb_mac_header instead.
      Signed-off-by: default avatarWolfgang Bumiller <w.bumiller@proxmox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d3303a65
    • Wolfgang Bumiller's avatar
      net: sched: em_nbyte: don't add the data offset twice · 560a6607
      Wolfgang Bumiller authored
      'ptr' is shifted by the offset and then validated,
      the memcmp should not add it a second time.
      Signed-off-by: default avatarWolfgang Bumiller <w.bumiller@proxmox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      560a6607
    • Yuval Mintz's avatar
      mlxsw: spectrum_router: Don't log an error on missing neighbor · 1ecdaea0
      Yuval Mintz authored
      Driver periodically samples all neighbors configured in device
      in order to update the kernel regarding their state. When finding
      an entry configured in HW that doesn't show in neigh_lookup()
      driver logs an error message.
      This introduces a race when removing multiple neighbors -
      it's possible that a given entry would still be configured in HW
      as its removal is still being processed but is already removed
      from the kernel's neighbor tables.
      
      Simply remove the error message and gracefully accept such events.
      
      Fixes: c723c735 ("mlxsw: spectrum_router: Periodically update the kernel's neigh table")
      Fixes: 60f040ca ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours")
      Signed-off-by: default avatarYuval Mintz <yuvalm@mellanox.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1ecdaea0
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 97edf7c5
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2018-01-24
      
      1) Only offloads SAs after they are fully initialized.
         Otherwise a NIC may receive packets on a SA we can
         not yet handle in the stack.
         From Yossi Kuperman.
      
      2) Fix negative refcount in case of a failing offload.
         From Aviad Yehezkel.
      
      3) Fix inner IP ptoro version when decapsulating
         from interaddress family tunnels.
         From Yossi Kuperman.
      
      4) Use true or false for boolean variables instead of an
         integer value in xfrm_get_type_offload.
         From Gustavo A. R. Silva.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      97edf7c5
    • Neil Horman's avatar
      vmxnet3: repair memory leak · 848b1598
      Neil Horman authored
      with the introduction of commit
      b0eb57cb, it appears that rq->buf_info
      is improperly handled.  While it is heap allocated when an rx queue is
      setup, and freed when torn down, an old line of code in
      vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0]
      being set to NULL prior to its being freed, causing a memory leak, which
      eventually exhausts the system on repeated create/destroy operations
      (for example, when  the mtu of a vmxnet3 interface is changed
      frequently.
      
      Fix is pretty straight forward, just move the NULL set to after the
      free.
      
      Tested by myself with successful results
      
      Applies to net, and should likely be queued for stable, please
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-By: boyang@redhat.com
      CC: boyang@redhat.com
      CC: Shrikrishna Khare <skhare@vmware.com>
      CC: "VMware, Inc." <pv-drivers@vmware.com>
      CC: David S. Miller <davem@davemloft.net>
      Acked-by: default avatarShrikrishna Khare <skhare@vmware.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      848b1598
    • Ben Hutchings's avatar
      ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL · e9191ffb
      Ben Hutchings authored
      Commit 513674b5 ("net: reevalulate autoflowlabel setting after
      sysctl setting") removed the initialisation of
      ipv6_pinfo::autoflowlabel and added a second flag to indicate
      whether this field or the net namespace default should be used.
      
      The getsockopt() handling for this case was not updated, so it
      currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
      not explicitly enabled.  Fix it to return the effective value, whether
      that has been set at the socket or net namespace level.
      
      Fixes: 513674b5 ("net: reevalulate autoflowlabel setting after sysctl ...")
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e9191ffb
    • Guillaume Nault's avatar
      pppoe: take ->needed_headroom of lower device into account on xmit · 02612bb0
      Guillaume Nault authored
      In pppoe_sendmsg(), reserving dev->hard_header_len bytes of headroom
      was probably fine before the introduction of ->needed_headroom in
      commit f5184d26 ("net: Allow netdevices to specify needed head/tailroom").
      
      But now, virtual devices typically advertise the size of their overhead
      in dev->needed_headroom, so we must also take it into account in
      skb_reserve().
      Allocation size of skb is also updated to take dev->needed_tailroom
      into account and replace the arbitrary 32 bytes with the real size of
      a PPPoE header.
      
      This issue was discovered by syzbot, who connected a pppoe socket to a
      gre device which had dev->header_ops->create == ipgre_header and
      dev->hard_header_len == 0. Therefore, PPPoE didn't reserve any
      headroom, and dev_hard_header() crashed when ipgre_header() tried to
      prepend its header to skb->data.
      
      skbuff: skb_under_panic: text:000000001d390b3a len:31 put:24
      head:00000000d8ed776f data:000000008150e823 tail:0x7 end:0xc0 dev:gre0
      ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:104!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
          (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 3670 Comm: syzkaller801466 Not tainted
      4.15.0-rc7-next-20180115+ #97
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      RIP: 0010:skb_panic+0x162/0x1f0 net/core/skbuff.c:100
      RSP: 0018:ffff8801d9bd7840 EFLAGS: 00010282
      RAX: 0000000000000083 RBX: ffff8801d4f083c0 RCX: 0000000000000000
      RDX: 0000000000000083 RSI: 1ffff1003b37ae92 RDI: ffffed003b37aefc
      RBP: ffff8801d9bd78a8 R08: 1ffff1003b37ae8a R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff86200de0
      R13: ffffffff84a981ad R14: 0000000000000018 R15: ffff8801d2d34180
      FS:  00000000019c4880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000208bc000 CR3: 00000001d9111001 CR4: 00000000001606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        skb_under_panic net/core/skbuff.c:114 [inline]
        skb_push+0xce/0xf0 net/core/skbuff.c:1714
        ipgre_header+0x6d/0x4e0 net/ipv4/ip_gre.c:879
        dev_hard_header include/linux/netdevice.h:2723 [inline]
        pppoe_sendmsg+0x58e/0x8b0 drivers/net/ppp/pppoe.c:890
        sock_sendmsg_nosec net/socket.c:630 [inline]
        sock_sendmsg+0xca/0x110 net/socket.c:640
        sock_write_iter+0x31a/0x5d0 net/socket.c:909
        call_write_iter include/linux/fs.h:1775 [inline]
        do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653
        do_iter_write+0x154/0x540 fs/read_write.c:932
        vfs_writev+0x18a/0x340 fs/read_write.c:977
        do_writev+0xfc/0x2a0 fs/read_write.c:1012
        SYSC_writev fs/read_write.c:1085 [inline]
        SyS_writev+0x27/0x30 fs/read_write.c:1082
        entry_SYSCALL_64_fastpath+0x29/0xa0
      
      Admittedly PPPoE shouldn't be allowed to run on non Ethernet-like
      interfaces, but reserving space for ->needed_headroom is a more
      fundamental issue that needs to be addressed first.
      
      Same problem exists for __pppoe_xmit(), which also needs to take
      dev->needed_headroom into account in skb_cow_head().
      
      Fixes: f5184d26 ("net: Allow netdevices to specify needed head/tailroom")
      Reported-by: syzbot+ed0838d0fa4c4f2b528e20286e6dc63effc7c14d@syzkaller.appspotmail.com
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      02612bb0
  2. 23 Jan, 2018 4 commits
  3. 22 Jan, 2018 18 commits
  4. 21 Jan, 2018 8 commits
    • Talat Batheesh's avatar
      net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare · e58edaa4
      Talat Batheesh authored
      Helmut reported a bug about division by zero while
      running traffic and doing physical cable pull test.
      
      When the cable unplugged the ppms become zero, so when
      dividing the current ppms by the previous ppms in the
      next dim iteration there is division by zero.
      
      This patch prevent this division for both ppms and epms.
      
      Fixes: c3164d2f ("net/mlx5e: Added BW check for DIM decision mechanism")
      Reported-by: default avatarHelmut Grauer <helmut.grauer@de.ibm.com>
      Signed-off-by: default avatarTalat Batheesh <talatb@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e58edaa4
    • Linus Torvalds's avatar
      Linux 4.15-rc9 · 0c5b9b5d
      Linus Torvalds authored
      0c5b9b5d
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 55151142
      Linus Torvalds authored
      Pull x86 pti fixes from Thomas Gleixner:
       "A small set of fixes for the meltdown/spectre mitigations:
      
         - Make kprobes aware of retpolines to prevent probes in the retpoline
           thunks.
      
         - Make the machine check exception speculation protected. MCE used to
           issue an indirect call directly from the ASM entry code. Convert
           that to a direct call into a C-function and issue the indirect call
           from there so the compiler can add the retpoline protection,
      
         - Make the vmexit_fill_RSB() assembly less stupid
      
         - Fix a typo in the PTI documentation"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
        x86/pti: Document fix wrong index
        kprobes/x86: Disable optimizing on the function jumps to indirect thunk
        kprobes/x86: Blacklist indirect thunk functions for kprobes
        retpoline: Introduce start/end markers of indirect thunk
        x86/mce: Make machine check speculation protected
      55151142
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 319f1e04
      Linus Torvalds authored
      Pull x86 kexec fix from Thomas Gleixner:
       "A single fix for the WBINVD issue introduced by the SME support which
        causes kexec fails on non AMD/SME capable CPUs. Issue WBINVD only when
        the CPU has SME and avoid doing so in a loop"
      
      [ Side note: this patch fixes the problem, but it isn't entirely clear
        why it is required. The wbinvd should just work regardless, but there
        seems to be some system - as opposed to CPU - issue, since the wbinvd
        causes more problems later in the shutdown sequence, but wbinvd
        instructions while the system is still active are not problematic.
      
        Possibly some SMI or pending machine check issue on the affected system ]
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Rework wbinvd, hlt operation in stop_this_cpu()
      319f1e04
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 66f81624
      Linus Torvalds authored
      Pull irq fix from Thomas Gleixner:
       "A single fix for the new matrix allocator to prevent vector exhaustion
        by certain network drivers which allocate gazillions of unused vectors
        which cannot be put into reservation mode due to MSI and the lack of
        MSI entry masking.
      
        The fix/workaround is to spread the vectors across CPUs by searching
        the supplied target CPU mask for the CPU with the smallest number of
        allocated vectors"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irq/matrix: Spread interrupts on allocation
      66f81624
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha · d517bb79
      Linus Torvalds authored
      Pull alpha fixes from Matt Turner:
       "A build fix and a regression fix"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
        alpha/PCI: Fix noname IRQ level detection
        alpha: extend memset16 to EV6 optimised routines
      d517bb79
    • Laura Abbott's avatar
      x86: Use __nostackprotect for sme_encrypt_kernel · 91cfc88c
      Laura Abbott authored
      Commit bacf6b49 ("x86/mm: Use a struct to reduce parameters for SME
      PGD mapping") moved some parameters into a structure.
      
      The structure was large enough to trigger the stack protection canary in
      sme_encrypt_kernel which doesn't work this early, causing reboots.
      
      Mark sme_encrypt_kernel appropriately to not use the canary.
      
      Fixes: bacf6b49 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping")
      Signed-off-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      91cfc88c
    • Lorenzo Pieralisi's avatar
      alpha/PCI: Fix noname IRQ level detection · 86be8993
      Lorenzo Pieralisi authored
      The conversion of the alpha architecture PCI host bridge legacy IRQ
      mapping/swizzling to the new PCI host bridge map/swizzle hooks carried
      out through:
      
      commit 0e4c2eeb ("alpha/PCI: Replace pci_fixup_irqs() call with
      host bridge IRQ mapping hooks")
      
      implies that IRQ for devices are now allocated through pci_assign_irq()
      function in pci_device_probe() that is called when a driver matching a
      device is found in order to probe the device through the device driver.
      
      Alpha noname platforms required IRQ level programming to be executed
      in sio_fixup_irq_levels(), that is called in noname_init_pci(), a
      platform hook called within a subsys_initcall.
      
      In noname_init_pci(), present IRQs are detected through
      sio_collect_irq_levels() that check the struct pci_dev->irq number
      to detect if an IRQ has been allocated for the device.
      
      By the time sio_collect_irq_levels() is called, some devices may still
      have not a matching driver loaded to match them (eg loadable module)
      therefore their IRQ allocation is still pending - which means that
      sio_collect_irq_levels() does not programme the correct IRQ level for
      those devices, causing their IRQ handling to be broken when the device
      driver is actually loaded and the device is probed.
      
      Fix the issue by adding code in the noname map_irq() function
      (noname_map_irq()) that, whilst mapping/swizzling the IRQ line, it also
      ensures that the correct IRQ level programming is executed at platform
      level, fixing the issue.
      
      Fixes: 0e4c2eeb ("alpha/PCI: Replace pci_fixup_irqs() call with
      host bridge IRQ mapping hooks")
      Reported-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: stable@vger.kernel.org # 4.14
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Meelis Roos <mroos@linux.ee>
      Signed-off-by: default avatarMatt Turner <mattst88@gmail.com>
      86be8993