1. 17 Nov, 2014 2 commits
    • Linus Lüssing's avatar
      bridge: fix netfilter/NF_BR_LOCAL_OUT for own, locally generated queries · f0b4eece
      Linus Lüssing authored
      Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected
      for both locally generated IGMP and MLD queries. The IP header specific
      filter options are off by 14 Bytes for netfilter (actual output on
      interfaces is fine).
      
      NF_HOOK() expects the skb->data to point to the IP header, not the
      ethernet one (while dev_queue_xmit() does not). Luckily there is an
      br_dev_queue_push_xmit() helper function already - let's just use that.
      
      Introduced by eb1d1641
      ("bridge: Add core IGMP snooping support")
      
      Ebtables example:
      
      $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \
      	--log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP
      
      before (broken):
      
      ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
      	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
      	SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \
      	DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \
      	IPv6 priority=0x3, Next Header=2
      
      after (working):
      
      ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
      	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
      	SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \
      	DST=ff02:0000:0000:0000:0000:0000:0000:0001, \
      	IPv6 priority=0x0, Next Header=0
      Signed-off-by: default avatarLinus Lüssing <linus.luessing@web.de>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      f0b4eece
    • Pablo Neira Ayuso's avatar
      netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind · 97840cb6
      Pablo Neira Ayuso authored
      Make sure the netlink group exists, otherwise you can trigger an out
      of bound array memory access from the netlink_bind() path. This splat
      can only be triggered only by superuser.
      
      [  180.203600] UBSan: Undefined behaviour in ../net/netfilter/nfnetlink.c:467:28
      [  180.204249] index 9 is out of range for type 'int [9]'
      [  180.204697] CPU: 0 PID: 1771 Comm: trinity-main Not tainted 3.18.0-rc4-mm1+ #122
      [  180.205365] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org
      +04/01/2014
      [  180.206498]  0000000000000018 0000000000000000 0000000000000009 ffff88007bdf7da8
      [  180.207220]  ffffffff82b0ef5f 0000000000000092 ffffffff845ae2e0 ffff88007bdf7db8
      [  180.207887]  ffffffff8199e489 ffff88007bdf7e18 ffffffff8199ea22 0000003900000000
      [  180.208639] Call Trace:
      [  180.208857] dump_stack (lib/dump_stack.c:52)
      [  180.209370] ubsan_epilogue (lib/ubsan.c:174)
      [  180.209849] __ubsan_handle_out_of_bounds (lib/ubsan.c:400)
      [  180.210512] nfnetlink_bind (net/netfilter/nfnetlink.c:467)
      [  180.210986] netlink_bind (net/netlink/af_netlink.c:1483)
      [  180.211495] SYSC_bind (net/socket.c:1541)
      
      Moreover, define the missing nf_tables and nf_acct multicast groups too.
      Reported-by: default avatarAndrey Ryabinin <a.ryabinin@samsung.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      97840cb6
  2. 16 Nov, 2014 9 commits
    • Daniel Borkmann's avatar
      ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs · feb91a02
      Daniel Borkmann authored
      It has been reported that generating an MLD listener report on
      devices with large MTUs (e.g. 9000) and a high number of IPv6
      addresses can trigger a skb_over_panic():
      
      skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20
      head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0
      dev:port1
       ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:100!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: ixgbe(O)
      CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff80578226>] ? skb_put+0x3a/0x3b
       [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e
       [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4
       [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d
       [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45
       [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68
       [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182
       [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d
       [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3
       [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46
       [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70
      
      mld_newpack() skb allocations are usually requested with dev->mtu
      in size, since commit 72e09ad1 ("ipv6: avoid high order allocations")
      we have changed the limit in order to be less likely to fail.
      
      However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb)
      macros, which determine if we may end up doing an skb_put() for
      adding another record. To avoid possible fragmentation, we check
      the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong
      assumption as the actual max allocation size can be much smaller.
      
      The IGMP case doesn't have this issue as commit 57e1ab6e
      ("igmp: refine skb allocations") stores the allocation size in
      the cb[].
      
      Set a reserved_tailroom to make it fit into the MTU and use
      skb_availroom() helper instead. This also allows to get rid of
      igmp_skb_size().
      Reported-by: default avatarWei Liu <lw1a2.jing@gmail.com>
      Fixes: 72e09ad1 ("ipv6: avoid high order allocations")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: David L Stevens <david.stevens@oracle.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      feb91a02
    • Martin Hauke's avatar
      qmi_wwan: Add support for HP lt4112 LTE/HSPA+ Gobi 4G Modem · bb2bdeb8
      Martin Hauke authored
      Added the USB VID/PID for the HP lt4112 LTE/HSPA+ Gobi 4G Modem (Huawei me906e)
      Signed-off-by: default avatarMartin Hauke <mardnh@gmx.de>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bb2bdeb8
    • David S. Miller's avatar
      Merge branch 'net_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch · c6ab766e
      David S. Miller authored
      Pravin B Shelar says:
      
      ====================
      Open vSwitch
      
      Following fixes are accumulated in ovs-repo.
      Three of them are related to protocol processing, one is
      related to memory leak in case of error and one is to
      fix race.
      Patch "Validate IPv6 flow key and mask values" has conflicts
      with net-next, Let me know if you want me to send the patch
      for net-next.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c6ab766e
    • Anish Bhatt's avatar
      dcbnl : Disable software interrupts before taking dcb_lock · 52cff74e
      Anish Bhatt authored
      Solves possible lockup issues that can be seen from firmware DCB agents calling
      into the DCB app api.
      
      DCB firmware event queues can be tied in with NAPI so that dcb events are
      generated in softIRQ context. This can results in calls to dcb_*app()
      functions which try to take the dcb_lock.
      
      If the the event triggers while we also have the dcb_lock because lldpad or
      some other agent happened to be issuing a  get/set command we could see a cpu
      lockup.
      
      This code was not originally written with firmware agents in mind, hence
      grabbing dcb_lock from softIRQ context was not considered.
      Signed-off-by: default avatarAnish Bhatt <anish@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      52cff74e
    • Alexey Khoroshilov's avatar
      ieee802154: fix error handling in ieee802154fake_probe() · 8c2dd544
      Alexey Khoroshilov authored
      In case of any failure ieee802154fake_probe() just calls unregister_netdev().
      But it does not look safe to unregister netdevice before it was registered.
      
      The patch implements straightforward resource deallocation in case of
      failure in ieee802154fake_probe().
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c2dd544
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · f1227c5c
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter updates for your net tree,
      they are:
      
      1) Fix missing initialization of the range structure (allocated in the
         stack) in nft_masq_{ipv4, ipv6}_eval, from Daniel Borkmann.
      
      2) Make sure the data we receive from userspace contains the req_version
         structure, otherwise return an error incomplete on truncated input.
         From Dan Carpenter.
      
      3) Fix handling og skb->sk which may cause incorrect handling
         of connections from a local process. Via Simon Horman, patch from
         Calvin Owens.
      
      4) Fix wrong netns in nft_compat when setting target and match params
         structure.
      
      5) Relax chain type validation in nft_compat that was recently included,
         this broke the matches that need to be run from the route chain type.
         Now iptables-test.py automated regression tests report success again
         and we avoid the only possible problematic case, which is the use of
         nat targets out of nat chain type.
      
      6) Use match->table to validate the tablename, instead of the match->name.
         Again patch for nft_compat.
      
      7) Restore the synchronous release of objects from the commit and abort
         path in nf_tables. This is causing two major problems: splats when using
         nft_compat, given that matches and targets may sleep and call_rcu is
         invoked from softirq context. Moreover Patrick reported possible event
         notification reordering when rules refer to anonymous sets.
      
      8) Fix race condition in between packets that are being confirmed by
         conntrack and the ctnetlink flush operation. This happens since the
         removal of the central spinlock. Thanks to Jesper D. Brouer to looking
         into this.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f1227c5c
    • John Ogness's avatar
      drivers: net: cpsw: Fix TX_IN_SEL offset · 35717d8d
      John Ogness authored
      The TX_IN_SEL offset for the CPSW_PORT/TX_IN_CTL register was
      incorrect. This caused the Dual MAC mode to never get set when
      it should. It also caused possible unintentional setting of a
      bit in the CPSW_PORT/TX_BLKS_REM register.
      
      The purpose of setting the Dual MAC mode for this register is to:
      
          "... allow packets from both ethernet ports to be written into
           the FIFO without one port starving the other port."
      					- AM335x ARM TRM
      Signed-off-by: default avatarJohn Ogness <john.ogness@linutronix.de>
      Reviewed-by: default avatarMugunthan V N <mugunthanvnm@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      35717d8d
    • Hannes Frederic Sowa's avatar
      reciprocal_div: objects with exported symbols should be obj-y rather than lib-y · 9f458945
      Hannes Frederic Sowa authored
      Otherwise the exported symbols might be discarded because of no users
      in vmlinux.
      Reported-by: default avatarJim Davis <jim.epost@gmail.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9f458945
    • Panu Matilainen's avatar
      ipv4: Fix incorrect error code when adding an unreachable route · 49dd18ba
      Panu Matilainen authored
      Trying to add an unreachable route incorrectly returns -ESRCH if
      if custom FIB rules are present:
      
      [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4
      RTNETLINK answers: Network is unreachable
      [root@localhost ~]# ip rule add to 55.66.77.88 table 200
      [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4
      RTNETLINK answers: No such process
      [root@localhost ~]#
      
      Commit 83886b6b ("[NET]: Change "not found"
      return value for rule lookup") changed fib_rules_lookup()
      to use -ESRCH as a "not found" code internally, but for user space it
      should be translated into -ENETUNREACH. Handle the translation centrally in
      ipv4-specific fib_lookup(), leaving the DECnet case alone.
      
      On a related note, commit b7a71b51
      ("ipv4: removed redundant conditional") removed a similar translation from
      ip_route_input_slow() prematurely AIUI.
      
      Fixes: b7a71b51 ("ipv4: removed redundant conditional")
      Signed-off-by: default avatarPanu Matilainen <pmatilai@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      49dd18ba
  3. 14 Nov, 2014 29 commits
    • Jarno Rajahalme's avatar
      openvswitch: Validate IPv6 flow key and mask values. · fecaef85
      Jarno Rajahalme authored
      Reject flow label key and mask values with invalid bits set.
      Introduced by commit 3fdbd1ce ("openvswitch: add ipv6 'set'
      action").
      Signed-off-by: default avatarJarno Rajahalme <jrajahalme@nicira.com>
      Acked-by: default avatarJesse Gross <jesse@nicira.com>
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      fecaef85
    • Pravin B Shelar's avatar
      openvswitch: Convert dp rcu read operation to locked operations · 8ec609d8
      Pravin B Shelar authored
      dp read operations depends on ovs_dp_cmd_fill_info(). This API
      needs to looup vport to find dp name, but vport lookup can
      fail. Therefore to keep vport reference alive we need to
      take ovs lock.
      
      Introduced by commit 6093ae9a ("openvswitch: Minimize
      dp and vport critical sections").
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      Acked-by: default avatarAndy Zhou <azhou@nicira.com>
      8ec609d8
    • Daniele Di Proietto's avatar
      openvswitch: Fix NDP flow mask validation · 19e7a3df
      Daniele Di Proietto authored
      match_validate() enforce that a mask matching on NDP attributes has also an
      exact match on ICMPv6 type.
      The ICMPv6 type, which is 8-bit wide, is stored in the 'tp.src' field of
      'struct sw_flow_key', which is 16-bit wide.
      Therefore, an exact match on ICMPv6 type should only check the first 8 bits.
      
      This commit fixes a bug that prevented flows with an exact match on NDP field
      from being installed
      Introduced by commit 03f0d916 ("openvswitch: Mega flow implementation").
      Signed-off-by: default avatarDaniele Di Proietto <ddiproietto@vmware.com>
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      19e7a3df
    • Jesse Gross's avatar
      openvswitch: Fix checksum calculation when modifying ICMPv6 packets. · 856447d0
      Jesse Gross authored
      The checksum of ICMPv6 packets uses the IP pseudoheader as part of
      the calculation, unlike ICMP in IPv4. This was not implemented,
      which means that modifying the IP addresses of an ICMPv6 packet
      would cause the checksum to no longer be correct as the psuedoheader
      did not match.
      Introduced by commit 3fdbd1ce ("openvswitch: add ipv6 'set' action").
      Reported-by: default avatarNeal Shrader <icosahedral@gmail.com>
      Signed-off-by: default avatarJesse Gross <jesse@nicira.com>
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      856447d0
    • Pravin B Shelar's avatar
      openvswitch: Fix memory leak. · ab64f16f
      Pravin B Shelar authored
      Need to free memory in case of sample action error.
      
      Introduced by commit 651887b0 ("openvswitch: Sample
      action without side effects").
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      ab64f16f
    • David S. Miller's avatar
      Merge branch 'vxlan_gso_check' · b7f4c0ba
      David S. Miller authored
      Joe Stringer says:
      
      ====================
      Implement ndo_gso_check() for vxlan nics
      
      Most NICs that report NETIF_F_GSO_UDP_TUNNEL support VXLAN, and not other
      UDP-based encapsulation protocols where the format and size of the header may
      differ. This patch series implements a generic ndo_gso_check() for detecting
      VXLAN, then reuses it for these NICs.
      
      Implementation shamelessly stolen from Tom Herbert (with minor fixups):
      http://thread.gmane.org/gmane.linux.network/332428/focus=333111
      
      v2: Drop i40e/fm10k patches (code diverged; handling separately).
          Refactor common code into vxlan_gso_check() helper.
          Minor style fixes.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b7f4c0ba
    • Joe Stringer's avatar
      qlcnic: Implement ndo_gso_check() · 795a05c1
      Joe Stringer authored
      Use vxlan_gso_check() to advertise offload support for this NIC.
      Signed-off-by: default avatarJoe Stringer <joestringer@nicira.com>
      Acked-by: default avatarShahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      795a05c1
    • Joe Stringer's avatar
      net/mlx4_en: Implement ndo_gso_check() · 956bdab2
      Joe Stringer authored
      Use vxlan_gso_check() to advertise offload support for this NIC.
      Signed-off-by: default avatarJoe Stringer <joestringer@nicira.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      956bdab2
    • Joe Stringer's avatar
      be2net: Implement ndo_gso_check() · 725d548f
      Joe Stringer authored
      Use vxlan_gso_check() to advertise offload support for this NIC.
      Signed-off-by: default avatarJoe Stringer <joestringer@nicira.com>
      Acked-by: default avatarSathya Perla <sperla@emulex.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      725d548f
    • Joe Stringer's avatar
      net: Add vxlan_gso_check() helper · 23e62de3
      Joe Stringer authored
      Most NICs that report NETIF_F_GSO_UDP_TUNNEL support VXLAN, and not
      other UDP-based encapsulation protocols where the format and size of the
      header differs. This patch implements a generic ndo_gso_check() for
      VXLAN which will only advertise GSO support when the skb looks like it
      contains VXLAN (or no UDP tunnelling at all).
      
      Implementation shamelessly stolen from Tom Herbert:
      http://thread.gmane.org/gmane.linux.network/332428/focus=333111Signed-off-by: default avatarJoe Stringer <joestringer@nicira.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      23e62de3
    • David S. Miller's avatar
      Merge tag 'master-2014-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · 8a5809e0
      David S. Miller authored
      John W. Linville says:
      
      ====================
      pull request: wireless 2014-11-13
      
      Please pull this set of a few more wireless fixes intended for the
      3.18 stream...
      
      For the mac80211 bits, Johannes says:
      
      "This has just one fix, for an issue with the CCMP decryption
      that can cause a kernel crash. I'm not sure it's remotely
      exploitable, but it's an important fix nonetheless."
      
      For the iwlwifi bits, Emmanuel says:
      
      "Two fixes here - we weren't updating mac80211 if a scan
      was cut short by RFKILL which confused cfg80211. As a
      result, the latter wouldn't allow to run another scan.
      Liad fixes a small bug in the firmware dump."
      
      On top of that...
      
      Arend van Spriel corrects a channel width conversion that caused a
      WARNING in brcmfmac.
      
      Hauke Mehrtens avoids a NULL pointer dereference in b43.
      
      Larry Finger hits a trio of rtlwifi bugs left over from recent
      backporting from the Realtek vendor driver.
      
      Miaoqing Pan fixes a clocking problem in ath9k that could affect
      packet timestamps and such.
      
      Stanislaw Gruszka addresses an payload alignment issue that has been
      plaguing rt2x00.
      
      Please let me know if there are problems!
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8a5809e0
    • Vincent BENAYOUN's avatar
      inetdevice: fixed signed integer overflow · 84bc8868
      Vincent BENAYOUN authored
      There could be a signed overflow in the following code.
      
      The expression, (32-logmask) is comprised between 0 and 31 included.
      It may be equal to 31.
      In such a case the left shift will produce a signed integer overflow.
      According to the C99 Standard, this is an undefined behavior.
      A simple fix is to replace the signed int 1 with the unsigned int 1U.
      Signed-off-by: default avatarVincent BENAYOUN <vincent.benayoun@trust-in-soft.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      84bc8868
    • bill bonaparte's avatar
      netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse · 5195c14c
      bill bonaparte authored
      After removal of the central spinlock nf_conntrack_lock, in
      commit 93bb0ceb ("netfilter: conntrack: remove central
      spinlock nf_conntrack_lock"), it is possible to race against
      get_next_corpse().
      
      The race is against the get_next_corpse() cleanup on
      the "unconfirmed" list (a per-cpu list with seperate locking),
      which set the DYING bit.
      
      Fix this race, in __nf_conntrack_confirm(), by removing the CT
      from unconfirmed list before checking the DYING bit.  In case
      race occured, re-add the CT to the dying list.
      
      While at this, fix coding style of the comment that has been
      updated.
      
      Fixes: 93bb0ceb ("netfilter: conntrack: remove central spinlock nf_conntrack_lock")
      Reported-by: default avatarbill bonaparte <programme110@gmail.com>
      Signed-off-by: default avatarbill bonaparte <programme110@gmail.com>
      Signed-off-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      5195c14c
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · b23dc5a7
      Linus Torvalds authored
      Pull virtio bugfix from Michael S Tsirkin:
       "This fixes a crash in virtio console multi-channel mode that got
        introduced in -rc1"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_console: move early VQ enablement
      b23dc5a7
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 5cf52037
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) sunhme driver lacks DMA mapping error checks, based upon a report by
          Meelis Roos.
      
       2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee.
      
       3) DMA memory allocation sizes are wrong in systemport ethernet driver,
          fix from Florian Fainelli.
      
       4) Fix use after free in mac80211 defragmentation code, from Johannes
          Berg.
      
       5) Some networking uapi headers missing from Kbuild file, from Stephen
          Hemminger.
      
       6) TUN driver gets csum_start offset wrong when VLAN accel is enabled,
          and macvtap has a similar bug, from Herbert Xu.
      
       7) Adjust several tunneling drivers to set dev->iflink after registry,
          because registry sets that to -1 overwriting whatever we did.  From
          Steffen Klassert.
      
       8) Geneve forgets to set inner tunneling type, causing GSO segmentation
          to fail on some NICs.  From Jesse Gross.
      
       9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and
          Giuseppe CAVALLARO.
      
      10) Fix spurious timeouts with NewReno on low traffic connections, from
          Marcelo Leitner.
      
      11) Fix descriptor updates in enic driver, from Govindarajulu
          Varadarajan.
      
      12) PPP calls bpf_prog_create() with locks held, which isn't kosher.
          Fix from Takashi Iwai.
      
      13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel
          Borkmann.
      
      14) psock_fanout selftest accesses past the end of the mmap ring, fix
          from Shuah Khan.
      
      15) Fix PTP timestamping for VLAN packets, from Richard Cochran.
      
      16) netlink_unbind() calls in netlink pass wrong initial argument, from
          Hiroaki SHIMODA.
      
      17) vxlan socket reuse accidently reuses a socket when the address
          family is different, so we have to explicitly check this, from
          Marcelo Lietner.
      
      18) Fix missing include in nft_reject_bridge.c breaking the build on ppc
          and other architectures, from Guenter Roeck.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
        vxlan: Do not reuse sockets for a different address family
        smsc911x: power-up phydev before doing a software reset.
        lib: rhashtable - Remove weird non-ASCII characters from comments
        net/smsc911x: Fix delays in the PHY enable/disable routines
        net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
        netlink: Properly unbind in error conditions.
        net: ptp: fix time stamp matching logic for VLAN packets.
        cxgb4 : dcb open-lldp interop fixes
        selftests/net: psock_fanout seg faults in sock_fanout_read_ring()
        net: bcmgenet: apply MII configuration in bcmgenet_open()
        net: bcmgenet: connect and disconnect from the PHY state machine
        net: qualcomm: Fix dependency
        ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx
        net: phy: Correctly handle MII ioctl which changes autonegotiation.
        ipv6: fix IPV6_PKTINFO with v4 mapped
        net: sctp: fix memory leak in auth key management
        net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
        net: ppp: Don't call bpf_prog_create() in ppp_lock
        net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set
        cxgb4 : Fix bug in DCB app deletion
        ...
      5cf52037
    • Linus Torvalds's avatar
      Merge branch 'akpm' (fixes from Andrew Morton) · 971ad4e4
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "15 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        MAINTAINERS: add IIO include files
        kernel/panic.c: update comments for print_tainted
        mem-hotplug: reset node present pages when hot-adding a new pgdat
        mem-hotplug: reset node managed pages when hot-adding a new pgdat
        mm/debug-pagealloc: correct freepage accounting and order resetting
        fanotify: fix notification of groups with inode & mount marks
        mm, compaction: prevent infinite loop in compact_zone
        mm: alloc_contig_range: demote pages busy message from warn to info
        mm/slab: fix unalignment problem on Malta with EVA due to slab merge
        mm/page_alloc: restrict max order of merging on isolated pageblock
        mm/page_alloc: move freepage counting logic to __free_one_page()
        mm/page_alloc: add freepage on isolate pageblock to correct buddy list
        mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
        mm/compaction: skip the range until proper target pageblock is met
        zram: avoid kunmap_atomic() of a NULL pointer
      971ad4e4
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · b0ab3f19
      Linus Torvalds authored
      Pull Ceph fixes from Sage Weil:
       "There is an overflow bug fix for cephfs from Zheng, a fix for handling
        large authentication ticket buffers in libceph from Ilya, and a few
        fixes for the request handling code from Ilya that affect RBD volumes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
        libceph: change from BUG to WARN for __remove_osd() asserts
        libceph: clear r_req_lru_item in __unregister_linger_request()
        libceph: unlink from o_linger_requests when clearing r_osd
        libceph: do not crash on large auth tickets
        ceph: fix flush tid comparision
      b0ab3f19
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 6b07974a
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - fix for an oops in HID core upon repeated subdriver insertion/removal
         under certain circumstances, by Benjamin Tissoires
      
       - quirk for another Elan Touchscreen device, by Adel Gadllah
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: core: cleanup .claimed field on disconnect
        HID: usbhid: enable always-poll quirk for Elan Touchscreen 0103
      6b07974a
    • Daniel Baluta's avatar
      MAINTAINERS: add IIO include files · 8fe671fc
      Daniel Baluta authored
      Files under include/linux/iio were not reported as part of the IIO
      subsystem.
      Signed-off-by: default avatarDaniel Baluta <daniel.baluta@intel.com>
      Reported-by: default avatarCristina Ciocan <cristina.ciocan@intel.com>
      Reviewed-by: default avatarJingoo Han <jg1.han@samsung.com>
      Cc: Hartmut Knaack <knaack.h@gmx.de>
      Cc: Lars-Peter Clausen <lars@metafoo.de>
      Cc: Peter Meerwald <pmeerw@pmeerw.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8fe671fc
    • Xie XiuQi's avatar
      kernel/panic.c: update comments for print_tainted · bc53a3f4
      Xie XiuQi authored
      Commit 69361eef ("panic: add TAINT_SOFTLOCKUP") added the 'L' flag,
      but failed to update the comments for print_tainted().  So, update the
      comments.
      Signed-off-by: default avatarXie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bc53a3f4
    • Tang Chen's avatar
      mem-hotplug: reset node present pages when hot-adding a new pgdat · 0bd85420
      Tang Chen authored
      When memory is hot-added, all the memory is in offline state.  So clear
      all zones' present_pages because they will be updated in online_pages()
      and offline_pages().  Otherwise, /proc/zoneinfo will corrupt:
      
      When the memory of node2 is offline:
      
        # cat /proc/zoneinfo
        ......
        Node 2, zone   Movable
        ......
              spanned  8388608
              present  8388608
              managed  0
      
      When we online memory on node2:
      
        # cat /proc/zoneinfo
        ......
        Node 2, zone   Movable
        ......
              spanned  8388608
              present  16777216
              managed  8388608
      Signed-off-by: default avatarTang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>	[3.16+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0bd85420
    • Tang Chen's avatar
      mem-hotplug: reset node managed pages when hot-adding a new pgdat · f784a3f1
      Tang Chen authored
      In free_area_init_core(), zone->managed_pages is set to an approximate
      value for lowmem, and will be adjusted when the bootmem allocator frees
      pages into the buddy system.
      
      But free_area_init_core() is also called by hotadd_new_pgdat() when
      hot-adding memory.  As a result, zone->managed_pages of the newly added
      node's pgdat is set to an approximate value in the very beginning.
      
      Even if the memory on that node has node been onlined,
      /sys/device/system/node/nodeXXX/meminfo has wrong value:
      
        hot-add node2 (memory not onlined)
        cat /sys/device/system/node/node2/meminfo
        Node 2 MemTotal:       33554432 kB
        Node 2 MemFree:               0 kB
        Node 2 MemUsed:        33554432 kB
        Node 2 Active:                0 kB
      
      This patch fixes this problem by reset node managed pages to 0 after
      hot-adding a new node.
      
      1. Move reset_managed_pages_done from reset_node_managed_pages() to
         reset_all_zones_managed_pages()
      2. Make reset_node_managed_pages() non-static
      3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
         is initialized
      Signed-off-by: default avatarTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>	[3.16+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f784a3f1
    • Joonsoo Kim's avatar
      mm/debug-pagealloc: correct freepage accounting and order resetting · 57cbc87e
      Joonsoo Kim authored
      One thing I did in this patch is fixing freepage accounting.  If we
      clear guard page and link it onto isolate buddy list, we should not
      increase freepage count.  This patch adds conditional branch to skip
      counting in this case.  Without this patch, this overcounting happens
      frequently if guard order is set and CMA is used.
      
      Another thing fixed in this patch is the target to reset order.  In
      __free_one_page(), we check the buddy page whether it is a guard page or
      not.  And, if so, we should clear guard attribute on the buddy page and
      reset order of it to 0.  But, current code resets original page's order
      rather than buddy one's.  Maybe, this doesn't have any problem, because
      whole merged page's order will be re-assigned soon.  But, it is better
      to correct code.
      Signed-off-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Gioh Kim <gioh.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      57cbc87e
    • Jan Kara's avatar
      fanotify: fix notification of groups with inode & mount marks · 8edc6e16
      Jan Kara authored
      fsnotify() needs to merge inode and mount marks lists when notifying
      groups about events so that ignore masks from inode marks are reflected
      in mount mark notifications and groups are notified in proper order
      (according to priorities).
      
      Currently the sorting of the lists done by fsnotify_add_inode_mark() /
      fsnotify_add_vfsmount_mark() and fsnotify() differed which resulted
      ignore masks not being used in some cases.
      
      Fix the problem by always using the same comparison function when
      sorting / merging the mark lists.
      
      Thanks to Heinrich Schuchardt for improvements of my patch.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=87721Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Reported-by: default avatarHeinrich Schuchardt <xypron.glpk@gmx.de>
      Tested-by: default avatarHeinrich Schuchardt <xypron.glpk@gmx.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8edc6e16
    • Vlastimil Babka's avatar
      mm, compaction: prevent infinite loop in compact_zone · 1d5bfe1f
      Vlastimil Babka authored
      Several people have reported occasionally seeing processes stuck in
      compact_zone(), even triggering soft lockups, in 3.18-rc2+.
      
      Testing a revert of commit e14c720e ("mm, compaction: remember
      position within pageblock in free pages scanner") fixed the issue,
      although the stuck processes do not appear to involve the free scanner.
      
      Finally, by code inspection, the bug was found in isolate_migratepages()
      which uses a slightly different condition to detect if the migration and
      free scanners have met, than compact_finished().  That has not been a
      problem until commit e14c720e allowed the free scanner position
      between individual invocations to be in the middle of a pageblock.
      
      In a relatively rare case, the migration scanner position can end up at
      the beginning of a pageblock, with the free scanner position in the
      middle of the same pageblock.  If it's the migration scanner's turn,
      isolate_migratepages() exits immediately (without updating the
      position), while compact_finished() decides to continue compaction,
      resulting in a potentially infinite loop.  The system can recover only
      if another process creates enough high-order pages to make the watermark
      checks in compact_finished() pass.
      
      This patch fixes the immediate problem by bumping the migration
      scanner's position to meet the free scanner in isolate_migratepages(),
      when both are within the same pageblock.  This causes compact_finished()
      to terminate properly.  A more robust check in compact_finished() is
      planned as a cleanup for better future maintainability.
      
      Fixes: e14c720e ("mm, compaction: remember position within pageblock in free pages scanner)
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Reported-by: default avatarP. Christeas <xrg@linux.gr>
      Tested-by: default avatarP. Christeas <xrg@linux.gr>
      Link: http://marc.info/?l=linux-mm&m=141508604232522&w=2Reported-by: default avatarNorbert Preining <preining@logic.at>
      Tested-by: default avatarNorbert Preining <preining@logic.at>
      Link: https://lkml.org/lkml/2014/11/4/904Reported-by: default avatarPavel Machek <pavel@ucw.cz>
      Link: https://lkml.org/lkml/2014/11/7/164
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1d5bfe1f
    • Michal Nazarewicz's avatar
      mm: alloc_contig_range: demote pages busy message from warn to info · dae803e1
      Michal Nazarewicz authored
      Having test_pages_isolated failure message as a warning confuses users
      into thinking that it is more serious than it really is.  In reality, if
      called via CMA, allocation will be retried so a single
      test_pages_isolated failure does not prevent allocation from succeeding.
      
      Demote the warning message to an info message and reformat it such that
      the text "failed" does not appear and instead a less worrying "PFNS
      busy" is used.
      
      This message is trivially reproducible on a 10GB x86 machine on 3.16.y
      kernels configured with CONFIG_DMA_CMA.
      Signed-off-by: default avatarMichal Nazarewicz <mina86@mina86.com>
      Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dae803e1
    • Joonsoo Kim's avatar
      mm/slab: fix unalignment problem on Malta with EVA due to slab merge · 95069ac8
      Joonsoo Kim authored
      Unlike SLUB, sometimes, object isn't started at the beginning of the
      slab in SLAB.  This causes the unalignment problem after slab merging is
      supported by commit 12220dea ("mm/slab: support slab merge").
      
      Following is the report from Markos that fail to boot on Malta with EVA.
      
          Calibrating delay loop... 19.86 BogoMIPS (lpj=99328)
          pid_max: default: 32768 minimum: 301
          Mount-cache hash table entries: 4096 (order: 0, 16384 bytes)
          Mountpoint-cache hash table entries: 4096 (order: 0, 16384 bytes)
          Kernel bug detected[#1]:
          CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-05639-g12220dea #1631
          task: 1f04f5d8 ti: 1f050000 task.ti: 1f050000
          epc   : 80141190 alloc_unbound_pwq+0x234/0x304
              Not tainted
          ra    : 80141184 alloc_unbound_pwq+0x228/0x304
          Process swapper/0 (pid: 1, threadinfo=1f050000, task=1f04f5d8, tls=00000000)
          Call Trace:
            alloc_unbound_pwq+0x234/0x304
            apply_workqueue_attrs+0x11c/0x294
            __alloc_workqueue_key+0x23c/0x470
            init_workqueues+0x320/0x400
            do_one_initcall+0xe8/0x23c
            kernel_init_freeable+0x9c/0x224
            kernel_init+0x10/0x100
            ret_from_kernel_thread+0x14/0x1c
          [ end trace cb88537fdc8fa200 ]
          Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      
      alloc_unbound_pwq() allocates slab object from pool_workqueue.  This
      kmem_cache requires 256 bytes alignment, but, current merging code
      doesn't honor that, and merge it with kmalloc-256.  kmalloc-256 requires
      only cacheline size alignment so that above failure occurs.  However, in
      x86, kmalloc-256 is luckily aligned in 256 bytes, so the problem didn't
      happen on it.
      
      To fix this problem, this patch introduces alignment mismatch check in
      find_mergeable().  This will fix the problem.
      Signed-off-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Reported-by: default avatarMarkos Chandras <Markos.Chandras@imgtec.com>
      Tested-by: default avatarMarkos Chandras <Markos.Chandras@imgtec.com>
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      95069ac8
    • Joonsoo Kim's avatar
      mm/page_alloc: restrict max order of merging on isolated pageblock · 3c605096
      Joonsoo Kim authored
      Current pageblock isolation logic could isolate each pageblock
      individually.  This causes freepage accounting problem if freepage with
      pageblock order on isolate pageblock is merged with other freepage on
      normal pageblock.  We can prevent merging by restricting max order of
      merging to pageblock order if freepage is on isolate pageblock.
      
      A side-effect of this change is that there could be non-merged buddy
      freepage even if finishing pageblock isolation, because undoing
      pageblock isolation is just to move freepage from isolate buddy list to
      normal buddy list rather than to consider merging.  So, the patch also
      makes undoing pageblock isolation consider freepage merge.  When
      un-isolation, freepage with more than pageblock order and it's buddy are
      checked.  If they are on normal pageblock, instead of just moving, we
      isolate the freepage and free it in order to get merged.
      Signed-off-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Heesub Shin <heesub.shin@samsung.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Ritesh Harjani <ritesh.list@gmail.com>
      Cc: Gioh Kim <gioh.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3c605096
    • Joonsoo Kim's avatar
      mm/page_alloc: move freepage counting logic to __free_one_page() · 8f82b55d
      Joonsoo Kim authored
      All the caller of __free_one_page() has similar freepage counting logic,
      so we can move it to __free_one_page().  This reduce line of code and
      help future maintenance.
      
      This is also preparation step for "mm/page_alloc: restrict max order of
      merging on isolated pageblock" which fix the freepage counting problem
      on freepage with more than pageblock order.
      Signed-off-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Heesub Shin <heesub.shin@samsung.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Ritesh Harjani <ritesh.list@gmail.com>
      Cc: Gioh Kim <gioh.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8f82b55d