1. 17 Feb, 2017 2 commits
  2. 16 Feb, 2017 10 commits
  3. 15 Feb, 2017 11 commits
    • Thomas Falcon's avatar
      ibmvnic: Fix endian errors in error reporting output · 75224c93
      Thomas Falcon authored
      Error reports received from firmware were not being converted from
      big endian values, leading to bogus error codes reported on little
      endian systems.
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      75224c93
    • Thomas Falcon's avatar
      ibmvnic: Fix endian error when requesting device capabilities · 28f4d165
      Thomas Falcon authored
      When a vNIC client driver requests a faulty device setting, the
      server returns an acceptable value for the client to request.
      This 64 bit value was incorrectly being swapped as a 32 bit value,
      resulting in loss of data. This patch corrects that by using
      the 64 bit swap function.
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      28f4d165
    • Marcus Huewe's avatar
      net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification · 7627ae60
      Marcus Huewe authored
      When setting a neigh related sysctl parameter, we always send a
      NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when
      executing
      
      	sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000
      
      a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated.
      
      This is caused by commit 2a4501ae ("neigh: Send a
      notification when DELAY_PROBE_TIME changes"). According to the
      commit's description, it was intended to generate such an event
      when setting the "delay_first_probe_time" sysctl parameter.
      
      In order to fix this, only generate this event when actually
      setting the "delay_first_probe_time" sysctl parameter. This fix
      should not have any unintended side-effects, because all but one
      registered netevent callbacks check for other netevent event
      types (the registered callbacks were obtained by grepping for
      "register_netevent_notifier"). The only callback that uses the
      NETEVENT_DELAY_PROBE_TIME_UPDATE event is
      mlxsw_sp_router_netevent_event() (in
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case
      of this event, it only accesses the DELAY_PROBE_TIME of the
      passed neigh_parms.
      
      Fixes: 2a4501ae ("neigh: Send a notification when DELAY_PROBE_TIME changes")
      Signed-off-by: default avatarMarcus Huewe <suse-tux@gmx.de>
      Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7627ae60
    • Anssi Hannula's avatar
      net: xilinx_emaclite: fix freezes due to unordered I/O · acf138f1
      Anssi Hannula authored
      The xilinx_emaclite uses __raw_writel and __raw_readl for register
      accesses. Those functions do not imply any kind of memory barriers and
      they may be reordered.
      
      The driver does not seem to take that into account, though, and the
      driver does not satisfy the ordering requirements of the hardware.
      For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read()
      which try to set MDIO address before initiating the transaction.
      
      I'm seeing system freezes with the driver with GCC 5.4 and current
      Linux kernels on Zynq-7000 SoC immediately when trying to use the
      interface.
      
      In commit 123c1407 ("net: emaclite: Do not use microblaze and ppc
      IO functions") the driver was switched from non-generic
      in_be32/out_be32 (memory barriers, big endian) to
      __raw_readl/__raw_writel (no memory barriers, native endian), so
      apparently the device follows system endianness and the driver was
      originally written with the assumption of memory barriers.
      
      Rather than try to hunt for each case of missing barrier, just switch
      the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending
      on endianness instead.
      
      Tested on little-endian Zynq-7000 ARM SoC FPGA.
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Fixes: 123c1407 ("net: emaclite: Do not use microblaze and ppc IO
      functions")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      acf138f1
    • Anssi Hannula's avatar
      net: xilinx_emaclite: fix receive buffer overflow · cd224553
      Anssi Hannula authored
      xilinx_emaclite looks at the received data to try to determine the
      Ethernet packet length but does not properly clamp it if
      proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer
      overflow and a panic via skb_panic() as the length exceeds the allocated
      skb size.
      
      Fix those cases.
      
      Also add an additional unconditional check with WARN_ON() at the end.
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Fixes: bb81b2dd ("net: add Xilinx emac lite device driver")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cd224553
    • Yinghai Lu's avatar
      PCI/PME: Restore pcie_pme_driver.remove · afe3e4d1
      Yinghai Lu authored
      In addition to making PME non-modular, d7def204 ("PCI/PME: Make
      explicitly non-modular") removed the pcie_pme_driver .remove() method,
      pcie_pme_remove().
      
      pcie_pme_remove() freed the PME IRQ that was requested in pci_pme_probe().
      The fact that we don't free the IRQ after d7def204 causes the following
      crash when removing a PCIe port device via /sys:
      
        ------------[ cut here ]------------
        kernel BUG at drivers/pci/msi.c:370!
        invalid opcode: 0000 [#1] SMP
        Modules linked in:
        CPU: 1 PID: 14509 Comm: sh Tainted: G    W  4.8.0-rc1-yh-00012-gd29438d6
        RIP: 0010:[<ffffffff9758bbf5>]  free_msi_irqs+0x65/0x190
        ...
        Call Trace:
         [<ffffffff9758cda4>] pci_disable_msi+0x34/0x40
         [<ffffffff97583817>] cleanup_service_irqs+0x27/0x30
         [<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40
         [<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50
         [<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0
         [<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150
         [<ffffffff9785eca5>] device_release_driver+0x25/0x40
         [<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0
         [<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
         [<ffffffff97578810>] remove_store+0x50/0x70
         [<ffffffff9785a378>] dev_attr_store+0x18/0x30
         [<ffffffff97260b64>] sysfs_kf_write+0x44/0x60
         [<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190
         [<ffffffff971e13f8>] __vfs_write+0x28/0x110
         [<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80
         [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
         [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
         [<ffffffff971e1f04>] vfs_write+0xc4/0x180
         [<ffffffff971e3089>] SyS_write+0x49/0xa0
         [<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0
         [<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25
        ...
         RIP  [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190
         RSP <ffff89ad3085bc48>
        ---[ end trace f4505e1dac5b95d3 ]---
        Segmentation fault
      
      Restore pcie_pme_remove().
      
      [bhelgaas: changelog]
      Fixes: d7def204 ("PCI/PME: Make explicitly non-modular")
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      CC: stable@vger.kernel.org	# v4.9+
      afe3e4d1
    • Pierre-Louis Bossart's avatar
      drm/dp/mst: fix kernel oops when turning off secondary monitor · bb08c04d
      Pierre-Louis Bossart authored
      100% reproducible issue found on SKL SkullCanyon NUC with two external
      DP daisy-chained monitors in DP/MST mode. When turning off or changing
      the input of the second monitor the machine stops with a kernel
      oops. This issue happened with 4.8.8 as well as drm/drm-intel-nightly.
      
      This issue is traced to an inconsistent control flow in
      drm_dp_update_payload_part1(): the 'port' pointer is set to NULL at the
      same time as 'req_payload.num_slots' is set to zero, but the pointer is
      dereferenced even when req_payload.num_slot is zero.
      
      The problematic dereference was introduced in commit dfda0df3
      ("drm/mst: rework payload table allocation to conform better") and may
      impact all versions since v3.18
      
      The fix suggested by Chris Wilson removes the kernel oops and was found to
      work well after 10mn of monkey-testing with the second monitor power and
      input buttons
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98990
      Fixes: dfda0df3 ("drm/mst: rework payload table allocation to conform better.")
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Nathan D Ciobanu <nathan.d.ciobanu@linux.intel.com>
      Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Cc: Sean Paul <seanpaul@chromium.org>
      Cc: <stable@vger.kernel.org> # v3.18+
      Tested-by: default avatarNathan D Ciobanu <nathan.d.ciobanu@linux.intel.com>
      Reviewed-by: default avatarDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Signed-off-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1487076561-2169-1-git-send-email-jani.nikula@intel.com
      bb08c04d
    • Sahitya Tummala's avatar
      fuse: fix use after free issue in fuse_dev_do_read() · 6ba4d272
      Sahitya Tummala authored
      There is a potential race between fuse_dev_do_write()
      and request_wait_answer() contexts as shown below:
      
      TASK 1:
      __fuse_request_send():
        |--spin_lock(&fiq->waitq.lock);
        |--queue_request();
        |--spin_unlock(&fiq->waitq.lock);
        |--request_wait_answer():
             |--if (test_bit(FR_SENT, &req->flags))
             <gets pre-empted after it is validated true>
                                         TASK 2:
                                         fuse_dev_do_write():
                                           |--clears bit FR_SENT,
                                           |--request_end():
                                              |--sets bit FR_FINISHED
                                              |--spin_lock(&fiq->waitq.lock);
                                              |--list_del_init(&req->intr_entry);
                                              |--spin_unlock(&fiq->waitq.lock);
                                              |--fuse_put_request();
             |--queue_interrupt();
             <request gets queued to interrupts list>
                  |--wake_up_locked(&fiq->waitq);
             |--wait_event_freezable();
             <as FR_FINISHED is set, it returns and then
             the caller frees this request>
      
      Now, the next fuse_dev_do_read(), see interrupts list is not empty
      and then calls fuse_read_interrupt() which tries to access the request
      which is already free'd and gets the below crash:
      
      [11432.401266] Unable to handle kernel paging request at virtual address
      6b6b6b6b6b6b6b6b
      ...
      [11432.418518] Kernel BUG at ffffff80083720e0
      [11432.456168] PC is at __list_del_entry+0x6c/0xc4
      [11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474
      ...
      [11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4
      [11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474
      [11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78
      [11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8
      [11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108
      [11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94
      
      As FR_FINISHED bit is set before deleting the intr_entry with input
      queue lock in request completion path, do the testing of this flag and
      queueing atomically with the same lock in queue_interrupt().
      Signed-off-by: default avatarSahitya Tummala <stummala@codeaurora.org>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: fd22d62e ("fuse: no fc->lock for iqueue parts")
      Cc: <stable@vger.kernel.org> # 4.2+
      6ba4d272
    • Stephen Rothwell's avatar
    • Eric Dumazet's avatar
      tcp: tcp_probe: use spin_lock_bh() · e70ac171
      Eric Dumazet authored
      tcp_rcv_established() can now run in process context.
      
      We need to disable BH while acquiring tcp probe spinlock,
      or risk a deadlock.
      
      Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarRicardo Nabinger Sanchez <rnsanchez@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e70ac171
    • Dmitry V. Levin's avatar
      uapi: fix linux/if_pppol2tp.h userspace compilation errors · a725eb15
      Dmitry V. Levin authored
      Because of <linux/libc-compat.h> interface limitations, <netinet/in.h>
      provided by libc cannot be included after <linux/in.h>, therefore any
      header that includes <netinet/in.h> cannot be included after <linux/in.h>.
      
      Change uapi/linux/l2tp.h, the last uapi header that includes
      <netinet/in.h>, to include <linux/in.h> and <linux/in6.h> instead of
      <netinet/in.h> and use __SOCK_SIZE__ instead of sizeof(struct sockaddr)
      the same way as uapi/linux/in.h does, to fix linux/if_pppol2tp.h userspace
      compilation errors like this:
      
      In file included from /usr/include/linux/l2tp.h:12:0,
                       from /usr/include/linux/if_pppol2tp.h:21,
      /usr/include/netinet/in.h:31:8: error: redefinition of 'struct in_addr'
      
      Fixes: 47c3e778 ("net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*")
      Signed-off-by: default avatarDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a725eb15
  4. 14 Feb, 2017 17 commits
    • Mauro Carvalho Chehab's avatar
      [media] siano: make it work again with CONFIG_VMAP_STACK · f9c85ee6
      Mauro Carvalho Chehab authored
      Reported as a Kaffeine bug:
      	https://bugs.kde.org/show_bug.cgi?id=375811
      
      The USB control messages require DMA to work. We cannot pass
      a stack-allocated buffer, as it is not warranted that the
      stack would be into a DMA enabled area.
      
      On Kernel 4.9, the default is to not accept DMA on stack anymore
      on x86 architecture. On other architectures, this has been a
      requirement since Kernel 2.2. So, after this patch, this driver
      should likely work fine on all archs.
      
      Tested with USB ID 2040:5510: Hauppauge Windham
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      f9c85ee6
    • Eric Dumazet's avatar
      packet: fix races in fanout_add() · d199fab6
      Eric Dumazet authored
      Multiple threads can call fanout_add() at the same time.
      
      We need to grab fanout_mutex earlier to avoid races that could
      lead to one thread freeing po->rollover that was set by another thread.
      
      Do the same in fanout_release(), for peace of mind, and to help us
      finding lockdep issues earlier.
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Fixes: 0648ab70 ("packet: rollover prepare: per-socket state")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d199fab6
    • Thomas Falcon's avatar
      ibmvnic: Fix initial MTU settings · f39f0d1e
      Thomas Falcon authored
      In the current driver, the MTU is set to the maximum value
      capable for the backing device. This decision turned out to
      be a mistake as it led to confusion among users. The expected
      initial MTU value used for other IBM vNIC capable operating
      systems is 1500, with the maximum value (9000) reserved for
      when Jumbo frames are enabled. This patch sets the MTU to
      the default value for a net device.
      
      It also corrects a discrepancy between MTU values received from
      firmware, which includes the ethernet header length, and net
      device MTU values.
      
      Finally, it removes redundant min/max MTU assignments after device
      initialization.
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f39f0d1e
    • Ivan Khoronzhuk's avatar
      net: ethernet: ti: cpsw: fix cpsw assignment in resume · a60ced99
      Ivan Khoronzhuk authored
      There is a copy-paste error, which hides breaking of resume
      for CPSW driver: there was replaced netdev_priv() to ndev_to_cpsw(ndev)
      in suspend, but left it unchanged in resume.
      
      Fixes: 606f3993
      (ti: cpsw: move platform data and slaves info to cpsw_common)
      Reported-by: default avatarAlexey Starikovskiy <AStarikovskiy@topcon.com>
      Signed-off-by: default avatarIvan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a60ced99
    • WANG Cong's avatar
      kcm: fix a null pointer dereference in kcm_sendmsg() · cd27b96b
      WANG Cong authored
      In commit 98e3862c ("kcm: fix 0-length case for kcm_sendmsg()")
      I tried to avoid skb allocation for 0-length case, but missed
      a check for NULL pointer in the non EOR case.
      
      Fixes: 98e3862c ("kcm: fix 0-length case for kcm_sendmsg()")
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarTom Herbert <tom@herbertland.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cd27b96b
    • Rui Sousa's avatar
      net: fec: fix multicast filtering hardware setup · 01f8902b
      Rui Sousa authored
      Fix hardware setup of multicast address hash:
      - Never clear the hardware hash (to avoid packet loss)
      - Construct the hash register values in software and then write once
      to hardware
      Signed-off-by: default avatarRui Sousa <rui.sousa@nxp.com>
      Signed-off-by: default avatarFugang Duan <fugang.duan@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      01f8902b
    • David S. Miller's avatar
      Merge branch 'ipv6-v4mapped' · 144adc65
      David S. Miller authored
      Jonathan T. Leighton says:
      
      ====================
      IPv4-mapped on wire, :: dst address issue
      
      Under some circumstances IPv6 datagrams are sent with IPv4-mapped IPv6
      addresses as the source. Given an IPv6 socket bound to an IPv4-mapped
      IPv6 address, and an IPv6 destination address, both TCP and UDP will
      will send packets using the IPv4-mapped IPv6 address as the source. Per
      RFC 6890 (Table 20), IPv4-mapped IPv6 source addresses are not allowed
      in an IP datagram. The problem can be observed by attempting to
      connect() either a TCP or UDP socket, or by using sendmsg() with a UDP
      socket. The patch is intended to correct this issue for all socket
      types.
      
      linux follows the BSD convention that an IPv6 destination address
      specified as in6addr_any is converted to the loopback address.
      Currently, neither TCP nor UDP consider the possibility that the source
      address is an IPv4-mapped IPv6 address, and assume that the appropriate
      loopback address is ::1. The patch adds a check on whether or not the
      source address is an IPv4-mapped IPv6 address and then sets the
      destination address to either ::ffff:127.0.0.1 or ::1, as appropriate.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      144adc65
    • Jonathan T. Leighton's avatar
      ipv6: Handle IPv4-mapped src to in6addr_any dst. · 052d2369
      Jonathan T. Leighton authored
      This patch adds a check on the type of the source address for the case
      where the destination address is in6addr_any. If the source is an
      IPv4-mapped IPv6 source address, the destination is changed to
      ::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
      is done in three locations to handle UDP calls to either connect() or
      sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
      handling an in6addr_any destination until very late, so the patch only
      needs to handle the case where the source is an IPv4-mapped IPv6
      address.
      Signed-off-by: default avatarJonathan T. Leighton <jtleight@udel.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      052d2369
    • Jonathan T. Leighton's avatar
      ipv6: Inhibit IPv4-mapped src address on the wire. · ec5e3b0a
      Jonathan T. Leighton authored
      This patch adds a check for the problematic case of an IPv4-mapped IPv6
      source address and a destination address that is neither an IPv4-mapped
      IPv6 address nor in6addr_any, and returns an appropriate error. The
      check in done before returning from looking up the route.
      Signed-off-by: default avatarJonathan T. Leighton <jtleight@udel.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ec5e3b0a
    • Or Gerlitz's avatar
      net/mlx5e: Disable preemption when doing TC statistics upcall · fed06ee8
      Or Gerlitz authored
      When called by HW offloading drivers, the TC action (e.g
      net/sched/act_mirred.c) code uses this_cpu logic, e.g
      
       _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)
      
      per the kernel documention, preemption should be disabled, add that.
      
      Before the fix, when running with CONFIG_PREEMPT set, we get a
      
      BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793
      
      asserion from the TC action (mirred) stats_update callback.
      
      Fixes: aad7e08d ('net/mlx5e: Hardware offloaded flower filter statistics support')
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fed06ee8
    • Linus Torvalds's avatar
      Merge tag 'media/v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 747ae0a9
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "A colorspace regression fix in V4L2 core and a CEC core bug that makes
        it discard valid messages"
      
      * tag 'media/v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] cec: initiator should be the same as the destination for, poll
        [media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGB
      747ae0a9
    • David S. Miller's avatar
      Merge branch 'rhashtable-allocation-failure-during-insertion' · 0c8ef291
      David S. Miller authored
      Herbert Xu says:
      
      ====================
      rhashtable: Handle table allocation failure during insertion
      
      v2 -
      
      Added Ack to patch 2.
      Fixed RCU annotation in code path executed by rehasher by using
      rht_dereference_bucket.
      
      v1 -
      
      This series tackles the problem of table allocation failures during
      insertion.  The issue is that we cannot vmalloc during insertion.
      This series deals with this by introducing nested tables.
      
      The first two patches removes manual hash table walks which cannot
      work on a nested table.
      
      The final patch introduces nested tables.
      
      I've tested this with test_rhashtable and it appears to work.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0c8ef291
    • Herbert Xu's avatar
      rhashtable: Add nested tables · 40137906
      Herbert Xu authored
      This patch adds code that handles GFP_ATOMIC kmalloc failure on
      insertion.  As we cannot use vmalloc, we solve it by making our
      hash table nested.  That is, we allocate single pages at each level
      and reach our desired table size by nesting them.
      
      When a nested table is created, only a single page is allocated
      at the top-level.  Lower levels are allocated on demand during
      insertion.  Therefore for each insertion to succeed, only two
      (non-consecutive) pages are needed.
      
      After a nested table is created, a rehash will be scheduled in
      order to switch to a vmalloced table as soon as possible.  Also,
      the rehash code will never rehash into a nested table.  If we
      detect a nested table during a rehash, the rehash will be aborted
      and a new rehash will be scheduled.
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      40137906
    • Herbert Xu's avatar
      tipc: Fix tipc_sk_reinit race conditions · 9dbbfb0a
      Herbert Xu authored
      There are two problems with the function tipc_sk_reinit.  Firstly
      it's doing a manual walk over an rhashtable.  This is broken as
      an rhashtable can be resized and if you manually walk over it
      during a resize then you may miss entries.
      
      Secondly it's missing memory barriers as previously the code used
      spinlocks which provide the barriers implicitly.
      
      This patch fixes both problems.
      
      Fixes: 07f6c4bc ("tipc: convert tipc reference table to...")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9dbbfb0a
    • Herbert Xu's avatar
      gfs2: Use rhashtable walk interface in glock_hash_walk · 6a254780
      Herbert Xu authored
      The function glock_hash_walk walks the rhashtable by hand.  This
      is broken because if it catches the hash table in the middle of
      a rehash, then it will miss entries.
      
      This patch replaces the manual walk by using the rhashtable walk
      interface.
      
      Fixes: 88ffbf3e ("GFS2: Use resizable hash table for glocks")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6a254780
    • Ralf Baechle's avatar
      NET: Fix /proc/net/arp for AX.25 · 4872e57c
      Ralf Baechle authored
      When sending ARP requests over AX.25 links the hwaddress in the neighbour
      cache are not getting initialized.  For such an incomplete arp entry
      ax2asc2 will generate an empty string resulting in /proc/net/arp output
      like the following:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      172.20.1.99      0x3         0x0              *        bpq0
      
      The missing field will confuse the procfs parsing of arp(8) resulting in
      incorrect output for the device such as the following:
      
      $ arp
      Address                  HWtype  HWaddress           Flags Mask            Iface
      gateway                  ether   52:54:00:00:5d:5f   C                     ens3
      172.20.1.99                      (incomplete)                              ens3
      
      This changes the content of /proc/net/arp to:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      172.20.1.99      0x3         0x0         *                     *        bpq0
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      
      To do so it change ax2asc to put the string "*" in buf for a NULL address
      argument.  Finally the HW address field is left aligned in a 17 character
      field (the length of an ethernet HW address in the usual hex notation) for
      readability.
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4872e57c
    • Mart van Santen's avatar
      xen-netback: vif counters from int/long to u64 · ebf692f8
      Mart van Santen authored
      This patch fixes an issue where the type of counters in the queue(s)
      and interface are not in sync (queue counters are int, interface
      counters are long), causing incorrect reporting of tx/rx values
      of the vif interface and unclear counter overflows.
      This patch sets both counters to the u64 type.
      Signed-off-by: default avatarMart van Santen <mart@greenhost.nl>
      Reviewed-by: default avatarPaul Durrant <paul.durrant@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ebf692f8