1. 08 Apr, 2019 33 commits
    • net: hns3: set up the vport alive state while reinitializing · cd513a69
      Huazhong Tan authored
      When reinitializing, the vport alive state needs to be set up.
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: set vport alive state to default while resetting · 0f14c5b1
      Huazhong Tan authored
      When resetting, the vport alive state should be set to default.
      Otherwise the alive state of a vport whose driver is not running
      stays wrong until the timer checks it.

      Fixes: a6d818e3 ("net: hns3: Add vport alive state checking support")
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipv4-Enable-support-for-IPv6-gateway-with-IPv4-routes' · 0ed8c3dc
      David S. Miller authored
      David Ahern says:
      
      ====================
      ipv4: Enable support for IPv6 gateway with IPv4 routes
      
      Last set of three with the end goal of enabling IPv6 gateways with IPv4
      routes.
      
      This set adds fib6_nh_init and release to the IPv6 stubs, and adds neighbor
      helpers that IPv4 code invokes to resolve an IPv6 address. When using
      an IPv6 neighbor entry the hh_cache is bypassed as it contains the wrong
      ethernet header for an IPv4 packet.
      
      The nh_common nhc_has_gw was a temporary field used to convert existing
      code from fib{6}_nh to fib_nh_common. That field is now converted to
      nhc_gw_family to differentiate the address family of the gateway entry
      as opposed to the address family of the container of fib_nh_common.
      
      Existing code for rtable and fib_config is refactored to prepare
      for a v6 address and then support is added. From there various
      miscellaneous functions are updated to handle a v6 gateway - from
      validating the v6 address to lookups in bpf code to verifying the
      nexthop state.
      
      Offload drivers - mlxsw and rocker - are modified to detect the v6
      gateway and reject the route as 'unsupported'. e.g.,
      
          $ ip ro add 172.16.101.0/24 via inet6 fe80::202:ff:fe00:b dev swp1s0
          Error: mlxsw_spectrum: IPv6 gateway with IPv4 route is not supported.
      
      This can be removed in time once support is added to each.
      
      With the infrastructure changes in place, patch 17 enables it by adding
      support for RTA_VIA to IPv4. RTA_VIA can be used for IPv4 addresses as
      well. Only one of RTA_VIA and RTA_GATEWAY can be passed in a request.
      
      Patch 18 adds a few test cases to fib_tests.sh.
      
      v2
      - comments from Ido - fixed typos as noted and updated messages
      - add commit message to patch 1
      - In patch 9, ipv4: Add fib_check_nh_v6_gw, moved the call to
        fib6_nh_release under the 'if (!err)' check as the intention is
        that release should not be called if init fails.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: fib_tests: Add tests for ipv6 gateway with ipv4 route · 228ddb33
      David Ahern authored
      Add tests for ipv6 gateway with ipv4 route. Tests include basic
      single path with ping to verify connectivity and multipath.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Allow ipv6 gateway with ipv4 routes · d1566268
      David Ahern authored
      Add support for RTA_VIA and allow an IPv6 nexthop for v4 routes:
         $ ip ro add 172.16.1.0/24 via inet6 2001:db8::1 dev eth0
         $ ip ro ls
         ...
         172.16.1.0/24 via inet6 2001:db8::1 dev eth0
      
      For convenience and simplicity, userspace can use RTA_VIA to specify
      AF_INET or AF_INET6 gateway.
      
      The common fib_nexthop_info dump function compares the gateway address
      family to the nh_common family to know if the gateway should be encoded
      as RTA_VIA or RTA_GATEWAY.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
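The dump rule described in d1566268 (compare the gateway family with the nexthop's own family) can be illustrated with a small userspace sketch. This is not the kernel's fib_nexthop_info code; the enum and the function name `gw_attr_for` are hypothetical and only model the decision.

```c
#include <assert.h>
#include <sys/socket.h>   /* AF_UNSPEC, AF_INET, AF_INET6 */

/* Hypothetical stand-ins for "no gateway attribute", RTA_GATEWAY, RTA_VIA. */
enum gw_attr_sketch { ATTR_NONE, ATTR_GATEWAY, ATTR_VIA };

/*
 * Pick the attribute used to encode the gateway when dumping a nexthop.
 * nh_family is the family of the nexthop's container (AF_INET for a v4
 * route), gw_family is the family of the gateway address itself.
 */
static enum gw_attr_sketch gw_attr_for(int nh_family, int gw_family)
{
    assert(nh_family == AF_INET || nh_family == AF_INET6);

    if (gw_family == AF_UNSPEC)
        return ATTR_NONE;        /* device-only nexthop, nothing to encode */
    if (gw_family == nh_family)
        return ATTR_GATEWAY;     /* same family: classic gateway attribute */
    return ATTR_VIA;             /* mixed family: needs family + address */
}
```

For the route in the commit message (IPv4 prefix, `via inet6 2001:db8::1`), this sketch selects the VIA-style encoding, while a plain IPv4 gateway keeps the GATEWAY-style one.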
    • ipv4: Flag fib_info with a fib_nh using IPv6 gateway · 19a9d136
      David Ahern authored
      Until support is added to the offload drivers, they need to be able to
      reject routes with an IPv6 gateway. To that end add a flag to fib_info
      that indicates if any fib_nh has a v6 gateway. The flag allows the drivers
      to efficiently know the use of a v6 gateway without walking all fib_nh
      tied to a fib_info each time a route is added.
      
      Update mlxsw and rocker to reject the routes with extack message as to why.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Handle ipv6 gateway in fib_good_nh · 1a38c43d
      David Ahern authored
      Update fib_good_nh to handle an ipv6 gateway.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Handle ipv6 gateway in fib_detect_death · 619d1826
      David Ahern authored
      Update fib_detect_death to handle an ipv6 gateway.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Handle ipv6 gateway in ipv4_confirm_neigh · 6de9c055
      David Ahern authored
      Update ipv4_confirm_neigh to handle an ipv6 gateway.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Handle ipv6 gateway in bpf_ipv4_fib_lookup · 6f5f68d0
      David Ahern authored
      Update bpf_ipv4_fib_lookup to handle an ipv6 gateway.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Add helpers for neigh lookup for nexthop · 5c9f7c1d
      David Ahern authored
      A common theme in the output path is looking up a neigh entry for a
      nexthop, either the gateway in an rtable or a fallback to the daddr
      in the skb:
      
              nexthop = (__force u32)rt_nexthop(rt, ip_hdr(skb)->daddr);
              neigh = __ipv4_neigh_lookup_noref(dev, nexthop);
              if (unlikely(!neigh))
                      neigh = __neigh_create(&arp_tbl, &nexthop, dev, false);
      
      To allow the nexthop to be an IPv6 address we need to consider the
      family of the nexthop and then call __ipv{4,6}_neigh_lookup_noref based
      on it.
      
      To make this simpler, add a ip_neigh_gw4 helper similar to ip_neigh_gw6
      added in an earlier patch which handles:
      
              neigh = __ipv4_neigh_lookup_noref(dev, nexthop);
              if (unlikely(!neigh))
                      neigh = __neigh_create(&arp_tbl, &nexthop, dev, false);
      
      And then add a second one, ip_neigh_for_gw, that calls either
      ip_neigh_gw4 or ip_neigh_gw6 based on the address family of the gateway.
      
      Update the output paths in the VRF driver and core v4 code to use
      ip_neigh_for_gw simplifying the family based lookup and making both
      ready for a v6 nexthop.
      
      ipv4_neigh_lookup has a different need - the potential to resolve a
      passed in address in addition to any gateway in the rtable or skb. Since
      this is a one-off, add ip_neigh_gw4 and ip_neigh_gw6 directly. The
      difference between __neigh_create used by the helpers and neigh_create
      called by ipv4_neigh_lookup is taking a refcount, so add rcu_read_lock_bh
      and bump the refcnt on the neigh entry.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
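The family-based dispatch that 5c9f7c1d describes for ip_neigh_for_gw can be modeled outside the kernel. The sketch below does not perform any neighbor lookup; it only shows which table and which address would be chosen for a given gateway family. All type and function names here are hypothetical, and the "no gateway" fallback to the packet's daddr is taken from the commit message.

```c
#include <assert.h>
#include <sys/socket.h>   /* AF_UNSPEC, AF_INET, AF_INET6 */

/* Hypothetical labels: which neighbor table, and which address is resolved. */
enum table_sketch { TABLE_ARP, TABLE_NDISC };
enum key_sketch { KEY_GATEWAY, KEY_DADDR };

struct neigh_choice {
    enum table_sketch table;
    enum key_sketch key;
};

/* Model of the dispatch: pick the v4 or v6 helper from the gateway family. */
static struct neigh_choice neigh_choice_for_gw(int gw_family)
{
    /* Default: no gateway, resolve the packet's daddr through ARP. */
    struct neigh_choice c = { TABLE_ARP, KEY_DADDR };

    assert(gw_family == AF_UNSPEC || gw_family == AF_INET ||
           gw_family == AF_INET6);

    switch (gw_family) {
    case AF_INET:                /* role of ip_neigh_gw4 */
        c.key = KEY_GATEWAY;
        break;
    case AF_INET6:               /* role of ip_neigh_gw6 */
        c.table = TABLE_NDISC;
        c.key = KEY_GATEWAY;
        break;
    default:
        break;
    }
    return c;
}
```

Keeping the switch in one helper is what lets the VRF driver and the core v4 output path share the same logic instead of repeating the family check.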
    • neighbor: Add skip_cache argument to neigh_output · 0353f282
      David Ahern authored
      A later patch allows an IPv6 gateway with an IPv4 route. The neighbor
      entry will exist in the v6 ndisc table and the cached header will contain
      the ipv6 protocol which is wrong for an IPv4 packet. For an IPv4 packet to
      use the v6 neighbor entry, neigh_output needs to skip the cached header
      and just use the output callback for the neigh entry.
      
      A future patchset can look at expanding the hh_cache to handle 2
      protocols. For now, IPv6 gateways with an IPv4 route will take the
      extra overhead of generating the header.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
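The effect of the skip_cache argument in 0353f282 can be sketched as a simple path selection. The struct fields and names below are hypothetical simplifications (the conditions for using the cached header are an assumption, not a copy of neigh_output); the point is only that skip_cache forces the slower output callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a neighbor entry. */
struct neigh_sketch {
    bool connected;        /* entry is in a usable, resolved state */
    unsigned int hh_len;   /* length of the cached link-layer header */
};

enum out_path { PATH_CACHED_HEADER, PATH_OUTPUT_CALLBACK };

/*
 * Choose how to emit a packet. With skip_cache set (IPv4 packet using an
 * IPv6 neighbor entry) the cached header is never used, because it carries
 * the IPv6 protocol value.
 */
static enum out_path choose_output_path(const struct neigh_sketch *n,
                                        bool skip_cache)
{
    assert(n != NULL);

    if (!skip_cache && n->connected && n->hh_len > 0)
        return PATH_CACHED_HEADER;
    return PATH_OUTPUT_CALLBACK;   /* header is built for this packet */
}
```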
    • ipv4: Add fib_check_nh_v6_gw · 717a8f5b
      David Ahern authored
      Add helper to use fib6_nh_init to validate a nexthop spec with an IPv6
      gateway.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Refactor fib_check_nh · 448d7248
      David Ahern authored
      fib_check_nh is currently huge, covering multiple use cases - device only,
      device + gateway, and device + gateway with ONLINK. The next patch adds
      validation checks for IPv6 which only further complicates it. So, break
      fib_check_nh into 2 helpers - one for gateway validation and one for device
      only.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Add support to fib_config for IPv6 gateway · a4ea5d43
      David Ahern authored
      Add support for an IPv6 gateway to fib_config. Since a gateway is either
      IPv4 or IPv6, make it a union with fc_gw4 where fc_gw_family decides
      which address is in use. Update current checks on family and gw4 to
      handle ipv6 as well.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
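The "union plus family discriminator" layout used for fc_gw4/fc_gw_family (and rt_gw4/rt_gw_family in rtable) can be sketched in userspace C. The struct and helper names below are hypothetical and are not the kernel definitions; they only show how a family field of 0 doubles as "no gateway set".

```c
#include <assert.h>
#include <netinet/in.h>   /* struct in_addr, struct in6_addr */
#include <stddef.h>
#include <sys/socket.h>   /* AF_UNSPEC, AF_INET, AF_INET6 */

/* Hypothetical gateway holder: the family says which union member is valid. */
struct gw_sketch {
    unsigned char family;          /* AF_UNSPEC (0) means no gateway */
    union {
        struct in_addr gw4;
        struct in6_addr gw6;
    };
};

static int gw_is_set(const struct gw_sketch *g)
{
    assert(g != NULL);
    return g->family != AF_UNSPEC;
}

static void gw_set_v4(struct gw_sketch *g, const struct in_addr *a)
{
    assert(g != NULL && a != NULL);
    g->family = AF_INET;
    g->gw4 = *a;
}

static void gw_set_v6(struct gw_sketch *g, const struct in6_addr *a)
{
    assert(g != NULL && a != NULL);
    g->family = AF_INET6;
    g->gw6 = *a;
}

/* Number of address bytes that are meaningful for the current family. */
static size_t gw_addr_len(const struct gw_sketch *g)
{
    assert(g != NULL);
    switch (g->family) {
    case AF_INET:  return sizeof(g->gw4);
    case AF_INET6: return sizeof(g->gw6);
    default:       return 0;
    }
}
```

Because only one address can be present at a time, the union costs no more space than the larger member, and every "is a gateway set" check reduces to a test on the family field.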
    • ipv4: Add support to rtable for ipv6 gateway · 0f5f7d7b
      David Ahern authored
      Add support for an IPv6 gateway to rtable. Since a gateway is either
      IPv4 or IPv6, make it a union with rt_gw4 where rt_gw_family decides
      which address is in use.

      When dumping the route data, encode an ipv6 nexthop using RTA_VIA.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Prepare fib_config for IPv6 gateway · f35b794b
      David Ahern authored
      Similar to rtable, fib_config needs to allow the gateway to be either an
      IPv4 or an IPv6 address. To that end, rename fc_gw to fc_gw4 to mean an
      IPv4 address and add fc_gw_family. Checks on 'is a gateway set' are changed
      to see if fc_gw_family is set. In the process prepare the code for a
      fc_gw_family == AF_INET6.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Prepare rtable for IPv6 gateway · 1550c171
      David Ahern authored
      To allow the gateway to be either an IPv4 or IPv6 address, remove
      rt_uses_gateway from rtable and replace with rt_gw_family. If
      rt_gw_family is set it implies rt_uses_gateway. Rename rt_gateway
      to rt_gw4 to represent the IPv4 version.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Replace nhc_has_gw with nhc_gw_family · bdf00467
      David Ahern authored
      Allow the gateway in a fib_nh_common to be from a different address
      family than the outer fib{6}_nh. To that end, replace nhc_has_gw with
      nhc_gw_family and update users of nhc_has_gw to check nhc_gw_family.
      Now nhc_family is used to know if the nh_common is part of a fib_nh
      or fib6_nh (used for container_of to get to route family specific data),
      and nhc_gw_family represents the address family for the gateway.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add neighbor helpers that use the ipv6 stub · 71df5777
      David Ahern authored
      Add ipv6 helpers to handle ndisc references via the stub. Update
      bpf_ipv6_fib_lookup to use __ipv6_neigh_lookup_noref_stub instead of
      the open-coded ___neigh_lookup_noref with the stub.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add fib6_nh_init and release to stubs · 1aefd3de
      David Ahern authored
      Add fib6_nh_init and fib6_nh_release to ipv6_stubs. If fib6_nh_init fails,
      callers should not invoke fib6_nh_release, so there is no reason to have
      a dummy stub for the case where IPv6 is not enabled.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: improve link partner capability detection · 3b8b11f9
      Heiner Kallweit authored
      genphy_read_status() so far checks phydev->supported, not the actual
      PHY capabilities. This can make a difference if the supported speeds
      have been limited by of_set_phy_supported() or phy_set_max_speed().
      
      It seems that this issue only affects the link partner advertisements
      as displayed by ethtool. Also this patch wouldn't apply to older
      kernels because linkmode bitmaps have been introduced recently.
      Therefore net-next.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-updates-2019-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 8bb309e6
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2019-04-02
      
      This series provides misc updates to mlx5 driver
      
      1) Aya Levin (1): Handle event of power detection in the PCIE slot
      
      2) Eli Britstein (6):
        Some TC VLAN related updates and fixes to the previous VLAN modify action
        support patchset.
        Offload TC e-switch rules with egress/ingress VLAN devices
      
      3) Max Gurtovoy (1): Fix double mutex initialization in eswitch.c
      
      4) Tariq Toukan (3): Misc small updates
        A write memory barrier is sufficient in EQ ci update
        Obsolete param field holding a constant value
        Unify logic of MTU boundaries
      
      5) Tonghao Zhang (4): Misc updates to en_tc.c
        Make the log friendly when decapsulation offload not supported
        Remove 'parse_attr' argument in parse_tc_fdb_actions()
        Deletes unnecessary setting of esw_attr->parse_attr
        Return -EOPNOTSUPP when attempting to offload an unsupported action
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Don't return EAGAIN when TCAM is full. · ed514fc5
      Vishal Kulkarni authored
      During hash filter programming, the driver needs to return an ENOSPC
      error instead of EAGAIN when the TCAM is full.
      Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xilinx: emaclite: add minimal ndo_do_ioctl hook · fcf97825
      Alexandru Ardelean authored
      This hook only implements a minimal set of ioctl hooks to be able to access
      MII regs by using phytool.
      When using this simple MAC controller, it's pretty difficult to do
      debugging of the PHY chip without checking MII regs.
      Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xilinx: emaclite: add minimal ethtool ops · 9a80ba06
      Alexandru Ardelean authored
      This set adds a minimal set of ethtool hooks to the driver, which provide a
      decent amount of link information via ethtool.
      With this change, running `ethtool ethX` in user-space provides all the
      neatly-formatted information about the link (what was negotiated, what is
      advertised, etc).
      Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • datagram: remove redundant 'peeked' argument · fd69c399
      Paolo Abeni authored
      After commit a297569f ("net/udp: do not touch skb->peeked unless
      really needed") the 'peeked' argument of __skb_try_recv_datagram()
      and friends is always equal to !!(flags & MSG_PEEK).

      Since this argument is really boolean info, and the callers already
      have 'flags & MSG_PEEK' handy, we can remove it and clean up the
      code a bit.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: insert filter to ht before offloading it to hw · 1f17f774
      Vlad Buslov authored
      John reports:
      
      Recent refactoring of fl_change aims to use the classifier spinlock to
      avoid the need for rtnl lock. In doing so, the fl_hw_replace_filter()
      function was moved to before the lock is taken. This can create problems
      for drivers if duplicate filters are created (common in ovs tc offload
      due to filters being triggered by user-space matches).
      
      Drivers registered for such filters will now receive multiple copies of
      the same rule, each with a different cookie value. This means that the
      drivers would need to do a full match field lookup to determine
      duplicates, repeating work that will happen in flower __fl_lookup().
      Currently, drivers do not expect to receive duplicate filters.
      
      To fix this, verify that filter with same key is not present in flower
      classifier hash table and insert the new filter to the flower hash table
      before offloading it to hardware. Implement helper function
      fl_ht_insert_unique() to atomically verify/insert a filter.
      
      This change makes filter visible to fast path at the beginning of
      fl_change() function, which means it can no longer be freed directly in
      case of error. Refactor fl_change() error handling code to deallocate the
      filter with rcu timeout.
      
      Fixes: 620da486 ("net: sched: flower: refactor fl_change")
      Reported-by: John Hurley <john.hurley@netronome.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
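The ordering fix in 1f17f774 (insert into the hash table first, offload second) can be sketched with a toy chained hash table. This is a single-threaded illustration with hypothetical names; it does not model the atomicity of fl_ht_insert_unique() or the RCU-deferred free the commit describes, only the rule that a duplicate key is rejected before any driver callback runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NBUCKETS 8

/* Hypothetical filter and table types for the sketch. */
struct filter { unsigned int key; struct filter *next; };
struct table { struct filter *buckets[NBUCKETS]; };

typedef int (*offload_fn)(const struct filter *f);

/* Insert f unless a filter with the same key already exists. */
static int ht_insert_unique(struct table *t, struct filter *f)
{
    struct filter **head = &t->buckets[f->key % NBUCKETS];

    for (struct filter *p = *head; p != NULL; p = p->next)
        if (p->key == f->key)
            return -EEXIST;
    f->next = *head;
    *head = f;
    return 0;
}

/* Unlink f from its bucket if it is present. */
static void ht_remove(struct table *t, struct filter *f)
{
    struct filter **pp = &t->buckets[f->key % NBUCKETS];

    while (*pp != NULL && *pp != f)
        pp = &(*pp)->next;
    if (*pp != NULL)
        *pp = f->next;
}

/* Insert first, offload second: duplicates never reach the driver. */
static int add_filter(struct table *t, struct filter *f, offload_fn offload)
{
    int err;

    assert(t != NULL && f != NULL && offload != NULL);

    err = ht_insert_unique(t, f);
    if (err)
        return err;
    err = offload(f);
    if (err)
        ht_remove(t, f);   /* the real code defers the free via RCU */
    return err;
}

/* Stub driver callback that counts how often it is invoked. */
static int offload_calls;
static int count_offload(const struct filter *f)
{
    (void)f;
    offload_calls++;
    return 0;
}
```

Adding two filters with the same key through `add_filter()` invokes the stub driver callback once; the second attempt fails with `-EEXIST` before the callback is reached.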
    • Merge branch 'rhashtable-bitlocks' · 9186c90b
      David S. Miller authored
      NeilBrown says:
      
      ====================
      Convert rhashtable to use bitlocks
      
      This series converts rhashtable to use a per-bucket bitlock
      rather than a separate array of spinlocks.
      This:
        reduces memory usage
        results in slightly fewer memory accesses
        slightly improves parallelism
        makes a configuration option unnecessary
      
      The main change from the previous version is to use a distinct type for
      the pointer in the bucket which has a bit-lock in it.  This
      helped find two places where rht_ptr() was missed: one in
      rhashtable_free_and_destroy() and one in print_ht in the test code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: add lockdep tracking to bucket bit-spin-locks. · 149212f0
      NeilBrown authored
      Native bit_spin_locks are not tracked by lockdep.
      
      The bit_spin_locks used for rhashtable buckets are local
      to the rhashtable implementation, so there is little opportunity
      for the sort of misuse that lockdep might detect.
      However locks are held while a hash function or compare
      function is called, and if one of these took a lock,
      a misbehaviour is possible.
      
      As it is quite easy to add lockdep support this unlikely
      possibility seems to be enough justification.
      
      So create a lockdep class for bucket bit_spin_lock and attach
      through a lockdep_map in each bucket_table.
      
      Without the 'nested' annotation in rhashtable_rehash_one(), lockdep
      correctly reports a possible problem as this lock is taken
      while another bucket lock (in another table) is held.  This
      confirms that the added support works.
      With the correct nested annotation in place, lockdep reports
      no problems.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: use bit_spin_locks to protect hash bucket. · 8f0db018
      NeilBrown authored
      This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the
      bucket pointer to lock the hash chain for that bucket.
      
      The benefits of a bit spin_lock are:
       - no need to allocate a separate array of locks.
       - no need to have a configuration option to guide the
         choice of the size of this array
       - locking cost is often a single test-and-set in a cache line
         that will have to be loaded anyway.  When inserting at, or removing
         from, the head of the chain, the unlock is free - writing the new
         address in the bucket head implicitly clears the lock bit.
         For __rhashtable_insert_fast() we ensure this always happens
         when adding a new key.
       - even when locking costs 2 updates (lock and unlock), they are
         in a cacheline that needs to be read anyway.
      
      The cost of using a bit spin_lock is a little bit of code complexity,
      which I think is quite manageable.
      
      Bit spin_locks are sometimes inappropriate because they are not fair -
      if multiple CPUs repeatedly contend for the same lock, one CPU can
      easily be starved.  This is not a credible situation with rhashtable.
      Multiple CPUs may want to repeatedly add or remove objects, but they
      will typically do so at different buckets, so they will attempt to
      acquire different locks.
      
      As we have more bit-locks than we previously had spinlocks (by at
      least a factor of two) we can expect slightly less contention to
      go with the slightly better cache behavior and reduced memory
      consumption.
      
      To enhance type checking, a new struct is introduced to represent the
      pointer plus lock-bit that is stored in the bucket-table.  This is
      "struct rhash_lock_head" and is empty.  A pointer to this needs to be
      cast to either an unsigned long, or a "struct rhash_head *" to be useful.
      Variables of this type are most often called "bkt".

      Previously "pprev" would sometimes point to a bucket, and sometimes a
      ->next pointer in an rhash_head.  As these are now different types,
      pprev is NULL when it would have pointed to the bucket. In that case,
      'bkt' is used, together with correct locking protocol.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
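The idea in 8f0db018 of keeping the lock bit inside the bucket pointer can be sketched in userspace with C11 atomics. This is not the kernel's bit_spin_lock or rhashtable code; the names are hypothetical, and the sketch assumes nodes are at least 4-byte aligned so that bit 1 of a node pointer is always zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 1 of the bucket word is the lock, as in the commit description. */
#define BKT_LOCK_BIT ((uintptr_t)1 << 1)

struct node { struct node *next; int key; };

/* A bucket is one word: head pointer plus lock bit. */
typedef _Atomic uintptr_t bucket_t;

static void bucket_lock(bucket_t *bkt)
{
    for (;;) {
        /* Test-and-set of the lock bit; acquire on success. */
        uintptr_t old = atomic_fetch_or_explicit(bkt, BKT_LOCK_BIT,
                                                 memory_order_acquire);
        if (!(old & BKT_LOCK_BIT))
            return;
        /* Spin with plain loads until the bit looks clear. */
        while (atomic_load_explicit(bkt, memory_order_relaxed) & BKT_LOCK_BIT)
            ;
    }
}

static void bucket_unlock(bucket_t *bkt)
{
    atomic_fetch_and_explicit(bkt, ~BKT_LOCK_BIT, memory_order_release);
}

/* Read the head pointer with the lock bit masked off. */
static struct node *bucket_head(bucket_t *bkt)
{
    uintptr_t v = atomic_load_explicit(bkt, memory_order_acquire);
    return (struct node *)(v & ~BKT_LOCK_BIT);
}

/*
 * Insert at the head of the chain. Storing the new head pointer, which
 * has the lock bit clear, publishes the node and unlocks in one write.
 */
static void bucket_insert_head(bucket_t *bkt, struct node *n)
{
    assert(((uintptr_t)n & BKT_LOCK_BIT) == 0);

    bucket_lock(bkt);
    n->next = (struct node *)(atomic_load_explicit(bkt, memory_order_relaxed)
                              & ~BKT_LOCK_BIT);
    atomic_store_explicit(bkt, (uintptr_t)n, memory_order_release);
}
```

The head insert shows the "unlock is free" case from the commit message: no separate unlock call is needed because the store of the new head clears the bit. Operations elsewhere in the chain would pair `bucket_lock()` with `bucket_unlock()`.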
    • rhashtable: allow rht_bucket_var to return NULL. · ff302db9
      NeilBrown authored
      Rather than returning a pointer to a static nulls, rht_bucket_var()
      now returns NULL if the bucket doesn't exist.
      This will make the next patch, which stores a bitlock in the
      bucket pointer, somewhat cleaner.
      
      This change involves introducing __rht_bucket_nested() which is
      like rht_bucket_nested(), but doesn't provide the static nulls,
      and changing rht_bucket_nested() to call this and possibly
      provide a static nulls - as is still needed for the non-var case.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: use cmpxchg() in nested_table_alloc() · 7a41c294
      NeilBrown authored
      nested_table_alloc() relies on the fact that there is
      at most one spinlock allocated for every slot in the top
      level nested table, so it is not possible for two threads
      to try to allocate the same table at the same time.
      
      This assumption is a little fragile (it is not explicit) and is
      unnecessary as cmpxchg() can be used instead.
      
      A future patch will replace the spinlocks by per-bucket bitlocks,
      and then we won't be able to protect the slot pointer with a spinlock.
      
      So replace rcu_assign_pointer() with cmpxchg() - which has equivalent
      barrier properties.
      If the cmpxchg fails, free the table that was just allocated.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
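The pattern described in 7a41c294 (allocate, try to install with a compare-and-exchange, free on failure) can be sketched in userspace with C11 atomics. This is not the kernel's nested_table_alloc(); the type and function names are hypothetical, and `atomic_compare_exchange_strong` stands in for cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical nested table; the contents do not matter for the sketch. */
struct table_sketch { int placeholder; };

/*
 * Return the table stored in *slot, allocating and installing one if the
 * slot is empty. If another thread installs a table first, the fresh
 * allocation is freed and the winner's table is returned.
 */
static struct table_sketch *slot_get_or_alloc(_Atomic(struct table_sketch *) *slot)
{
    struct table_sketch *cur, *fresh, *expected = NULL;

    assert(slot != NULL);

    cur = atomic_load_explicit(slot, memory_order_acquire);
    if (cur != NULL)
        return cur;

    fresh = calloc(1, sizeof(*fresh));
    if (fresh == NULL)
        return NULL;

    /* Install only if the slot is still NULL. */
    if (atomic_compare_exchange_strong(slot, &expected, fresh))
        return fresh;

    /* Lost the race: expected now holds the table that was installed. */
    free(fresh);
    return expected;
}
```

No lock is needed around the slot in this scheme, which is why the commit notes that the change prepares for dropping the separate spinlock array.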
  2. 07 Apr, 2019 7 commits