1. 05 Jun, 2017 40 commits
    • rxrpc: Add service upgrade support for client connections · 4e255721
      David Howells authored
      Make it possible for a client to use AuriStor's service upgrade facility.
      
      The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
      the first sendmsg() of a call.  This takes no parameters.
      
      When recvmsg() starts returning data from the call, the service ID field in
      the returned msg_name will reflect the result of the upgrade attempt.  If
      the upgrade was ignored, srx_service will match what was set in the
      sendmsg(); if the upgrade happened the srx_service will be altered to
      indicate the service the server upgraded to.
      
      Note that:
      
       (1) The choice of upgrade service is up to the server.
      
       (2) Further client calls to the same server that would share a connection
           are blocked if an upgrade probe is in progress.
      
       (3) This should only be used to probe the service.  Clients should then
           use the returned service ID in all subsequent communications with that
           server (and not set the upgrade).  Note that the kernel will not
           retain this information should the connection expire from its cache.
      
       (4) If a server that supports upgrading is replaced by one that doesn't,
           whilst a connection is live, and if the replacement is running, say,
           OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
           server will not respond to packets sent to the upgraded connection.
      
           At this point, calls will time out and the server must be reprobed.
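
      As a rough illustration of the client side, the control messages for the
      first sendmsg() of a call could be assembled as below.  This is a sketch
      only: the helper name rxrpc_build_upgrade_cmsgs() is hypothetical, the
      fallback value for SOL_RXRPC is an assumption for libcs that do not
      provide it, and the caller is assumed to pass a suitably aligned buffer.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/rxrpc.h>

#ifndef SOL_RXRPC
#define SOL_RXRPC 272	/* assumed fallback; normally provided by libc */
#endif

/*
 * Fill msg->msg_control with the control messages for the first sendmsg()
 * of a call: the user call ID plus the parameterless upgrade request.
 * Returns the control length used, or 0 if buf is too small.
 * (Hypothetical helper, not part of any library.)
 */
size_t rxrpc_build_upgrade_cmsgs(struct msghdr *msg, void *buf, size_t buflen,
				 unsigned long call_id)
{
	size_t need = CMSG_SPACE(sizeof(call_id)) + CMSG_SPACE(0);
	struct cmsghdr *cmsg;

	if (buflen < need)
		return 0;
	memset(buf, 0, need);
	msg->msg_control = buf;
	msg->msg_controllen = need;

	/* First message: identify the call to the kernel. */
	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));

	/* Second message: request the upgrade probe; it carries no data. */
	cmsg = CMSG_NXTHDR(msg, cmsg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_UPGRADE_SERVICE;
	cmsg->cmsg_len = CMSG_LEN(0);
	return need;
}
```

      After the probe call completes, the caller would read srx_service from
      the msg_name returned by recvmsg() and use that service ID, without the
      upgrade control message, for later calls.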
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Implement service upgrade · 4722974d
      David Howells authored
      Implement AuriStor's service upgrade facility.  There are three problems
      that this is meant to deal with:
      
       (1) Various of the standard AFS RPC calls have IPv4 addresses in their
           requests and/or replies - but there's no room for including IPv6
           addresses.
      
       (2) Definition of IPv6-specific RPC operations in the standard operation
           sets has not yet been achieved.
      
       (3) One could envision the creation of a new service on the same port as
           the original service.  The new service could implement improved
           operations - and the client could try this first, falling back to the
           original service if it's not there.
      
           Unfortunately, certain servers ignore packets addressed to a service
           they don't implement and don't respond in any way - not even with an
           ABORT.  This means that the client must then wait for the call timeout
           to occur.
      
      What service upgrade does is to see if the connection is marked as being
      'upgradeable' and if so, change the service ID in the server and thus the
      request and reply formats.  Note that the upgrade isn't mandatory - a
      server that supports only the original call set will ignore the upgrade
      request.
      
      In the protocol, the procedure is then as follows:
      
       (1) To request an upgrade, the first DATA packet in a new connection must
           have the userStatus set to 1 (this is normally 0).  The userStatus
           value is normally ignored by the server.
      
       (2) If the server doesn't support upgrading, the reply packets will
           contain the same service ID as for the first request packet.
      
       (3) If the server does support upgrading, all future reply packets on that
           connection will contain the new service ID and the new service ID will
           be applied to *all* further calls on that connection as well.
      
       (4) The RPC op used to probe the upgrade must take the same request data
           as the shadow call in the upgrade set (but may return a different
           reply).  GetCapability RPC ops were added to all standard sets for
           just this purpose.  Ops where the request formats differ cannot be
           used for probing.
      
       (5) The client must wait for completion of the probe before sending any
           further RPC ops to the same destination.  It should then use the
           service ID that recvmsg() reported back in all future calls.
      
       (6) The shadow service must have call definitions for all the operation
           IDs defined by the original service.
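
      The server-side decision in steps (1) to (3) can be modelled roughly as
      follows.  This is a self-contained sketch, not the kernel code: the
      struct and function names are hypothetical, and the packet is reduced
      to its service ID and userStatus fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a server-side connection; not the kernel's struct. */
struct upg_conn {
	uint16_t service_id;		/* service ID currently used on the wire */
	bool	 first_data_seen;
};

struct upg_rule {
	uint16_t from;			/* original service ID */
	uint16_t to;			/* shadow service ID; 0 = no upgrade support */
};

/*
 * Process a DATA packet and return the service ID to put in replies.
 * Only the first DATA packet of a new connection can trigger an upgrade,
 * and only if its userStatus is 1.
 */
uint16_t upg_conn_rx_data(struct upg_conn *conn, const struct upg_rule *rule,
			  uint16_t pkt_service_id, uint8_t user_status)
{
	if (!conn->first_data_seen) {
		conn->first_data_seen = true;
		conn->service_id = pkt_service_id;
		if (user_status == 1 && rule->to != 0 &&
		    pkt_service_id == rule->from)
			conn->service_id = rule->to;	/* whole connection upgraded */
	}
	return conn->service_id;
}
```

      Once the connection has been upgraded, every later call on it sees the
      new service ID, whatever userStatus those later packets carry.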
      
      
      To support service upgrading, a server should:
      
       (1) Call bind() twice on its AF_RXRPC socket before calling listen().
           Each bind() should supply a different service ID, but the transport
           addresses must be the same.  This allows the server to receive
           requests with either service ID.
      
       (2) Enable automatic upgrading by calling setsockopt(), specifying
           RXRPC_UPGRADEABLE_SERVICE and passing in a two-member array of
           unsigned shorts as the argument:
      
      	unsigned short optval[2];
      
           This specifies a pair of service IDs.  They must be different and must
           match the service IDs bound to the socket.  Member 0 is the service ID
           to upgrade from and member 1 is the service ID to upgrade to.
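
      Put together, the two server steps might look like the sketch below.
      The helper names, the IPv4 wildcard transport address and the fallback
      value for SOL_RXRPC are assumptions made for illustration; the backlog
      is left to the caller because the permitted range is system-dependent.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/rxrpc.h>

#ifndef SOL_RXRPC
#define SOL_RXRPC 272	/* assumed fallback; normally provided by libc */
#endif

/* Describe one service ID on a fixed IPv4 transport (any local address). */
void rxrpc_fill_service_addr(struct sockaddr_rxrpc *srx,
			     unsigned short service, unsigned short port)
{
	memset(srx, 0, sizeof(*srx));
	srx->srx_family = AF_RXRPC;
	srx->srx_service = service;
	srx->transport_type = SOCK_DGRAM;
	srx->transport_len = sizeof(srx->transport.sin);
	srx->transport.sin.sin_family = AF_INET;
	srx->transport.sin.sin_port = htons(port);
}

/*
 * Open a listening socket that accepts both service IDs and upgrades
 * calls from 'from' to 'to'.  Returns the socket, or -1 on error.
 */
int rxrpc_open_upgradeable_server(unsigned short from, unsigned short to,
				  unsigned short port, int backlog)
{
	unsigned short optval[2] = { from, to };
	struct sockaddr_rxrpc srx;
	int fd;

	if (from == 0 || to == 0 || from == to)
		return -1;	/* IDs must be non-zero and different */

	fd = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
	if (fd < 0)
		return -1;

	/* Bind twice: same transport address, different service IDs. */
	rxrpc_fill_service_addr(&srx, from, port);
	if (bind(fd, (struct sockaddr *)&srx, sizeof(srx)) < 0)
		goto error;
	rxrpc_fill_service_addr(&srx, to, port);
	if (bind(fd, (struct sockaddr *)&srx, sizeof(srx)) < 0)
		goto error;

	/* Member 0 is the ID to upgrade from, member 1 the ID to upgrade to. */
	if (setsockopt(fd, SOL_RXRPC, RXRPC_UPGRADEABLE_SERVICE,
		       optval, sizeof(optval)) < 0)
		goto error;
	if (listen(fd, backlog) < 0)
		goto error;
	return fd;
error:
	close(fd);
	return -1;
}
```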
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Permit multiple service binding · 28036f44
      David Howells authored
      Permit bind() to be called on an AF_RXRPC socket more than once (currently
      maximum twice) to bind multiple listening services to it.  There are some
      restrictions:
      
       (1) All bind() calls involved must have a non-zero service ID.
      
       (2) The service IDs must all be different.
      
       (3) The rest of the address (notably the transport part) must be the same
           in all (a single UDP socket is shared).
      
       (4) This must be done before listen() or sendmsg() is called.
      
      This allows someone to connect to the service socket with different service
      IDs and lays the foundation for service upgrading.
      
      The service ID used by an incoming call can be extracted from the msg_name
      returned by recvmsg().
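
      The four restrictions can be captured in a small stand-alone model such
      as the one below.  It is a sketch only: the struct, the string used as
      a stand-in for the transport address, and the particular error codes
      are illustrative assumptions rather than the kernel's behaviour.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_SERVICES 2	/* "currently maximum twice" */

/* Hypothetical model of the socket's bind state; not the kernel's struct. */
struct bind_model {
	uint16_t service[MODEL_MAX_SERVICES];
	char	 transport[64];	/* stand-in for the transport address */
	int	 nr_bound;
	bool	 active;	/* set once listen() or sendmsg() was called */
};

/* Apply restrictions (1)-(4); returns 0 or a negative errno-style code. */
int model_bind(struct bind_model *s, uint16_t service, const char *transport)
{
	if (s->active || s->nr_bound >= MODEL_MAX_SERVICES)
		return -EINVAL;			/* (4), or too many binds */
	if (s->nr_bound > 0) {
		if (service == 0 || s->service[0] == 0)
			return -EINVAL;		/* (1) non-zero service IDs */
		if (service == s->service[0])
			return -EADDRINUSE;	/* (2) IDs must differ */
		if (strcmp(transport, s->transport) != 0)
			return -EINVAL;		/* (3) same transport */
	} else {
		strncpy(s->transport, transport, sizeof(s->transport) - 1);
		s->transport[sizeof(s->transport) - 1] = '\0';
	}
	s->service[s->nr_bound++] = service;
	return 0;
}
```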
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Separate the connection's protocol service ID from the lookup ID · 68d6d1ae
      David Howells authored
      Keep the rxrpc_connection struct's idea of the service ID that is exposed
      in the protocol separate from the service ID that's used as a lookup key.
      
      This allows the protocol service ID on a client connection to get upgraded
      without making the connection unfindable for other client calls that also
      would like to use the upgraded connection.
      
      The connection's actual service ID is then returned through recvmsg() by
      way of msg_name.
      
      Whilst we're at it, we get rid of the last_service_id field from each
      channel.  The service ID is per-connection, not per-call and an entire
      connection is upgraded in one go.
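
      A minimal model of the idea, assuming hypothetical field and function
      names (the real rxrpc_connection layout is not shown in this message):
      the lookup key stays fixed while the on-the-wire service ID changes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; field names are illustrative, not the kernel's. */
struct svc_conn {
	uint16_t lookup_service_id;	/* key used to find the connection */
	uint16_t wire_service_id;	/* value exposed in the protocol */
};

/* Find a connection by the service ID the client originally asked for. */
struct svc_conn *svc_conn_lookup(struct svc_conn *conns, size_t n,
				 uint16_t wanted_service_id)
{
	for (size_t i = 0; i < n; i++)
		if (conns[i].lookup_service_id == wanted_service_id)
			return &conns[i];
	return NULL;
}

/* An upgrade touches only the wire ID, so later lookups still succeed. */
void svc_conn_upgrade(struct svc_conn *conn, uint16_t new_service_id)
{
	conn->wire_service_id = new_service_id;
}
```

      A second client call that asks for the original service ID still finds
      the connection, and msg_name can report the upgraded ID from the wire
      field.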
      Signed-off-by: David Howells <dhowells@redhat.com>
    • Merge branch 'mlxsw-Minor-cleanup' · aae1a2ce
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      mlxsw: Minor cleanup
      
      Fix small issues I noticed during the refactoring.
      
      First patch adds file name comments in the header file to make it clear
      what goes where. Second patch fixes a typo and third patch simply aligns
      RIF index allocation with similar allocations in the driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Align RIF index allocation with existing code · de5ed99e
      Ido Schimmel authored
      The way we usually allocate an index is by letting the allocation
      function return an error instead of an invalid index.
      
      Do the same for RIF index.
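
      The allocation pattern described here could look like the following
      generic sketch.  The function name, the in_use array and the choice of
      -ENOBUFS are illustrative assumptions, not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Find a free slot and report it through p_index.  Failure is signalled
 * by the return value, so callers never have to compare the result
 * against a sentinel "invalid index".
 */
int model_index_alloc(const bool *in_use, size_t max, uint16_t *p_index)
{
	for (size_t i = 0; i < max; i++) {
		if (!in_use[i]) {
			*p_index = (uint16_t)i;
			return 0;
		}
	}
	return -ENOBUFS;	/* illustrative error code */
}
```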
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Ido Schimmel's avatar
      da0abcf9
    • mlxsw: spectrum: Tidy up header file · cb4cc0e0
      Ido Schimmel authored
      Make it clear where functions are defined and move misplaced declarations
      to their correct places.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Rename the firmware file · a4e1ce24
      Yotam Gigi authored
      Change the firmware file name to be in the "mellanox" directory.
      
      This commit is a followup to the linux-firmware commit a4c72696f5f4
      ("Mellanox: Add firmware for mlxsw_spectrum")
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'qed-vf-xdp' · fc85c910
      David S. Miller authored
      Yuval Mintz says:
      
      qed*: Support VF XDP attachment
      
      ====================
      Each driver queue [Rx, Tx, XDP-forwarding] requires an allocated HW/FW
      connection + configured queue-zone.
      
      VF handling by the PF has several limitations that prevented adding the
      capability to perform XDP at driver-level:
      
       - The VF assumes there's a 1-to-1 correspondence between the VF queue and
         the used connection, meaning q<x> is always going to use cid<x>,
         whereas for its own queues the PF acquires a new cid for each new
         queue.
      
       - There's a 1-to-1 correspondence between the VF queues and the HW queue
         zones. While this is necessary for Rx-queues [as the queue-zone
         contains the producer], transmission queues can share the underlying
         queue-zone [the only shared configuration is coalescing].
         But all VF<->PF communication mechanisms assume there's a single
         identifier that identifies a queue [as queue-zone == queue], while
         sharing queue-zones requires passing additional information.
      
       - VFs currently don't try mapping a doorbell bar - there's a small
         doorbell window in the regview allowing VFs to doorbell up to 16
         connections; but this window isn't wide enough for the added XDP
         forwarding queues.
      
      This series is going to add the necessary infrastructure to finally let
      our VFs support XDP assuming both the PF and VF drivers are sufficiently
      new [Legacy support would be retained both for older VFs and older PFs,
      but both drivers must be new for XDP support to work].
      Basically, the various databases the driver maintains for its queue-cids
      would be revised, and queue-cids would be identified using the
      (queue-zone, unique index) pair. The TLV mechanism would then be
      extended to allow VFs to communicate that unique-index as well as the
      already provided queue-zone. Finally, the VFs would try to map their
      doorbell bar and inform their PF that they're using it.
      
      Almost all the changes are in qed, with exception of #3 [which does some
      cleanup in qede as well] and #11 that actually enables the feature.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: VF XDP support · e7b80dec
      Mintz, Yuval authored
      This introduces 2 changes needed for XDP to be supported for VFs:
      
       a. On VF-side, publish the NDO based on qed outputs
      
       b. On PF-side, request qed to allocate sufficient cids per-VF
          to allow the child VFs to support it
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: VF XDP support · cbb8a12c
      Mintz, Yuval authored
      The final addition on the qed front -
       - VFs would now require their PFs to provide multiple CIDs
       - Based on the availability of connections from PF, determine whether
         XDP is feasible and share it with qede via dev_info.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: VFs to try utilizing the doorbell bar · 1a850bfc
      Mintz, Yuval authored
      VFs are currently not mapping their doorbell bar, instead relying
      on the small doorbell window they have in their limited regview bar.
      
      In order to increase the number of possible Tx connections [queues]
      employed by a VF past 16, we need to start using the doorbell bar if
      one is exposed - the VF would communicate this fact to the PF, which would
      return the bar size internally configured into the chip, according to
      which the VF would decide whether to actually utilize the doorbell
      bar.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Multiple qzone queues for VFs · 08bc8f15
      Mintz, Yuval authored
      This adds the infrastructure for supporting VFs that want to open
      multiple transmission queues on the same queue-zone.
      At this point, there are no VFs that actually request this functionality,
      but later patches would remedy that.
      
       a. VF and PF would communicate the capability during ACQUIRE;
          Legacy VFs would continue on behaving as they do today
      
       b. PF would communicate number of supported CIDs to the VF
          and would enforce said limitation
      
       c. Whenever VF passes a request for a given queue configuration
          it would also pass an associated index within said queue-zone
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: IOV db support multiple queues per qzone · 007bc371
      Mintz, Yuval authored
      Allow the infrastructure a PF maintains for each one of its VFs
      to support multiple queue-cids on a single queue-zone.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Make VF legacy a bitfield · 3b19f478
      Mintz, Yuval authored
      Until now we used to have a single VF legacy compatibility mode,
      one that affected the place of the Rx producers of those VFs [mostly].
      
      As PF would soon support allocating CIDs for VFs instead of having
      a static CID<->queue configuration for them, we'll need to have
      an additional legacy mode since existing VFs would need to continue
      on using the older mode of operation.
      
      Change the infrastructure so that the legacy would be able to indicate
      which of the legacy behaviors is needed for a given VF.
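
      As a sketch of what a legacy bitfield allows, each legacy behaviour can
      get its own bit so that a VF may need one, both or neither.  The flag
      and function names below are hypothetical and are not taken from the
      driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; one bit per independent legacy behaviour. */
#define VF_LEGACY_RX_PROD	(1u << 0)	/* old Rx producer location */
#define VF_LEGACY_STATIC_CID	(1u << 1)	/* old static CID<->queue mapping */

/* Test whether a VF needs a particular legacy behaviour. */
static inline bool vf_needs_legacy(uint8_t vf_legacy, uint8_t behaviour)
{
	return (vf_legacy & behaviour) != 0;
}
```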
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Assign a unique per-queue index to queue-cid · bbe3f233
      Mintz, Yuval authored
      When a queue-cid is allocated, assign an index inside that
      CID's queue-zone.
      
      For PFs and VFs, this number is going to be unique and derived
      from a per-queue-zone bitmap, while for a PF's VF queues the
      number is currently going to be constant; later, we'd add the
      capability of a VF to communicate such an index to its PF.
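
      Deriving a unique index from a per-queue-zone bitmap might be sketched
      as below.  The 8-entry limit and the function names are assumptions
      chosen to keep the example small.

```c
#include <stdint.h>

#define MAX_QUEUES_PER_QZONE 8	/* assumed limit for this sketch */

/* Claim the lowest clear bit in a queue-zone's bitmap; -1 if it is full. */
int qzone_index_alloc(uint8_t *bitmap)
{
	for (int i = 0; i < MAX_QUEUES_PER_QZONE; i++) {
		if (!(*bitmap & (1u << i))) {
			*bitmap |= (uint8_t)(1u << i);
			return i;
		}
	}
	return -1;
}

/* Return an index to the queue-zone so that it can be reused. */
void qzone_index_free(uint8_t *bitmap, int index)
{
	*bitmap &= (uint8_t)~(1u << index);
}
```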
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Pass vf_params when creating a queue-cid · 3946497a
      Mintz, Yuval authored
      We're going to need additional information for queue-cids
      that a PF creates for its VFs, so start by refactoring existing
      logic used for initializing said struct into receiving a structure
      encapsulating the VF-specific information that needs to be provided.
      
      This also introduces QED_QUEUE_CID_SELF - each queue-cid would hold
      an indication to whether it belongs to the hw-function holding it
      [whether that's a PF or a VF], or else what's the VF id it belongs
      to.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed*: L2 interface to use the SB structures directly · f604b17d
      Mintz, Yuval authored
      As part of an effort toward a cleaner separation between qed and the protocol
      drivers, the L2 interface is to use the SB structure for initialization
      purposes opaquely.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Create L2 queue database · 0db711bb
      Mintz, Yuval authored
      The first step in allowing a single PF/VF to open multiple queues on
      the same queue zone is to add a per-hwfn database of queue-cids
      as a two-dimensional array indexed by
      [queue zone][internal index].
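
      A database of that shape could be modelled as follows.  The array
      sizes, struct names and lookup helper are hypothetical and serve only
      to show the [queue zone][internal index] addressing.

```c
#include <stddef.h>

#define NUM_QZONES		4	/* assumed sizes for this sketch */
#define MAX_QUEUES_PER_QZONE	8

struct queue_cid_model {		/* hypothetical stand-in for a queue-cid */
	unsigned int cid;
};

struct l2_db_model {
	struct queue_cid_model *cids[NUM_QZONES][MAX_QUEUES_PER_QZONE];
};

/* Look up a queue-cid by (queue zone, internal index); NULL if absent. */
struct queue_cid_model *l2_db_get(struct l2_db_model *db,
				  unsigned int qzone, unsigned int idx)
{
	if (qzone >= NUM_QZONES || idx >= MAX_QUEUES_PER_QZONE)
		return NULL;
	return db->cids[qzone][idx];
}
```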
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Add bitmaps for VF CIDs · 6bea61da
      Mintz, Yuval authored
      Each PF has a bitmap for its own ranges of CIDs, to allow easy grabbing
      of an available CID when such is needed. But VFs are not using the same
      mechanism, instead relying on hard-coded CIDs [ queue-index == cid ].
      
      As an infrastructure step toward increasing number of CIDs of VFs,
      the PF is going to maintain bitmaps for the VF CIDs as well -
      the bitmaps would be per-VF and the ranges would be the same [in HW all
      VFs of a given PF have the same mapping of CIDs, and the HW is capable
      of distinguishing between those according to the VF index]
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'skb-sgvec-overflow' · a619cc8b
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      net: Avoiding stack overflow in skb_to_sgvec
      
      The recent bug with macsec and historical one with virtio have
      indicated that letting skb_to_sgvec trounce all over an sglist
      without checking the length is probably a bad idea. And it's not
      necessary either: an sglist already explicitly marks its last
      item, and the initialization functions are diligent in doing so.
      Thus there's a clear way of avoiding future overflows.
      
      So, this patchset, from a high level, makes skb_to_sgvec return
      a potential error code, and then adjusts all callers to check
      for the error code. There are two situations in which skb_to_sgvec
      might return such an error:
      
         1) When the passed in sglist is too small; and
         2) When the passed in skbuff is too deeply nested.
      
      So, the first patch in this series handles the issues with
      skb_to_sgvec directly, and the remaining ones then handle the call
      sites.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio_net: check return value of skb_to_sgvec always · e2fcad58
      Jason A. Donenfeld authored
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macsec: check return value of skb_to_sgvec always · cda7ea69
      Jason A. Donenfeld authored
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Jason A. Donenfeld's avatar
    • ipsec: check return value of skb_to_sgvec always · 3f297707
      Jason A. Donenfeld authored
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow · 48a1df65
      Jason A. Donenfeld authored
      This is a defense-in-depth measure in response to bugs like
      4d6fa57b ("macsec: avoid heap overflow in skb_to_sgvec"). There's
      not only a potential overflow of sglist items, but also a stack overflow
      potential, so we fix this by limiting the amount of recursion this function
      is allowed to do. Not actually providing a bounded base case is a future
      disaster that we can easily avoid here.
      
      As a small matter of housekeeping, we take this opportunity to move the
      documentation comment over the actual function the documentation is for.
      
      While this could be implemented by using an explicit stack of skbuffs,
      when implementing this, the function complexity increased considerably,
      and I don't think such complexity and bloat is actually worth it. So,
      instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS,
      and measured the stack usage there. I also reverted the recent MIPS
      changes that give it a separate IRQ stack, so that I could experience
      some worst-case situations. I found that limiting it to 24 layers deep
      yielded a good stack usage with room for safety, as well as being much
      deeper than any driver actually ever creates.
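
      The bounded recursion can be illustrated with a simplified model.  The
      buf_model struct below is a hypothetical stand-in for an skbuff with
      nested buffers, and the function only counts entries rather than
      filling a real scatterlist; the depth limit of 24 and the -EMSGSIZE
      return value are the ones named in this commit.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_RECURSION 24	/* the depth limit chosen in this commit */

/* Hypothetical stand-in for an skbuff: fragments plus nested buffers. */
struct buf_model {
	int		  nr_frags;	/* entries this buffer itself needs */
	struct buf_model *child;	/* first nested buffer, or NULL */
	struct buf_model *next;		/* next sibling at the same level */
};

/*
 * Count the scatterlist entries needed, refusing to exceed sg_cap entries
 * or MAX_RECURSION nesting levels.  Returns the count or -EMSGSIZE.
 */
int to_sgvec_model(const struct buf_model *b, int sg_cap, int level)
{
	int used = b->nr_frags;

	if (level >= MAX_RECURSION || used > sg_cap)
		return -EMSGSIZE;
	for (const struct buf_model *c = b->child; c; c = c->next) {
		int ret = to_sgvec_model(c, sg_cap - used, level + 1);

		if (ret < 0)
			return ret;	/* propagate the error to the caller */
		used += ret;
	}
	return used;
}
```

      Callers start at level 0 and must treat a negative return value as a
      failure instead of using it as an entry count.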
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Sabrina Dubroca <sd@queasysnail.net>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-Add-BPF-support-to-all-perf_event' · a11227dc
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      bpf: Add BPF support to all perf_event
      
      v3->v4: one more tweak to reject unsupported events at map
      update time as Peter suggested
      
      v2->v3: more refactoring to address Peter's feedback.
      Now all perf_events are attachable and readable
      
      v1->v2: address Peter's feedback. Refactor patch 1 to allow attaching
      bpf programs to all event types and reading counters from all of them as well
      patch 2 - more tests
      patch 3 - address Dave's feedback and document bpf_perf_event_read()
      and bpf_perf_event_output() properly
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: update perf event helper functions documentation · b7d3ed5b
      Teng Qin authored
      This commit updates documentation of the bpf_perf_event_output and
      bpf_perf_event_read helpers to match their implementation.
      Signed-off-by: Teng Qin <qinteng@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples/bpf: add tests for more perf event types · 41e9a804
      Teng Qin authored
      $ trace_event
      
      tests attaching BPF program to HW_CPU_CYCLES, SW_CPU_CLOCK, HW_CACHE_L1D and other events.
      It runs 'dd' in the background while bpf program collects user and kernel
      stack trace on counter overflow.
      User space expects to see sys_read and sys_write in the kernel stack.
      
      $ tracex6
      
      tests reading of various perf counters from BPF program.
      
      Both tests were refactored to increase coverage and be more accurate.
      Signed-off-by: Teng Qin <qinteng@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • perf, bpf: Add BPF support to all perf_event types · f91840a3
      Alexei Starovoitov authored
      Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all
      perf_event types, including HW_CACHE, RAW, and dynamic pmu events.
      Only tracepoint/kprobe events are treated differently which require
      BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types accordingly.
      
      Also add support for reading all event counters using
      bpf_perf_event_read() helper.
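
      From user space, targeting one of the newly attachable event types
      might look like the sketch below.  The helper names and the sample
      period are assumptions; the config encoding for HW_CACHE events follows
      perf_event_open(2), and the event fd is assumed to come from a
      perf_event_open() call made elsewhere.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/* Describe an "L1D read accesses" hardware cache event. */
void fill_l1d_read_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HW_CACHE;
	attr->config = PERF_COUNT_HW_CACHE_L1D |
		       (PERF_COUNT_HW_CACHE_OP_READ << 8) |
		       (PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16);
	attr->sample_period = 1000000;	/* arbitrary value for illustration */
}

/*
 * Attach an already-loaded BPF_PROG_TYPE_PERF_EVENT program to an open
 * perf event and enable the event.  Returns 0 on success, -1 on error.
 */
int attach_bpf_to_event(int perf_fd, int prog_fd)
{
	if (ioctl(perf_fd, PERF_EVENT_IOC_SET_BPF, prog_fd) < 0)
		return -1;
	if (ioctl(perf_fd, PERF_EVENT_IOC_ENABLE, 0) < 0)
		return -1;
	return 0;
}
```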
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neigh: Really delete an arp/neigh entry on "ip neigh delete" or "arp -d" · 5071034e
      Sowmini Varadhan authored
      The command
        # arp -s 62.2.0.1 a:b:c:d:e:f dev eth2
      adds an entry like the following (listed by "arp -an")
        ? (62.2.0.1) at 0a:0b:0c:0d:0e:0f [ether] PERM on eth2
      but the symmetric deletion command
        # arp -i eth2 -d 62.2.0.1
      does not remove the PERM entry from the table, and instead leaves behind
        ? (62.2.0.1) at <incomplete> on eth2
      
      The reason is that there is a refcnt of 1 for the arp_tbl itself
      (neigh_alloc starts off the entry with a refcnt of 1), thus
      the neigh_release() call from arp_invalidate() will (at best) just
      decrement the ref to 1, but will never actually free it from the
      table.
      
      To fix this, we need to do something like neigh_forced_gc: if
      the refcnt is 1 (i.e., only the table's ref remains), remove the entry from
      the table and free it. This patch refactors and shares common code
      between neigh_forced_gc and the newly added neigh_remove_one.
      
      A similar issue exists for IPv6 Neighbor Cache entries, and is fixed
      in a similar manner by this patch.
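
      The rule "unlink only when the table holds the last reference" can be
      shown with a small list model.  The struct and function names are
      hypothetical; this is not the code of neigh_remove_one().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a table entry; not the kernel's struct neighbour. */
struct entry_model {
	int		    refcnt;	/* 1 == only the table holds a reference */
	struct entry_model *next;
};

/*
 * Unlink 'victim' from the list at *head if the table's reference is the
 * only one left.  Returns true if it was unlinked and may now be freed.
 */
bool remove_one_model(struct entry_model **head, struct entry_model *victim)
{
	for (struct entry_model **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp != victim)
			continue;
		if (victim->refcnt != 1)
			return false;	/* still in use elsewhere */
		*pp = victim->next;	/* drop it from the table */
		victim->refcnt = 0;	/* release the table's reference */
		return true;
	}
	return false;
}
```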
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Reviewed-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: smsc: Implement PHY statistics · 030a8902
      Andrew Lunn authored
      Most of the PHYs supported by the SMSC driver have a counter of symbol
      errors. This is 16 bits wide and wraps around when it reaches its
      maximum value.
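
      One common way for a reader of such a counter to cope with the
      wraparound is to accumulate deltas modulo 2^16.  The sketch below is a
      generic illustration with a hypothetical function name, not the
      driver's implementation, and it assumes at most one wrap between two
      readings.

```c
#include <stdint.h>

/*
 * Fold a new reading of a wrapping 16-bit counter into a 64-bit total.
 * Truncating the difference to 16 bits yields the right delta even when
 * the counter wrapped once between the two readings.
 */
uint64_t counter_accumulate(uint64_t total, uint16_t prev, uint16_t now)
{
	return total + (uint16_t)(now - prev);
}
```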
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-By: Woojung Huh <Woojung.Huh@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dsa-Fixes-for-mv88e6161' · 386e2e4b
      David S. Miller authored
      Andrew Lunn says:
      
      ====================
      dsa: Fixes for mv88e6161
      
      Testing a board with an mv88e6161 turned up two issues. The PHYs were
      not found, because the wrong method to access them was used. The
      statistics did not work, because the wrong snapshot method was used.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: mv88e6161 uses mv88e6320 stats snapshot · 0ac64c39
      Andrew Lunn authored
      The mv88e6161 was using the wrong method to perform statistics
      snapshot.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: 6161 uses global 2 for PHY access · ec8378bb
      Andrew Lunn authored
      Access to the internal PHYs of the 6161 and 6123 goes through global 2
      SMI registers. Fix the ops structure.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dsa-mv88e6xxx-move-registers-macros' · 4e86d3cb
      David S. Miller authored
      Vivien Didelot says:
      
      ====================
      net: dsa: mv88e6xxx: move registers macros
      
      This patchset brings no functional changes.
      
      It is the first step of a cleanup renaming the chip header file and
      moving the register definitions _as is_ into their proper header files.
      
      A following patchset will prefix them with the appropriate model
      (MV88E6XXX_ or e.g. MV88E6390_) to respect an implicit namespace and
      easily identify model subtleties in registers layout, as correctly done
      in the newly added serdes.h header.
      ====================
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: move the Global 2 macros · d23a83f2
      Vivien Didelot authored
      Move the GLOBAL2_* macros where they belong, in the related global2.h
      header.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: move the Global 1 macros · e097097b
      Vivien Didelot authored
      Move the GLOBAL_* macros where they belong, in the related global1.h
      header. Include it in global2.c which uses GLOBAL_STATUS_IRQ_DEVICE.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: move the Port macros · d2a160b5
      Vivien Didelot authored
      Move the PORT_* macros where they belong, in the related port.h header.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>