1. 29 Aug, 2017 22 commits
    • Merge branch 'XDP-redirect-tracepoints' · 25d4dae1
      David S. Miller authored
      Jesper Dangaard Brouer says:
      
      ====================
      XDP redirect tracepoints
      
      I feel this is as far as I can take the tracepoint infrastructure to
      assist XDP monitoring.
      
      Tracepoints come with a base overhead of 25 nanosec for an attached
      bpf_prog, and 48 nanosec for using a full perf record. This is
      problematic for the XDP use-case, but it is very convenient to use the
      existing perf infrastructure.
      
      From a performance perspective, the real solution would be to attach
      another bpf_prog (that understands xdp_buff), but I'm not sure we want
      to introduce yet another bpf attach API for this.
      
      One thing left is to standardize the possible err return codes, to a
      limited set, to allow easier (and faster) mapping into a bpf map.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples/bpf: xdp_monitor tool based on tracepoints · 3ffab546
      Jesper Dangaard Brouer authored
      This tool, xdp_monitor, demonstrates how to use the different xdp_redirect
      tracepoints xdp_redirect{,_map}{,_err} from a BPF program.
      
      The default mode is to only monitor the error counters, to avoid
      affecting the per packet performance. Tracepoints come with a base
      overhead of 25 nanosec for an attached bpf_prog, and 48 nanosec for
      using a full perf record (with non-matching filter).  Thus, loading
      the --stats mode by default could affect the maximum performance.
      
      This version of the tool is very simple and counts all types of errors
      as one.  It will be natural to extend this later with the different
      types of errors that can occur, which should help users quickly
      identify common mistakes.
      
      Because the TP_STRUCT was kept in sync, all the tracepoints load the
      same BPF code.  It would also be natural to extend the map version to
      demonstrate how the map information could be used.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples/bpf: xdp_redirect load XDP dummy prog on TX device · 306da4e6
      Jesper Dangaard Brouer authored
      For supporting XDP_REDIRECT, a device driver must (obviously)
      implement the "TX" function ndo_xdp_xmit().  An additional requirement
      is that you cannot TX out a device unless it also has an XDP bpf program
      attached. This dependency is caused by the driver code needing to set up
      XDP resources before it can ndo_xdp_xmit.
      
      Update bpf samples xdp_redirect and xdp_redirect_map to automatically
      attach a dummy XDP program to the configured ifindex_out device.  Use
      the XDP flag XDP_FLAGS_UPDATE_IF_NOEXIST on the dummy load, to avoid
      overriding an existing XDP prog on the device.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xdp: separate xdp_redirect tracepoint in map case · 59a30896
      Jesper Dangaard Brouer authored
      Creating a specific xdp_redirect_map variant of the xdp tracepoints
      allows users to write simpler/faster BPF progs that get attached to
      these tracepoints.
      
      Goal is to still keep the tracepoints in xdp_redirect and xdp_redirect_map
      similar enough, that a tool can read the top part of the TP_STRUCT and
      produce similar monitor statistics.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xdp: separate xdp_redirect tracepoint in error case · f5836ca5
      Jesper Dangaard Brouer authored
      There is a need to separate the xdp_redirect tracepoint into two
      tracepoints, for separating the error case from the normal forward
      case.
      
      Due to the extreme speeds XDP is operating at, loading a tracepoint
      has a measurable impact.  Single core XDP REDIRECT (ethtool tuned
      rx-usecs 25) can do 13.7 Mpps forwarding, but loading a simple
      bpf_prog at the tracepoint (with a return 0) reduces perf to 10.2 Mpps
      (CPU E5-1650 v4 @ 3.60GHz, driver: ixgbe).
      
      The overhead of loading a bpf-based tracepoint can be calculated to
      cost 25 nanosec ((1/13782002-1/10267937)*10^9 = -24.83 ns).
      
      Using perf record on the tracepoint event, with a non-matching --filter
      expression, the overhead is much larger. Performance drops to 8.3 Mpps,
      cost 48 nanosec ((1/13782002-1/8312497)*10^9 = -47.74 ns).
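
      The two overhead figures can be reproduced with a short Python sketch.
      The packet rates are the ones quoted in this commit message; the function
      and constant names are made up for illustration. The sketch subtracts in
      the opposite order from the expressions above, so the results come out
      positive.

```python
def per_packet_overhead_ns(baseline_pps, loaded_pps):
    """Extra time spent per packet, in nanoseconds, once the rate drops."""
    return (1.0 / loaded_pps - 1.0 / baseline_pps) * 1e9


# Packet rates (packets per second) quoted in the commit message.
BASELINE_PPS = 13_782_002      # XDP_REDIRECT with no tracepoint user
BPF_PROG_PPS = 10_267_937      # bpf_prog attached to the tracepoint
PERF_RECORD_PPS = 8_312_497    # perf record with a non-matching filter

bpf_overhead = per_packet_overhead_ns(BASELINE_PPS, BPF_PROG_PPS)
perf_overhead = per_packet_overhead_ns(BASELINE_PPS, PERF_RECORD_PPS)

print(f"bpf_prog overhead:    {bpf_overhead:.2f} ns")
print(f"perf record overhead: {perf_overhead:.2f} ns")
```

      Rounded, these are the 25 and 48 nanosec figures used above.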
      
      Having a separate tracepoint for err cases, which should be less
      frequent, allows running a continuous monitor for errors while not
      affecting the redirect forward performance (this has also been
      verified by measurements).
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xdp: make xdp tracepoints report bpf prog id instead of prog_tag · b06337df
      Jesper Dangaard Brouer authored
      Given that the previous patch exposes the map_id, it seems natural to also
      report the bpf prog id.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xdp: tracepoint xdp_redirect also need a map argument · 8d3b778f
      Jesper Dangaard Brouer authored
      To make sense of the map index, the tracepoint user also needs to know
      which map we are talking about.  Supply the map pointer but only expose
      the map->id.
      
      The 'to_index' is renamed 'to_ifindex'.  In the xdp_redirect_map case,
      this is the result of the devmap lookup. The map lookup key is exposed
      as map_index, which is needed to troubleshoot in case the lookup failed.
      The 'to_ifindex' is placed after 'err' to keep TP_STRUCT as common as
      possible.
      
      This also keeps the TP_STRUCT similar enough, that userspace can write
      a monitor program, that doesn't need to care about whether
      bpf_redirect or bpf_redirect_map were used.
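
      As a rough illustration of the "common top part" idea, the Python sketch
      below decodes only the shared leading fields of a record and ignores
      whatever follows. The field order and sizes here are assumptions made for
      the example (the message only states that 'to_ifindex' follows 'err' and
      that map_id and map_index exist); the sample records are invented.

```python
import ctypes
import struct

XDP_REDIRECT = 4  # value of XDP_REDIRECT in enum xdp_action


class RedirectCommon(ctypes.Structure):
    """Assumed leading fields shared by both tracepoint records."""
    _fields_ = [
        ("prog_id", ctypes.c_int32),
        ("act", ctypes.c_uint32),
        ("ifindex", ctypes.c_int32),
        ("err", ctypes.c_int32),
        ("to_ifindex", ctypes.c_int32),
    ]


def count_events(records):
    """Count successful and failed redirects from raw records."""
    stats = {"success": 0, "error": 0}
    for raw in records:
        # Only the common prefix is decoded; trailing fields are ignored.
        event = RedirectCommon.from_buffer_copy(raw)
        stats["error" if event.err else "success"] += 1
    return stats


# Hypothetical records: one plain redirect, one map redirect that failed.
plain = struct.pack("=iIiii", 7, XDP_REDIRECT, 2, 0, 3)
via_map = struct.pack("=iIiiiIi", 7, XDP_REDIRECT, 2, -6, 0, 11, 5)

print(count_events([plain, via_map]))
```

      The same decoder handles both records because the longer one only adds
      fields after the shared prefix.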
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xdp: remove redundant argument to trace_xdp_redirect · c31e5a48
      Jesper Dangaard Brouer authored
      Supplying the action argument XDP_REDIRECT to the tracepoint xdp_redirect
      is redundant as it is only called in case this action was specified.
      
      Remove the argument, but keep "act" member of the tracepoint struct and
      populate it with XDP_REDIRECT.  This makes it easier to write a common bpf_prog
      processing events.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'rxrpc-next-20170829' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · d0fcece7
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Miscellany
      
      Here are a number of patches that make some changes/fixes and add a couple
      of extensions to AF_RXRPC for kernel services to use.  The changes and
      fixes are:
      
       (1) Use time64_t rather than u32 outside of protocol or
           UAPI-representative structures.
      
       (2) Use the correct time stamp when loading a key from an XDR-encoded
           Kerberos 5 key.
      
       (3) Fix IPv6 support.
      
       (4) Fix some places where the error code is being incorrectly made
           positive before returning.
      
       (5) Remove some white space.
      
      And the extensions:
      
       (6) Add an end-of-Tx phase notification, thereby allowing kAFS to
           transition the state on its own call record at the correct point,
           rather than having to do it in advance and risk non-completion of the
           call in the wrong state.
      
       (7) Allow a kernel client call to be retried if it fails on a network
           error, thereby making it possible for kAFS to iterate over a number of
           IP addresses without having to reload the Tx queue and re-encrypt data
           each time.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'addrlabel-no-rtnl-locking' · 3d86e352
      David S. Miller authored
      Florian Westphal says:
      
      ====================
      addrlabel: don't use rtnl locking
      
      addrlabel doesn't appear to require rtnl lock as the addrlabel
      table uses a spinlock to serialize add/delete operations.
      
      Also, entries are reference counted so it should be safe
      to call the rtnl ops without the rtnl mutex.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • addrlabel: add/delete/get can run without rtnl · a6f57028
      Florian Westphal authored
      There appears to be no need to use rtnl, addrlabel entries are refcounted
      and add/delete is serialized by the addrlabel table spinlock.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Florian Westphal's avatar
    • staging: irda: update MAINTAINERS · 6c766db6
      Greg Kroah-Hartman authored
      Now that the IRDA code has moved under drivers/staging/irda/, update the
      MAINTAINERS file with the new location.
      Reported-by: Joe Perches <joe@perches.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: add a dummy definition for bnxt_vf_rep_get_fid() · f143647a
      Sathya Perla authored
      When bnxt VF-reps are not compiled in (CONFIG_BNXT_SRIOV is off)
      bnxt_tc.c needs a dummy definition of the routine bnxt_vf_rep_get_fid().
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Fixes: 2ae7408f ("bnxt_en: bnxt: add TC flower filter offload support")
      Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Allow failed client calls to be retried · c038a58c
      David Howells authored
      Allow a client call that failed on network error to be retried, provided
      that the Tx queue still holds DATA packet 1.  This allows an operation to
      be submitted to another server or another address for the same server
      without having to repackage and re-encrypt the data so far processed.
      
      Two new functions are provided:
      
       (1) rxrpc_kernel_check_call() - This is used to find out the completion
           state of a call to guess whether it can be retried and whether it
           should be retried.
      
       (2) rxrpc_kernel_retry_call() - Disconnect the call from its current
           connection, reset the state and submit it as a new client call to a
           new address.  The new address need not match the previous address.
      
      A call may be retried even if all the data hasn't been loaded into it yet;
      a partially constructed call will be retained at the same point it was at when
      an error condition was detected.  msg_data_left() can be used to find out
      how much data was packaged before the error occurred.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Add notification of end-of-Tx phase · e833251a
      David Howells authored
      Add a callback to rxrpc_kernel_send_data() so that a kernel service can get
      a notification that the AF_RXRPC call has transitioned out of the Tx phase and
      is now waiting for a reply or a final ACK.
      
      This is called from AF_RXRPC with the call state lock held so the
      notification is guaranteed to come before any reply is passed back.
      
      Further, modify the AFS filesystem to make use of this so that we don't have
      to change the afs_call state before sending the last bit of data.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Remove some excess whitespace · 3ec0efde
      David Howells authored
      Remove indentation from some blank lines.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Don't negate call->error before returning it · bd2db2d2
      David Howells authored
      call->error is stored as 0 or a negative error code.  Don't negate this
      value (ie. make it positive) before returning it from a kernel function
      (though it should still be negated before passing to userspace through a
      control message).
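
      The sign convention described here can be modelled in a few lines of
      Python. This is only a sketch of the rule, not the kernel code; the two
      function names are hypothetical.

```python
import errno


def kernel_return_value(call_error):
    """In-kernel callers get the stored value unchanged (0 or negative)."""
    assert call_error <= 0
    return call_error


def control_message_value(call_error):
    """Userspace control messages carry the positive errno instead."""
    assert call_error <= 0
    return -call_error


stored = -errno.ENETUNREACH  # how the error is kept in call->error

print(kernel_return_value(stored))    # negative, as kernel functions expect
print(control_message_value(stored))  # positive, as userspace expects
```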
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix IPv6 support · 7b674e39
      David Howells authored
      Fix IPv6 support in AF_RXRPC in the following ways:
      
       (1) When extracting the address from a received IPv4 packet, if the local
           transport socket is open for IPv6 then fill out the sockaddr_rxrpc
           struct for an IPv4-mapped-to-IPv6 AF_INET6 transport address instead
           of an AF_INET one.
      
       (2) When sending CHALLENGE or RESPONSE packets, the transport length needs
           to be set from the sockaddr_rxrpc::transport_len field rather than
           sizeof() on the IPv4 transport address.
      
       (3) When processing an IPv4 ICMP packet received by an IPv6 socket, set up
           the address correctly before searching for the affected peer.
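
      For readers unfamiliar with the IPv4-mapped-to-IPv6 form mentioned in
      item (1), the Python sketch below builds one with the standard ipaddress
      module. It shows the address format only, not the AF_RXRPC code; the
      helper name and the sample address are chosen for illustration.

```python
import ipaddress

# An IPv4-mapped IPv6 address is 80 zero bits, 16 one bits, then the
# 32-bit IPv4 address (::ffff:a.b.c.d).
V4_MAPPED_PREFIX = b"\x00" * 10 + b"\xff\xff"


def to_v4_mapped(ipv4_text):
    """Return the IPv4-mapped IPv6 form of a dotted-quad IPv4 address."""
    v4 = ipaddress.IPv4Address(ipv4_text)
    return ipaddress.IPv6Address(V4_MAPPED_PREFIX + v4.packed)


mapped = to_v4_mapped("192.0.2.1")
print(mapped.packed.hex())  # 16 bytes: prefix followed by the IPv4 bytes
print(mapped.ipv4_mapped)   # the embedded IPv4 address
```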
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Use correct timestamp from Kerberos 5 ticket · 0a378585
      David Howells authored
      When an XDR-encoded Kerberos 5 ticket is added as an rxrpc-type key, the
      expiry time should be drawn from the k5 part of the token union (which was
      what was filled in), rather than the kad part of the union.
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • net: rxrpc: Replace time_t type with time64_t type · 10674a03
      Baolin Wang authored
      The 'expiry' variable of 'struct key_preparsed_payload' has been
      changed to the 'time64_t' type, which is year 2038 safe on 32-bit systems.

      In the net/rxrpc subsystem, we need to convert the 'u32' type to the
      'time64_t' type when copying the ticket expiry time to 'prep->expiry',
      so this patch introduces two helper functions to help convert 'u32' to
      'time64_t' type.
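
      To see why the wider type matters, the Python sketch below contrasts a
      signed 32-bit seconds counter with a 64-bit one at the 2038 boundary.
      It is a model of the arithmetic only; the helper names are hypothetical
      and are not the helpers this patch adds.

```python
import ctypes
from datetime import datetime, timezone


def u32_to_time64(seconds_u32):
    """Zero-extend an unsigned 32-bit seconds count to a 64-bit value."""
    if not 0 <= seconds_u32 <= 0xFFFFFFFF:
        raise ValueError("not a u32")
    return ctypes.c_int64(seconds_u32).value


def as_signed_32(seconds):
    """What a signed 32-bit time_t would hold for the same count."""
    return ctypes.c_int32(seconds).value


first_overflow = 2**31  # one second past the signed 32-bit maximum

print(as_signed_32(first_overflow))   # wraps to a negative value
print(u32_to_time64(first_overflow))  # stays positive in 64 bits
print(datetime.fromtimestamp(u32_to_time64(first_overflow), tz=timezone.utc))
```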
      
      This patch also uses ktime_get_real_seconds() to get current time instead
      of get_seconds(), which is not year 2038 safe on 32-bit systems.
      Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • hinic: don't build the module by default · c8488a8a
      Vitaly Kuznetsov authored
      We probably don't want to enable code supporting particular hardware by
      default e.g. when someone does 'make defconfig'. Other ethernet modules
      don't do it.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 28 Aug, 2017 18 commits