1. 12 May, 2017 10 commits
    • David S. Miller's avatar
      Merge branch 'qlcnic-fixes' · 34e934b9
      David S. Miller authored
      Manish Chopra says:
      
      ====================
      qlcnic: Bug fix and update version
      
      This series has one fix and bumps up driver version.
      Please consider applying to "net"
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      34e934b9
    • Chopra, Manish's avatar
      qlcnic: Update version to 5.3.66 · 33c16bfd
      Chopra, Manish authored
      Bump the version, as a couple of fixes were added after 5.3.65.
      Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33c16bfd
    • Chopra, Manish's avatar
      qlcnic: Fix link configuration with autoneg disabled · f9c3fe2f
      Chopra, Manish authored
      Currently the driver returns an error on speed configuration
      for the non-XGBE ports of 83xx adapters. As a result, the link
      does not come up on ports that use 1000Base-T as a connector with
      autoneg disabled. This patch fixes this by initializing the
      appropriate port type, based on the module/connector types
      queried from hardware, before any speed/autoneg configuration.
      Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9c3fe2f
    • Vitaly Kuznetsov's avatar
      xen-netfront: avoid crashing on resume after a failure in talk_to_netback() · d86b5672
      Vitaly Kuznetsov authored
      Unavoidable crashes in netfront_resume() and netback_changed() after a
      previous fail in talk_to_netback() (e.g. when we fail to read MAC from
      xenstore) were discovered. The failure path in talk_to_netback() does
      unregister/free for netdev but we don't reset drvdata and we try accessing
      it after resume.
      
      Fix the bug by removing the whole xen device completely with
      device_unregister(), this guarantees we won't have any calls into netfront
      after a failure.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d86b5672
    • Eric Dumazet's avatar
      net: sched: optimize class dumps · cb395b20
      Eric Dumazet authored
      In commit 59cc1f61 ("net: sched: convert qdisc linked list to
      hashtable") we missed the opportunity to considerably speed up
      tc_dump_tclass_root() if a qdisc handle is provided by user.
      
      Instead of iterating all the qdiscs, use qdisc_match_from_root()
      to directly get the one we look for.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb395b20
    • Yuchung Cheng's avatar
      tcp: avoid fragmenting peculiar skbs in SACK · b451e5d2
      Yuchung Cheng authored
      This patch fixes a bug in splitting an SKB during SACK
      processing. Specifically if an skb contains multiple
      packets and is only partially sacked in the higher sequences,
      tcp_match_sack_to_skb() splits the skb and marks the second fragment
      as SACKed.
      
      The current code further attempts rounding up the first fragment
      to MSS boundaries. But it misses a boundary condition when the
      rounded-up fragment size (pkt_len) is exactly skb size.  Splitting
      such an skb is pointless and causes a kernel warning and aborts
      the SACK processing. This patch universally checks such over-split
      before calling tcp_fragment to prevent these unnecessary warnings.
      
      Fixes: adb92db8 ("tcp: Make SACK code to split only at mss boundaries")
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b451e5d2
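      The over-split check described in b451e5d2 can be modelled in a few
      lines of userspace C. This is only a simplified sketch, not the kernel
      code: round_up_to_mss() and sack_split_offset() are hypothetical names,
      and the buffer is reduced to three integers (its length, the un-SACKed
      head length, and the MSS).

```c
#include <assert.h>

/* Round pkt_len up to the next multiple of mss. */
unsigned int round_up_to_mss(unsigned int pkt_len, unsigned int mss)
{
	assert(mss > 0);	/* a zero MSS would divide by zero */
	return ((pkt_len + mss - 1) / mss) * mss;
}

/*
 * Return the offset at which to split a buffer of skb_len bytes whose
 * first pkt_len bytes are not SACKed, or 0 when no split should happen.
 */
unsigned int sack_split_offset(unsigned int skb_len, unsigned int pkt_len,
			       unsigned int mss)
{
	unsigned int rounded = round_up_to_mss(pkt_len, mss);

	/* Over-split: the rounded-up head would cover the whole buffer. */
	if (rounded >= skb_len)
		return 0;
	return rounded;
}
```

      With skb_len = 3000 and mss = 1000, a head of 1500 bytes rounds up to
      2000 and is split there, while a head of 2500 bytes rounds up to 3000,
      which equals the buffer size, so the sketch declines to split.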
    • Eric Dumazet's avatar
      netem: fix skb_orphan_partial() · f6ba8d33
      Eric Dumazet authored
      I should have known that lowering skb->truesize was dangerous :/
      
      In case packets are not leaving the host via a standard Ethernet device,
      but looped back to local sockets, bad things can happen, as reported
      by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )
      
      So instead of tweaking skb->truesize, let's change skb->destructor
      and keep a reference on the owner socket via its sk_refcnt.
      
      Fixes: f2f872f9 ("netem: Introduce skb_orphan_partial() helper")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Michael Madsen <mkm@nabto.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6ba8d33
    • David S. Miller's avatar
      Merge branch 'generic-xdp-followups' · 4e3c60ed
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      Two generic xdp related follow-ups
      
      Two follow-ups for the generic XDP API, would be great if
      both could still be considered, since the XDP API is not
      frozen yet. For details please see individual patches.
      
      v1 -> v2:
        - Implemented feedback from Jakub Kicinski (reusing
          attribute on dump), thanks!
        - Rest as is.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4e3c60ed
    • Daniel Borkmann's avatar
      xdp: refine xdp api with regards to generic xdp · d67b9cd2
      Daniel Borkmann authored
      While working on the iproute2 generic XDP frontend, I noticed that
      as of right now it's possible to have native *and* generic XDP
      programs loaded both at the same time for the case when a driver
      supports native XDP.
      
      The intended model for generic XDP from b5cdae32 ("net: Generic
      XDP") is, however, that only one out of the two can be present at
      once which is also indicated as such in the XDP netlink dump part.
      The main rationale for generic XDP is to ease accessibility (in
      case a driver does not yet have XDP support) and to generically
      provide a semantical model as an example for driver developers
      wanting to add XDP support. The generic XDP option for an XDP
      aware driver can still be useful for comparing and testing both
      implementations.
      
      However, it is not intended to have a second XDP processing stage
      or layer with exactly the same functionality of the first native
      stage. Only reason could be to have a partial fallback for future
      XDP features that are not supported yet in the native implementation
      and we probably also shouldn't strive for such fallback and instead
      encourage native feature support in the first place. Given there's
      currently no such fallback issue or use case, let's not go there yet
      if we don't need to.
      
      Therefore, change semantics for loading XDP and bail out if the
      user tries to load a generic XDP program when a native one is
      present and vice versa. Another alternative to bailing out would
      be to handle the transition from one flavor to another gracefully,
      but that would require to bring the device down, exchange both
      types of programs, and bring it up again in order to avoid a tiny
      window where a packet could hit both hooks. Given this complicates
      the logic for just a debugging feature in the native case, I went
      with the simpler variant.
      
      For the dump, remove IFLA_XDP_FLAGS that was added with b5cdae32
      and reuse IFLA_XDP_ATTACHED for indicating the mode. Dumping all
      or just a subset of flags that were used for loading the XDP prog
      is suboptimal in the long run since not all flags are useful for
      dumping and if we start to reuse the same flag definitions for
      load and dump, then we'll waste bit space. What we really just
      want is to dump the mode for now.
      
      Current IFLA_XDP_ATTACHED semantics are: nothing was installed (0),
      a program is running at the native driver layer (1). Thus, add a
      mode that says that a program is running at generic XDP layer (2).
      Applications will handle this fine in that older binaries will
      just indicate that something is attached at XDP layer, effectively
      this is similar to IFLA_XDP_FLAGS attr that we would have had
      modulo the redundancy.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d67b9cd2
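      The semantics described in d67b9cd2 can be sketched as follows. The
      numeric values 0, 1 and 2 come from the commit message; the enum and
      function names are hypothetical and are not the kernel's identifiers.
      Allowing a program of the same flavor to be replaced is an assumption
      of this sketch, since the message only covers the mixed case.

```c
#include <stdbool.h>

/* Values reported through IFLA_XDP_ATTACHED, per the commit message. */
enum xdp_mode {
	MODE_NONE    = 0,	/* nothing installed */
	MODE_NATIVE  = 1,	/* program runs at the native driver layer */
	MODE_GENERIC = 2	/* program runs at the generic XDP layer */
};

/* Bail out when the other flavor is already present. */
bool xdp_may_attach(enum xdp_mode attached, enum xdp_mode requested)
{
	if (requested == MODE_NONE)
		return true;	/* assumption: detaching is always allowed */
	return attached == MODE_NONE || attached == requested;
}

/* An older binary only knows "zero" versus "non-zero". */
bool legacy_is_attached(unsigned char attr_value)
{
	return attr_value != 0;
}
```

      An older binary that reads the value 2 therefore still reports that
      something is attached, which is the compatibility property the commit
      message relies on.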
    • Daniel Borkmann's avatar
      xdp: add flag to enforce driver mode · 0489df9a
      Daniel Borkmann authored
      After commit b5cdae32 ("net: Generic XDP") we automatically fall
      back to a generic XDP variant if the driver does not support native
      XDP. Allow for an option where the user can specify that always the
      native XDP variant should be selected and in case it's not supported
      by a driver, just bail out.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0489df9a
  2. 11 May, 2017 19 commits
    • David S. Miller's avatar
      bpf: Provide a linux/types.h override for bpf selftests. · 0a5539f6
      David S. Miller authored
      We do not want to use the architecture's types.h header when
      building BPF programs, which are always 64-bit.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a5539f6
    • David S. Miller's avatar
      Merge branch 'bpf-pkt-ptr-align' · 228b0324
      David S. Miller authored
      David S. Miller says:
      
      ====================
      bpf: Add alignment tracker to verifier.
      
      First we add the alignment tracking logic to the verifier.
      
      Next, we work on building up infrastructure to facilitate regression
      testing of this facility.
      
      Finally, we add the "test_align" test case.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      228b0324
    • David S. Miller's avatar
    • David S. Miller's avatar
      bpf: Add bpf_verify_program() to the library. · 91045f5e
      David S. Miller authored
      This allows a test case to load a BPF program and unconditionally
      acquire the verifier log.
      
      It also allows specification of the strict alignment flag.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      91045f5e
    • David S. Miller's avatar
      bpf: Add strict alignment flag for BPF_PROG_LOAD. · e07b98d9
      David S. Miller authored
      Add a new field, "prog_flags", and an initial flag value
      BPF_F_STRICT_ALIGNMENT.
      
      When set, the verifier will enforce strict pointer alignment
      regardless of the setting of CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
      
      The verifier, in this mode, will also use a fixed value of "2" in
      place of NET_IP_ALIGN.
      
      This facilitates test cases that will exercise and validate this part
      of the verifier even when run on architectures where alignment doesn't
      matter.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      e07b98d9
    • David S. Miller's avatar
      bpf: Do per-instruction state dumping in verifier when log_level > 1. · c5fc9692
      David S. Miller authored
      If log_level > 1, do a state dump every instruction and emit it in
      a more compact way (without a leading newline).
      
      This will facilitate more sophisticated test cases which inspect the
      verifier log for register state.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      c5fc9692
    • David S. Miller's avatar
      bpf: Track alignment of register values in the verifier. · d1174416
      David S. Miller authored
      Currently if we add only constant values to pointers we can fully
      validate the alignment, and properly check if we need to reject the
      program on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.
      
      However, once an unknown value is introduced we only allow byte sized
      memory accesses which is too restrictive.
      
      Add logic to track the known minimum alignment of register values,
      and propagate this state into registers containing pointers.
      
      The most common paradigm that makes use of this new logic is computing
      the transport header using the IP header length field.  For example:
      
      	struct ethhdr *ep = skb->data;
      	struct iphdr *iph = (struct iphdr *) (ep + 1);
      	struct tcphdr *th;
       ...
      	n = iph->ihl;
      	th = ((void *)iph + (n * 4));
      	port = th->dest;
      
      The existing code will reject the load of th->dest because it cannot
      validate that the alignment is at least 2 once "n * 4" is added to
      the packet pointer.
      
      In the new code, the register holding "n * 4" will have a reg->min_align
      value of 4, because any value multiplied by 4 will be at least 4 byte
      aligned.  (actually, the eBPF code emitted by the compiler in this case
      is most likely to use a shift left by 2, but the end result is identical)
      
      At the critical addition:
      
      	th = ((void *)iph + (n * 4));
      
      The register holding 'th' will start with reg->off value of 14.  The
      pointer addition will transform that reg into something that looks like:
      
      	reg->aux_off = 14
      	reg->aux_off_align = 4
      
      Next, the verifier will look at the th->dest load, and it will see
      a load offset of 2, and first check:
      
      	if (reg->aux_off_align % size)
      
      which will pass because aux_off_align is 4.  reg_off will be computed:
      
      	reg_off = reg->off;
       ...
      		reg_off += reg->aux_off;
      
      plus we have off==2, and it will thus check:
      
      	if ((NET_IP_ALIGN + reg_off + off) % size != 0)
      
      which evaluates to:
      
      	if ((NET_IP_ALIGN + 14 + 2) % size != 0)
      
      On strict alignment architectures, NET_IP_ALIGN is 2, thus:
      
      	if ((2 + 14 + 2) % size != 0)
      
      which passes.
      
      These pointer transformations and checks work regardless of whether
      the constant offset or the variable with known alignment is added
      first to the pointer register.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      d1174416
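      The two checks walked through in d1174416 can be reproduced as plain
      arithmetic. The function below is a hypothetical userspace sketch that
      takes the values named in the commit message as parameters; it is not
      the verifier's code, and pkt_access_aligned() is an invented name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * ip_align      stands in for NET_IP_ALIGN
 * reg_off       constant offset already folded into the register
 * aux_off       constant offset saved when a variable was added
 * aux_off_align known minimum alignment of that variable part
 * off           offset of the load itself
 * size          width of the load in bytes
 */
bool pkt_access_aligned(int ip_align, int reg_off, int aux_off,
			int aux_off_align, int off, int size)
{
	assert(size > 0);

	/* The variable part must be aligned to at least the access size. */
	if (aux_off_align % size)
		return false;

	reg_off += aux_off;
	return (ip_align + reg_off + off) % size == 0;
}
```

      For the th->dest example, pkt_access_aligned(2, 0, 14, 4, 2, 2)
      evaluates (2 + 14 + 2) % 2, which is 0, so the 2-byte load is accepted.
      A 4-byte load at the same offset would be rejected because 18 % 4 is 2.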
    • Daniel Borkmann's avatar
      bpf, arm64: fix faulty emission of map access in tail calls · d8b54110
      Daniel Borkmann authored
      Shubham was recently asking on netdev why in arm64 JIT we don't multiply
      the index for accessing the tail call map by 8. That led me into testing
      out arm64 JIT wrt tail calls and it turned out I got a NULL pointer
      dereference on the tail call.
      
      The buggy access is at:
      
        prog = array->ptrs[index];
        if (prog == NULL)
            goto out;
      
        [...]
        00000060:  d2800e0a  mov x10, #0x70 // #112
        00000064:  f86a682a  ldr x10, [x1,x10]
        00000068:  f862694b  ldr x11, [x10,x2]
        0000006c:  b40000ab  cbz x11, 0x00000080
        [...]
      
      The code triggering the crash is f862694b. x1 at the time contains the
      address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning,
      above we load the pointer to the program at map slot 0 into x10. x10
      can then be NULL if the slot is not occupied, which we later on try to
      access with a user given offset in x2 that is the map index.
      
      Fix this by emitting the following instead:
      
        [...]
        00000060:  d2800e0a  mov x10, #0x70 // #112
        00000064:  8b0a002a  add x10, x1, x10
        00000068:  d37df04b  lsl x11, x2, #3
        0000006c:  f86b694b  ldr x11, [x10,x11]
        00000070:  b40000ab  cbz x11, 0x00000084
        [...]
      
      This basically adds the offset to ptrs to the base address of the bpf
      array we got and we later on access the map with an index * 8 offset
      relative to that. The tail call map itself is basically one large area
      with meta data at the head followed by the array of prog pointers.
      This makes tail calls working again, tested on Cavium ThunderX ARMv8.
      
      Fixes: ddb55992 ("arm64: bpf: implement bpf_tail_call() helper")
      Reported-by: Shubham Bansal <illusionist.neo@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d8b54110
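      The address computation that the fixed instruction sequence in d8b54110
      performs can be written out in C. The struct below is a hypothetical
      stand-in for the tail call map (metadata at the head, then the array of
      program pointers); its layout and the name slot_lookup() are assumptions
      made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

struct prog_array {
	char  meta[112];	/* metadata at the head (0x70 bytes, as in the dump) */
	void *ptrs[4];		/* array of program pointers */
};

/* The caller is assumed to have bounds-checked index already. */
void *slot_lookup(struct prog_array *array, uint64_t index)
{
	/* add x10, x1, x10: base of the array plus the offset of ptrs */
	char *slots = (char *)array + offsetof(struct prog_array, ptrs);

	/* lsl x11, x2, #3: scale the index by the pointer size */
	size_t byte_off = (size_t)index * sizeof(void *);

	/* ldr x11, [x10, x11]: load the slot, which may be NULL */
	return *(void **)(slots + byte_off);
}
```

      The buggy sequence instead loaded slot 0 first and then used the
      unscaled index as an offset from that loaded pointer, which is why an
      empty slot 0 led to a NULL pointer dereference.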
    • Ivan Khoronzhuk's avatar
      net: ethernet: ti: netcp_core: return error while dma channel open issue · 5b6cb43b
      Ivan Khoronzhuk authored
      Fix the error path taken when opening a DMA channel fails. Also, there
      is no need to check the result for NULL, since NULL is never returned.
      Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b6cb43b
    • David S. Miller's avatar
      Merge branch 's390-net-fixes' · dc319c4b
      David S. Miller authored
      Julian Wiedmann says:
      
      ====================
      s390/net fixes
      
      some qeth fixes for -net, the OSM/OSN one being the most crucial.
      Please also queue these up for stable.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc319c4b
    • Ursula Braun's avatar
      s390/qeth: add missing hash table initializations · ebccc739
      Ursula Braun authored
      Commit 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      added new hash tables, but failed to initialize them.
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebccc739
    • Julian Wiedmann's avatar
      s390/qeth: avoid null pointer dereference on OSN · 25e2c341
      Julian Wiedmann authored
      Access card->dev only after checking whether it's valid.
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      25e2c341
    • Julian Wiedmann's avatar
      s390/qeth: unbreak OSM and OSN support · 2d2ebb3e
      Julian Wiedmann authored
      commit b4d72c08 ("qeth: bridgeport support - basic control")
      broke the support for OSM and OSN devices as follows:
      
      As OSM and OSN are L2 only, qeth_core_probe_device() does an early
      setup by loading the l2 discipline and calling qeth_l2_probe_device().
      In this context, adding the l2-specific bridgeport sysfs attributes
      via qeth_l2_create_device_attributes() hits a BUG_ON in fs/sysfs/group.c,
      since the basic sysfs infrastructure for the device hasn't been
      established yet.
      
      Note that OSN actually has its own unique sysfs attributes
      (qeth_osn_devtype), so the additional attributes shouldn't be created
      at all.
      For OSM, add a new qeth_l2_devtype that contains all the common
      and l2-specific sysfs attributes.
      When qeth_core_probe_device() does early setup for OSM or OSN, assign
      the corresponding devtype so that the ccwgroup probe code creates the
      full set of sysfs attributes.
      This allows us to skip qeth_l2_create_device_attributes() in case
      of an early setup.
      
      Any device that can't do early setup will initially have only the
      generic sysfs attributes, and when it's probed later
      qeth_l2_probe_device() adds the l2-specific attributes.
      
      If an early-setup device is removed (by calling ccwgroup_ungroup()),
      device_unregister() will - using the devtype - delete the
      l2-specific attributes before qeth_l2_remove_device() is called.
      So make sure to not remove them twice.
      
      What complicates the issue is that qeth_l2_probe_device() and
      qeth_l2_remove_device() are also called on a device when its
      layer2 attribute changes (i.e. its layer mode is switched).
      For early-setup devices this wouldn't work properly - we wouldn't
      remove the l2-specific attributes when switching to L3.
      But switching the layer mode doesn't actually make any sense;
      we already decided that the device can only operate in L2!
      So just refuse to switch the layer mode on such devices. Note that
      OSN doesn't have a layer2 attribute, so we only need to special-case
      OSM.
      
      Based on an initial patch by Ursula Braun.
      
      Fixes: b4d72c08 ("qeth: bridgeport support - basic control")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2d2ebb3e
    • Ursula Braun's avatar
      s390/qeth: handle sysfs error during initialization · 9111e788
      Ursula Braun authored
      When setting up the device from within the layer discipline's
      probe routine, creating the layer-specific sysfs attributes can fail.
      Report this error back to the caller, and handle it by
      releasing the layer discipline.
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      [jwi: updated commit msg, moved an OSN change to a subsequent patch]
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9111e788
    • Jon Mason's avatar
      mdio: mux: Correct mdio_mux_init error path issues · b6016166
      Jon Mason authored
      There is a potential unnecessary refcount decrement on error path of
      put_device(&pb->mii_bus->dev), as it is possible to avoid the
      of_mdio_find_bus() call if mux_bus is specified by the calling function.
      
      The same put_device() is not called in the error path if the
      devm_kzalloc of pb fails.  This caused the variable used in the
      put_device() to be changed, as the pb pointer was obviously not set up.
      
      There is an unnecessary of_node_get() on child_bus_node if the
      of_mdiobus_register() is successful, as the
      for_each_available_child_of_node() automatically increments this.
      Thus the refcount on this node will always be +1 more than it should be.
      
      There is no of_node_put() on child_bus_node if the of_mdiobus_register()
      call fails.
      
      Finally, it is lacking devm_kfree() of pb in the error path.  While this
      might not be technically necessary, it was present in other parts of the
      function.  So, I am adding it where necessary to make it uniform.
      Signed-off-by: Jon Mason <jon.mason@broadcom.com>
      Fixes: f20e6657 ("mdio: mux: Enhanced MDIO mux framework for integrated multiplexers")
      Fixes: 0ca2997d ("netdev/of/phy: Add MDIO bus multiplexer support.")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6016166
    • WANG Cong's avatar
      ipv6/dccp: do not inherit ipv6_mc_list from parent · 83eaddab
      WANG Cong authored
      Like commit 657831ff ("dccp/tcp: do not inherit mc_list from parent")
      we should clear ipv6_mc_list etc. for IPv6 sockets too.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83eaddab
    • Colin Ian King's avatar
      netxen_nic: set rcode to the return status from the call to netxen_issue_cmd · 0fe20faf
      Colin Ian King authored
      Currently rcode is being initialized to NX_RCODE_SUCCESS and later it
      is checked to see if it is not NX_RCODE_SUCCESS which is never true. It
      appears that there is an unintentional missing assignment of rcode from
      the return of the call to netxen_issue_cmd() that was dropped in
      an earlier fix, so add it in.
      
      Detected by CoverityScan, CID#401900 ("Logically dead code")
      
      Fixes: 2dcd5d95 ("netxen_nic: fix cdrp race condition")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0fe20faf
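      The "logically dead code" pattern that 0fe20faf describes can be shown
      with a small stand-alone sketch. All names here are hypothetical:
      issue_cmd_stub() merely stands in for a command that can fail, and
      RCODE_SUCCESS for the driver's success code.

```c
#define RCODE_SUCCESS 0

/* Stand-in for a command call that reports failure through its return value. */
int issue_cmd_stub(int will_fail)
{
	return will_fail ? 1 : RCODE_SUCCESS;
}

/* Before: the return value is dropped, so the error branch can never run. */
int run_cmd_buggy(int will_fail)
{
	int rcode = RCODE_SUCCESS;

	issue_cmd_stub(will_fail);
	if (rcode != RCODE_SUCCESS)
		return -1;	/* dead code */
	return 0;
}

/* After: rcode carries the call's status, so failures are reported. */
int run_cmd_fixed(int will_fail)
{
	int rcode = issue_cmd_stub(will_fail);

	if (rcode != RCODE_SUCCESS)
		return -1;
	return 0;
}
```

      In this sketch run_cmd_buggy() reports success even when the command
      fails, while run_cmd_fixed() propagates the failure.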
    • Stefan Wahren's avatar
      net: qca_spi: Fix alignment issues in rx path · 8d66c30b
      Stefan Wahren authored
      The qca_spi driver causes alignment issues on ARM devices.
      So fix this by using netdev_alloc_skb_ip_align().
      Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
      Fixes: 291ab06e ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d66c30b
    • Gao Feng's avatar
      driver: vrf: Fix one possible use-after-free issue · 1a4a5bf5
      Gao Feng authored
      The current code only handles the case where the skb is dropped. It
      can hit a use-after-free when NF_HOOK returns 0, which means that
      the skb was stolen by a netfilter rule or hook.

      When a netfilter rule or hook steals the skb and returns NF_STOLEN,
      the skb is owned by that rule, and other modules must not touch it
      again. The skb may have been queued or freed directly by the rule.

      Now use nf_hook instead of NF_HOOK to get the netfilter result, and
      check its return value. Only a return value of 1 means that the skb
      may proceed; otherwise reset the skb to NULL.

      BTW, because vrf_rcv_finish is an empty function, there is no need
      to invoke it even when nf_hook returns 1. But vrf_rcv_finish must be
      modified to deal with the NF_STOLEN case.

      There are two cases when the skb is stolen.
      1. The skb is stolen and freed directly.
         There is nothing we need to do, and vrf_rcv_finish isn't invoked.
      2. The skb is queued and reinjected again.
         vrf_rcv_finish is then invoked as okfn, so it needs to free the
         skb.
      Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a4a5bf5
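      The return-value handling described in 1a4a5bf5 reduces to one rule:
      keep the packet only when the hook result is 1. The sketch below models
      that rule in userspace C; struct packet and continue_after_hook() are
      hypothetical stand-ins, and the hook result is passed in as a plain
      integer rather than obtained from netfilter.

```c
#include <stddef.h>

struct packet {		/* hypothetical stand-in for struct sk_buff */
	int id;
};

/*
 * hook_result models the value returned by the hook:
 *   1         the packet may go ahead
 *   otherwise the packet is no longer ours and must not be touched
 */
struct packet *continue_after_hook(struct packet *pkt, int hook_result)
{
	if (hook_result == 1)
		return pkt;
	return NULL;
}
```

      A caller that receives NULL simply stops processing, which avoids
      touching a packet that a rule has already queued or freed.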
  3. 09 May, 2017 11 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide · 56868a46
      Linus Torvalds authored
      Pull IDE updates from David Miller:
       "Two small cleanups in the IDE layer"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
        ide: don't call memcpy with the same source and destination
        ide: use setup_timer
      56868a46
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 7fc22f45
      Linus Torvalds authored
      Pull sparc updates from David Miller:
       "sparc changes, including a bug fix for handling exceptions during
        bzero on some sparc64 cpus"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: fix fault handling in NGbzero.S and GENbzero.S
        sparc: use memdup_user_nul in sun4m LED driver
        sparc: Remove redundant tests in boot_flags_init().
      7fc22f45
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 50fb55d8
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix multiqueue in stmmac driver on PCI, from Andy Shevchenko.
      
       2) cdc_ncm doesn't actually fully zero out the padding area it
          allocates on TX, from Jim Baxter.
      
       3) Don't leak map addresses in BPF verifier, from Daniel Borkmann.
      
       4) If we randomize TCP timestamps, we have to do it everywhere
          including SYN cookies. From Eric Dumazet.
      
       5) Fix "ethtool -S" crash in aquantia driver, from Pavel Belous.
      
       6) Fix allocation size for ntp filter bitmap in bnxt_en driver, from
          Dan Carpenter.
      
       7) Add missing memory allocation return value check to DSA loop driver,
          from Christophe Jaillet.
      
       8) Fix XDP leak on driver unload in qed driver, from Suddarsana Reddy
          Kalluru.
      
       9) Don't inherit MC list from parent inet connection sockets, another
          syzkaller spotted gem. Fix from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
        dccp/tcp: do not inherit mc_list from parent
        qede: Split PF/VF ndos.
        qed: Correct doorbell configuration for !4Kb pages
        qed: Tell QM the number of tasks
        qed: Fix VF removal sequence
        qede: Fix XDP memory leak on unload
        net/mlx4_core: Reduce harmless SRIOV error message to debug level
        net/mlx4_en: Avoid adding steering rules with invalid ring
        net/mlx4_en: Change the error print to debug print
        drivers: net: wimax: i2400m: i2400m-usb: Use time_after for time comparison
        DECnet: Use container_of() for embedded struct
        Revert "ipv4: restore rt->fi for reference counting"
        net: mdio-mux: bcm-iproc: call mdiobus_free() in error path
        net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
        ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
        net: cdc_ncm: Fix TX zero padding
        stmmac: pci: split out common_default_data() helper
        stmmac: pci: RX queue routing configuration
        stmmac: pci: TX and RX queue priority configuration
        stmmac: pci: set default number of rx and tx queues
        ...
      50fb55d8
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-4.12-rc1' of git://git.infradead.org/users/vkoul/slave-dma · 4879b7ae
      Linus Torvalds authored
      Pull dmaengine updates from Vinod Koul:
       "This time again a smaller update consisting of:
      
         - support for TI DA8xx dma controller and updates to the cppi driver
      
         - updates on bunch of drivers like xilinx, pl08x, stm32-dma, mv_xor,
           ioat, dmatest"
      
      * tag 'dmaengine-4.12-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (35 commits)
        dmaengine: pl08x: remove lock documentation
        dmaengine: pl08x: fix pl08x_dma_chan_state documentation
        dmaengine: pl08x: Use the BIT() macro consistently
        dmaengine: pl080: Fix some missing kerneldoc
        dmaengine: pl080: Cut some unused defines
        dmaengine: dmatest: Add check for supported buffer count (sg_buffers)
        dmaengine: dmatest: Select DMA_ENGINE_RAID as its needed for the slave_sg test
        dmaengine: virt-dma: Convert to use list_for_each_entry_safe()
        dma-debug: use offset_in_page() macro
        dmaengine: mv_xor: use offset_in_page() macro
        dmaengine: dmatest: use offset_in_page() macro
        dmaengine: sun4i: fix invalid argument
        dmaengine: ioat: use setup_timer
        dmaengine: cppi41: Fix an Oops happening in cppi41_dma_probe()
        dmaengine: pl330: remove pdata based initialization
        dmaengine: cppi: fix build error due to bad variable
        dmaengine: imx-sdma: add 1ms delay to ensure SDMA channel is stopped
        dmaengine: cppi41: use managed functions devm_*()
        dmaengine: cppi41: fix cppi41_dma_tx_status() logic
        dmaengine: qcom_hidma: pause the channel on shutdown
        ...
      4879b7ae
    • Merge tag 'pwm/for-4.12-rc1' of... · ecc721a7
      Linus Torvalds authored
      Merge tag 'pwm/for-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "Adds a new driver for the PWM controller found on MediaTek SoCs and
        extends support for the Atmel PWM controller to include the SAMA5D2.
      
        Some existing drivers have been migrated to the atomic API and a few
        others see miscellaneous improvements"
      
      * tag 'pwm/for-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        pwm: tegra: Read PWM clock source rate in driver init
        pwm: pca9685: Fix GPIO-only operation
        pwm: mediatek: Don't explicitly set .owner
        pwm: tegra: Avoid potential overflow for short periods
        pwm: tegra: Add support to configure pin state in suspends/resume
        pwm: tegra: Add DT binding details to configure pin in suspends/resume
        pwm: tegra: Increase precision in PWM rate calculation
        pwm: tegra: Use DIV_ROUND_CLOSEST_ULL() instead of local implementation
        pwm: Add MediaTek PWM support
        dt-bindings: pwm: Add MediaTek PWM bindings
        pwm: atmel: Enable PWM on sama5d2
        pwm: atmel: Switch to atomic PWM
        pwm: atmel-hlcdc: Implement the suspend/resume hooks
        pwm: atmel-hlcdc: Convert to the atomic PWM API
      ecc721a7
    • Merge tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 28b47809
      Linus Torvalds authored
      Pull IOMMU updates from Joerg Roedel:
      
       - code optimizations for the Intel VT-d driver
      
       - ability to switch off a previously enabled Intel IOMMU
      
       - support for 'struct iommu_device' for OMAP, Rockchip and Mediatek
         IOMMUs
      
       - header optimizations for IOMMU core code headers and a few fixes that
         became necessary in other parts of the kernel because of that
      
       - ACPI/IORT updates and fixes
      
       - Exynos IOMMU optimizations
      
       - updates to the IOMMU dma-api code to bring it closer to using
         per-cpu iova caches
      
       - new command-line option to set default domain type allocated by the
         iommu core code
      
       - another command-line option to allow the Intel IOMMU to be switched
         off in a tboot environment
      
       - ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using an
         IDENTITY domain in conjunction with DMA ops, Support for SMR masking,
         Support for 16-bit ASIDs (was previously broken)
      
       - various other small fixes and improvements
      
      * tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (63 commits)
        soc/qbman: Move dma-mapping.h include to qman_priv.h
        soc/qbman: Fix implicit header dependency now causing build fails
        iommu: Remove trace-events include from iommu.h
        iommu: Remove pci.h include from trace/events/iommu.h
        arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
        ACPI/IORT: Fix CONFIG_IOMMU_API dependency
        iommu/vt-d: Don't print the failure message when booting non-kdump kernel
        iommu: Move report_iommu_fault() to iommu.c
        iommu: Include device.h in iommu.h
        x86, iommu/vt-d: Add an option to disable Intel IOMMU force on
        iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed
        iommu/arm-smmu: Correct sid to mask
        iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
        iommu: Make iommu_bus_notifier return NOTIFY_DONE rather than error code
        omap3isp: Remove iommu_group related code
        iommu/omap: Add iommu-group support
        iommu/omap: Make use of 'struct iommu_device'
        iommu/omap: Store iommu_dev pointer in arch_data
        iommu/omap: Move data structures to omap-iommu.h
        iommu/omap: Drop legacy-style device support
        ...
      28b47809
    • dccp/tcp: do not inherit mc_list from parent · 657831ff
      Eric Dumazet authored
      syzkaller found a way to trigger double frees from ip_mc_drop_socket().
      
      It turns out that we leave a copy of the parent's mc_list in the child
      at accept() time, which is very bad.
      
      Very similar to commit 8b485ce6 ("tcp: do not inherit
      fastopen_req from parent")
      
      Initial report from Pray3r, completed by a report from Andrey.
      Thanks a lot to them!
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Pray3r <pray3r.z@gmail.com>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Tested-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      657831ff
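The inheritance problem described in the mc_list commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct sock_sketch`, `struct mc_entry`, `clone_for_accept()` and `drop_memberships()` are hypothetical stand-ins rather than kernel code, and clearing the pointer in the child is assumed to be the shape of the fix, since the commit message does not show the diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct mc_entry {
    struct mc_entry *next;
    unsigned int group;
};

struct sock_sketch {
    int id;
    struct mc_entry *mc_list;   /* multicast memberships owned by this socket */
};

/*
 * Create a child socket from a listener, as accept() does. A plain struct
 * copy would leave parent and child pointing at the same list, so the
 * pointer is cleared in the child (assumed shape of the fix).
 */
static struct sock_sketch *clone_for_accept(const struct sock_sketch *parent,
                                            int id)
{
    struct sock_sketch *child = malloc(sizeof(*child));

    if (!child)
        return NULL;
    memcpy(child, parent, sizeof(*child));
    child->id = id;
    child->mc_list = NULL;      /* do not inherit the parent's memberships */
    return child;
}

/*
 * Free every membership of a socket and return how many were freed.
 * This is only safe when each list has exactly one owner; two sockets
 * sharing one list would free the same entries twice.
 */
static int drop_memberships(struct sock_sketch *sk)
{
    int freed = 0;

    while (sk->mc_list) {
        struct mc_entry *e = sk->mc_list;

        sk->mc_list = e->next;
        free(e);
        freed++;
    }
    return freed;
}
```

Without the `child->mc_list = NULL` line, closing both sockets would walk the same list twice, which is the kind of double free the message attributes to ip_mc_drop_socket().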
    • sparc64: fix fault handling in NGbzero.S and GENbzero.S · 3c7f6221
      Dave Aldridge authored
      When any of the functions contained in NGbzero.S and GENbzero.S
      vector through *bzero_from_clear_user, we may end up taking a
      fault when executing one of the store alternate address space
      instructions. If this happens, the exception handler does not
      restore the %asi register.
      
      This commit fixes the issue by introducing a new exception
      handler that ensures the %asi register is restored when
      a fault is handled.
      
      Orabug: 25577560
      Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
      Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
      Reviewed-by: Babu Moger <babu.moger@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c7f6221
    • sparc: use memdup_user_nul in sun4m LED driver · aed74ea0
      Geliang Tang authored
      Use memdup_user_nul() helper instead of open-coding to simplify the code.
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aed74ea0
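For readers unfamiliar with the helper named in the sun4m LED commit above, the pattern that memdup_user_nul() replaces is "allocate len + 1 bytes, copy, append a terminating NUL". The sketch below is a userspace analogue only: `dup_nul()` is a hypothetical name, it uses malloc() and memcpy() where the kernel helper allocates kernel memory and copies from user space, and it returns NULL on failure where the kernel helper returns an ERR_PTR() value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace analogue of the open-coded pattern (hypothetical helper):
 * duplicate len bytes from src into a fresh buffer and NUL-terminate it,
 * so the result can be handled as a C string even if src was not one.
 */
static char *dup_nul(const void *src, size_t len)
{
    char *p = malloc(len + 1);  /* one extra byte for the terminator */

    if (!p)
        return NULL;
    memcpy(p, src, len);
    p[len] = '\0';
    return p;
}
```

A caller passes the raw buffer and its length, then parses the returned string and frees it; folding the three steps into one helper is what lets a driver drop its own allocation and termination code.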
    • Merge tag 'arc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 4a1e31c6
      Linus Torvalds authored
      Pull ARC updates from Vineet Gupta:
      
       - AXS10x platform clk updates for I2S, PGU
      
       - add region based cache flush operation for ARCv2 cores
      
       - enforce PAE40 dependency on HIGHMEM
      
       - ptrace support for additional regs in ARCv2 cores
      
       - fix build failure in linux-next due to a header include ordering
         change
      
      * tag 'arc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        Revert "ARCv2: Allow enabling PAE40 w/o HIGHMEM"
        ARC: mm: fix build failure in linux-next for UP builds
        ARCv2: ptrace: provide regset for accumulator/r30 regs
        elf: Add ARCv2 specific core note section
        ARCv2: mm: micro-optimize region flush generated code
        ARCv2: mm: Merge 2 updates to DC_CTRL for region flush
        ARCv2: mm: Implement cache region flush operations
        ARC: mm: Move full_page computation into cache version agnostic wrapper
        arc: axs10x: Fix ARC PGU default clock frequency
        arc: axs10x: Add DT bindings for I2S audio playback
      4a1e31c6
    • Merge tag 'armsoc-dt64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · c6778ff8
      Linus Torvalds authored
      Pull ARM 64-bit DT updates from Olof Johansson:
       "Device-tree updates for arm64 platforms. Just as with 32-bit, a bunch
        of smaller changes, but also some new platforms that are worth
        mentioning:
      
         - Rockchip RK3399 platforms for Chromebooks, including Samsung
           Chromebook Plus (Kevin)
      
         - Orange Pi PC2 (Allwinner H5)
      
         - Freescale LS2088A and LS1088A SoCs
      
         - Expanded support for Nvidia Tegra186 (and Jetson TX2)"
      
      * tag 'armsoc-dt64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (180 commits)
        arm64: dts: Add basic DT to support Spreadtrum's SP9860G
        arm64: dts: exynos: Use - instead of @ for DT OPP entries
        arm64: dts: exynos: Add support for s6e3hf2 panel device on TM2e board
        arm64: dts: juno: add information about L1 and L2 caches
        arm64: dts: juno: fix few unit address format warnings
        arm64: marvell: dts: enable the crypto engine on the Armada 8040 DB
        arm64: marvell: dts: enable the crypto engine on the Armada 7040 DB
        arm64: marvell: dts: add crypto engine description for 7k/8k
        arm64: dts: marvell: add sdhci support for Armada 7K/8K
        arm64: dts: marvell: add eMMC support for Armada 37xx
        arm64: dts: hisi: add pinctrl dtsi file for HiKey960 development board
        arm64: dts: hisi: add drive strength levels of the pins for Hi3660 SoC
        arm64: dts: hisi: enable the NIC and SAS for the hip07-d05 board
        arm64: dts: hisi: add SAS nodes for the hip07 SoC
        arm64: dts: hisi: add RoCE nodes for the hip07 SoC
        arm64: dts: hisi: add network related nodes for the hip07 SoC
        arm64: dts: hisi: add mbigen nodes for the hip07 SoC
        arm64: dts: rockchip: fix the memory size of PX5 Evaluation board
        arm64: dts: hisilicon: add dts files for hi3798cv200-poplar board
        dt-bindings: arm: hisilicon: add bindings for hi3798cv200 SoC and Poplar board
        ...
      c6778ff8