1. 19 Aug, 2016 40 commits
    • atm: fore200e: Do not drop const qualifier · b65b24d4
      LABBE Corentin authored
      The data member of struct firmware is const, and this constness is
      dropped by some casts.
      This patch adds const qualifiers to keep the const information.
      Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-next' · f1c89c03
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      BPF helper improvements and cleanups
      
      This set adds various improvements to BPF helpers, a cleanup to use
      skb_pkt_type_ok() helper, addition of bpf_skb_change_tail(), a follow
      up for event output helper and removing ifdefs around the cgroupv2
      helper bits. For details please see individual patches.
      
      The set is based against net-next tree, but requires a merge of net
      into net-next first.
      
      Thanks a lot!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: get rid of cgroup helper related ifdefs · 54fd9c2d
      Daniel Borkmann authored
      As recently discussed during the task_under_cgroup_hierarchy() addition,
      we should get rid of the ifdefs surrounding the bpf_skb_under_cgroup()
      helper. If related functionality is not built-in, the helper cannot be
      used anyway, which is also in line with what we do for all other helpers.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: enable event output helper also for xdp types · 4de16969
      Daniel Borkmann authored
      Follow-up to 555c8a86 ("bpf: avoid stack copy and use skb ctx for
      event output") for also adding the event output helper for XDP typed
      programs. The event output helper has been very useful in particular for
      debugging or event notification purposes, since it's much faster and
      more flexible than regular trace printk because metadata can be attached
      programmatically. The same flags structure applies as with tc BPF programs.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: add bpf_skb_change_tail helper · 5293efe6
      Daniel Borkmann authored
      This work adds a bpf_skb_change_tail() helper for tc BPF programs. The
      basic idea is to expand or shrink the skb in a controlled manner. The
      eBPF program can then rewrite the rest via helpers like bpf_skb_store_bytes(),
      bpf_lX_csum_replace() and others rather than passing a raw buffer for
      writing here.
      
      bpf_skb_change_tail() is really a slow path helper and intended for
      replies with e.g. ICMP control messages. The concept is similar to other
      helpers like bpf_skb_change_proto(): keep the helper free of protocol
      specifics and let the BPF program mangle the remaining parts.
      A flags field has been added and is reserved for now, should we extend
      the helper in the future.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: use skb_pkt_type_ok helper in bpf_skb_change_type · 45c7fffa
      Daniel Borkmann authored
      Since we have a skb_pkt_type_ok() helper for checking the type before
      mangling, make use of it instead of open coding. Follow-up to commit
      8b10cab6 ("net: simplify and make pkt_type_ok() available for other
      users") that came in after d2485c42 ("bpf: add bpf_skb_change_type
      helper").
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: add peer removal functionality · b3404022
      Richard Alpe authored
      Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
      an offline peer node from the internal data structures.
      
      This will be supported by the tipc user space tool in iproute2.
      Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: refine tcp_prune_ofo_queue() to not drop all packets · 36a6503f
      Eric Dumazet authored
      Over the years, TCP BDP has increased a lot, and is typically
      in the order of ~10 Mbytes with help of clever Congestion Control
      modules.
      
      In the presence of packet losses, TCP stores incoming packets in an out of
      order queue, and the number of skbs sitting there waiting for the missing
      packets to be received can match the BDP (~10 Mbytes).
      
      In some cases, TCP needs to make room for incoming skbs, and the current
      strategy can simply remove all skbs in the out of order queue as a last
      resort, incurring a huge penalty for both receiver and sender.
      
      Unfortunately these 'last resort events' are quite frequent, forcing
      sender to send all packets again, stalling the flow and wasting a lot of
      resources.
      
      This patch cleans only a part of the out of order queue in order
      to meet the memory constraints.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: C. Stephen Gun <csg@google.com>
      Cc: Van Jacobson <vanj@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
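The partial pruning described in the tcp_prune_ofo_queue() commit above can be modelled roughly as follows. This is a Python sketch, not the kernel code: the names `prune_ofo_queue`, `rmem_alloc` and `rcvbuf` are illustrative, a single byte budget stands in for the real memory constraints, and dropping the highest sequence numbers first is an assumption made for the example.

```python
from typing import List, Tuple

# Each queued out-of-order segment is modelled as (start_sequence, length_in_bytes).
Segment = Tuple[int, int]


def prune_ofo_queue(ofo: List[Segment], rmem_alloc: int, rcvbuf: int) -> Tuple[List[Segment], int]:
    """Drop only as many out-of-order segments as needed to fit within rcvbuf.

    Assumption for illustration: segments with the highest sequence numbers
    are dropped first, because they are furthest from being deliverable.
    """
    kept = sorted(ofo)  # ascending by start sequence
    while kept and rmem_alloc > rcvbuf:
        _, length = kept.pop()  # remove the highest-sequence segment
        rmem_alloc -= length
    return kept, rmem_alloc


queue = [(3000, 1000), (5000, 1000), (7000, 1000), (9000, 1000)]
kept, used = prune_ofo_queue(queue, rmem_alloc=4500, rcvbuf=3000)
print(kept, used)
```

In this model the sender only has to retransmit the segments that were actually dropped, instead of the whole queue.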
    • net: bgmac: make it clear when setting interface type to RMII · e2d8f646
      Rafał Miłecki authored
      It doesn't really change anything as BGMAC_CHIPCTL_1_IF_TYPE_RMII is
      equal to 0. It makes the code a bit cleaner: so far, when reading it, one
      could think we forgot to set a proper mode. It also keeps this mode code
      in sync with the other ones.
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bgmac: support Ethernet core on BCM53573 SoCs · 1cb94db3
      Rafał Miłecki authored
      BCM53573 is a new series of Broadcom's SoCs. It's based on ARM and can
      be found in two packages (versions): BCM53573 and BCM47189. It shares
      some code with the Northstar family, but also requires some new quirks.
      
      First of all there can be up to 2 Ethernet cores on this SoC. If that is
      the case, they are connected to two different switch ports allowing some
      more complex/optimized setups. It seems the second unit doesn't come
      fully configured and requires some IRQ quirk.
      
      Other than that only the first core is connected to the PHY. For the
      second one we have to register fixed PHY (similarly to the Northstar),
      otherwise generic PHY driver would get some invalid info.
      
      This has been successfully tested on Tenda AC9 (BCM47189B0).
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: nuvoton: fix spelling mistake: "aligment" -> "alignment" · 6b2a314f
      Colin Ian King authored
      trivial fix to spelling mistake in dev_err message
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: create a debugfs node for hash information · c0c64c15
      Paul Durrant authored
      It is useful to be able to see the hash configuration when running tests.
      This patch adds a debugfs node for that purpose.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: avoid division by zero · c7b256f9
      Edward Cree authored
      The division is already being done properly in efx_ef10_get_timer_config
      which returns zero-on-success, unlike the old efx_ef10_get_sysclk_freq.
      
      Fixes: d95e329a ("sfc: get timer configuration from adapter")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: defer sacked assignment · dca0aaf8
      Eric Dumazet authored
      While chasing a tcp_xmit_retransmit_queue() kasan issue, I found
      that we could avoid reading the sacked field of an skb that we won't send,
      possibly removing one cache line miss.
      
      Very minor change in slow path, but why not ? ;)
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bridge-vlan-stats-with-flags' · 53e43f39
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: export vlan stats per-port with flags
      
      This set adds the ability to export vlan stats per-port. Patch 01 makes
      that possible by consolidating the bridge and port linkxstats calls. Then
      patch 02 allows to dump the vlan entry flags in order to be able to
      distinguish between bridge and port vlan entries when dumping the master
      device vlan stats. That is needed because that call was implemented when
      the stats API didn't have slave dumping capabilities and it dumps all vlan
      stats (for both bridge and port entries). We also need it in order to print
      the vlan flags when dumping the stats.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: export vlan flags with the stats · 61ba1a2d
      Nikolay Aleksandrov authored
      Use one of the vlan xstats padding fields to export the vlan flags. This is
      needed in order to be able to distinguish between master (bridge) and port
      vlan entries in user-space when dumping the bridge vlan stats.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: consolidate bridge and port linkxstats calls · d5ff8c41
      Nikolay Aleksandrov authored
      In the bridge driver we usually have the same function working for both
      port and bridge. In order to follow that logic and also avoid code
      duplication, consolidate the bridge_ and brport_ linkxstats calls into
      one since they share most of their code. As a side effect this allows us
      to dump the vlan stats also via the slave call which is in preparation for
      the upcoming per-port vlan stats and vlan flag dumping.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'flow-dissector-vlan-tag' · fb36938d
      David S. Miller authored
      Hadar Hen Zion says:
      
      ====================
      net_sched, flow_dissector, flower: Introduce vlan tag support
      
      This patchset introduces vlan tag support to the flower classifier and the
      flow dissector, in addition to adding vlan priority to act vlan.
      
      The first 2 patches are dealing with flow-dissector:
       - The first patch is a fix, in case the vlan was already stripped from the
         skb, take it from skb->vlan_tci.
       - The second patch adds support for vlan priority.
      
      The next 2 patches are dealing with flower:
       - The first patch is a fix: it sets flow dissector 'used_keys' according to the
         mask value of each key.
       - The second patch adds vlan tag support to the flower classifier; user space
         patches will be sent later to complete it.
      
      The last patch adds vlan priority to act vlan since only vlan id is currently supported.
      
      Changes from V1:
       - A new patch was added to this series "net_sched: flower: Avoid dissection of unmasked keys"
       - Add u16 padding to struct flow_dissector_key_vlan
       - Change flow_label field in struct flow_dissector_key_tags from a 20-bit field to u32
       - Remove 'if (v->tcfv_push_prio)' check from tcf_vlan_dump function
       - Add support for un-stripped vlan skb and skb with multiple vlans in __skb_flow_dissect
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: act_vlan: Add priority option · 956af371
      Hadar Hen Zion authored
      The current vlan push action supports only vid and protocol options.
      Add priority option.
      
      Example script that adds vlan push action with vid and
      priority:
      
      tc filter add dev veth0 protocol ip parent ffff: \
      	   flower \
      	   	indev veth0 \
      	   action vlan push id 100 priority 5
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
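To make the act_vlan example above concrete, here is a small Python sketch of how a vid and a priority combine into the 16-bit 802.1Q tag control field (3 priority bits, 1 DEI bit, 12 VLAN ID bits). The function name `build_tci` is hypothetical; the constants mirror the 802.1Q layout and are not taken from this patch.

```python
VLAN_VID_MASK = 0x0FFF   # low 12 bits carry the VLAN ID
VLAN_PRIO_SHIFT = 13     # top 3 bits carry the priority (PCP)


def build_tci(vid: int, priority: int) -> int:
    """Combine a VLAN ID and a priority into a 16-bit TCI (DEI bit left at 0)."""
    if not 0 <= vid <= VLAN_VID_MASK:
        raise ValueError("vid must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    return (priority << VLAN_PRIO_SHIFT) | vid


# Values from the example command: "vlan push id 100 priority 5"
print(hex(build_tci(100, 5)))  # 0xa064
```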
    • net_sched: flower: Add vlan support · 9399ae9a
      Hadar Hen Zion authored
      Enhance flower to support 802.1Q vlan protocol classification.
      Currently, the supported fields are vlan_id and vlan_priority.
      
      Example:
      
      	# add a flower filter with vlan id and priority classification
      	tc filter add dev ens4f0 protocol 802.1Q parent ffff: \
      		flower \
      		indev ens4f0 \
      		vlan_ethtype ipv4 \
      		vlan_id 100 \
      		vlan_prio 3 \
      	action vlan pop
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: flower: Avoid dissection of unmasked keys · 339ba878
      Hadar Hen Zion authored
      The current flower implementation checks the mask range and sets all the
      keys included in that range as "used_keys", even if a specific key in
      the range has a zero mask.
      
      This behavior can cause a false positive return value of
      dissector_uses_key function and unnecessary dissection in
      __skb_flow_dissect.
      
      This patch explicitly checks the mask of each key and sets "used_keys"
      accordingly.
      
      Fixes: 77b9900e ('tc: introduce Flower classifier')
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
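The difference between the range-based and the per-key behaviour described in the flower commit above can be sketched as a Python model. The key names, the ordering list and both function names are hypothetical, and treating "range" as "everything between the first and last masked key" is an assumption made for illustration.

```python
from typing import Dict, List, Set


def used_keys_by_range(masks: Dict[str, bytes], order: List[str]) -> Set[str]:
    """Model of the old behaviour: every key inside the masked range counts as used."""
    nonzero = [i for i, key in enumerate(order) if any(masks.get(key, b""))]
    if not nonzero:
        return set()
    return set(order[nonzero[0]:nonzero[-1] + 1])


def used_keys_by_mask(masks: Dict[str, bytes], order: List[str]) -> Set[str]:
    """Model of the fixed behaviour: a key is used only if its own mask is non-zero."""
    return {key for key in order if any(masks.get(key, b""))}


order = ["eth_addrs", "vlan", "ipv4_addrs"]
masks = {
    "eth_addrs": b"\xff" * 12,
    "vlan": b"\x00\x00",          # inside the range, but fully unmasked
    "ipv4_addrs": b"\xff" * 8,
}
print(sorted(used_keys_by_range(masks, order)))
print(sorted(used_keys_by_mask(masks, order)))
```

In the range-based model the unmasked "vlan" key is reported as used, which corresponds to the false positive the commit message describes.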
    • flow_dissector: Get vlan priority in addition to vlan id · f6a66927
      Hadar Hen Zion authored
      Add vlan priority check to the flow dissector by adding new flow
      dissector struct, flow_dissector_key_vlan which includes vlan tag
      fields.
      
      vlan_id and flow_label fields were under the same struct
      (flow_dissector_key_tags). It was a convenient setting since struct
      flow_dissector_key_tags is used by struct flow_keys and by setting
      vlan_id and flow_label under the same struct, we get precisely 24 or 48
      bytes in flow_keys from flow_dissector_key_basic.
      
      Now, when adding vlan priority support, the code will be cleaner if
      flow_label and vlan tag won't be under the same struct anymore.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci · d5709f7a
      Hadar Hen Zion authored
      Early in the datapath the skb_vlan_untag function is called, which strips
      the vlan from the skb and sets the skb->vlan_tci and skb->vlan_proto fields.
      
      The current dissection doesn't handle stripped vlan packets correctly.
      In some flows, the vlan doesn't exist in skb->data anymore when applying
      flow dissection on the skb; fix that.
      
      In case vlan info wasn't stripped before applying flow_dissector (RPS
      flow for example), or in case of skb with multiple vlans (e.g. 802.1ad),
      get the vlan info from skb->data. The flow_dissector correctly skips
      any number of vlans and stores only the first level vlan.
      
      Fixes: 0744dd00 ('net: introduce skb_flow_dissect()')
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
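The two sources of vlan information described in the commit above (a tag already stripped into skb->vlan_tci, or tags still present in the packet data) can be sketched in Python. This is a model and not the kernel's __skb_flow_dissect: `dissect_vlan` and `stripped_tci` are hypothetical names, and `payload` stands for the bytes that follow the outer EtherType.

```python
import struct
from typing import Optional, Tuple

ETH_P_8021Q = 0x8100
ETH_P_8021AD = 0x88A8
VLAN_ETHERTYPES = (ETH_P_8021Q, ETH_P_8021AD)


def dissect_vlan(ethertype: int, payload: bytes,
                 stripped_tci: Optional[int] = None) -> Tuple[Optional[int], int]:
    """Return (first_level_tci, inner_ethertype).

    stripped_tci models skb->vlan_tci for a tag that was already removed
    from the packet data; when it is given, it is the first-level vlan.
    """
    first_tci = stripped_tci
    offset = 0
    # Skip any number of vlan headers, remembering only the first-level tag.
    while ethertype in VLAN_ETHERTYPES:
        tci, ethertype = struct.unpack_from("!HH", payload, offset)
        if first_tci is None:
            first_tci = tci
        offset += 4  # each vlan header is 2 bytes TCI + 2 bytes inner EtherType
    return first_tci, ethertype


# Double-tagged frame (802.1ad outer tag, 802.1Q inner tag) carrying IPv4.
double_tagged = struct.pack("!HHHH", 0x2064, ETH_P_8021Q, 0x00C8, 0x0800)
print(dissect_vlan(ETH_P_8021AD, double_tagged))
# Tag already stripped: the TCI comes from the stripped_tci argument.
print(dissect_vlan(0x0800, b"", stripped_tci=0x6005))
```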
    • Merge branch 'qed-link-fixes' · 92bcdcc6
      David S. Miller authored
      Yuval Mintz says:
      
      ====================
      qed*: Fix ethtool issues relating to link
      
      This series addresses two issues that were introduced when adding
      support for ethtool's link_ksettings - the first fixes a
      regression and the second fixes incorrect functionality in the submission.
      
      Although these are fixes, as the feature currently exists only in
      'net-next' I'm aiming them for it.
      
      Dave, please consider applying this series to 'net-next'.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Fix forcing high speeds · 16d5946a
      Yuval Mintz authored
      While '0xdead' and '0xbeef' are "great" values, we should
      use the correct SPEED_* values instead.
      
      Fixes: 054c67d1 ("qed*: Add support for ethtool link_ksettings callbacks")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed*: Fix pause setting · d194fd26
      Yuval Mintz authored
      When moving into using ethtool's link_ksetting, qed started
      supplying its own bitmask of speed/capabilities, but qede
      is still checking for the SUPPORTED value to determine whether
      it supports pause.
      
      Fixes: 054c67d1 ("qed*: Add support for ethtool link_ksettings callbacks")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · a5c88182
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      1GbE Intel Wired LAN Driver Updates 2016-08-18
      
      This series contains updates to igb only.
      
      Gangfeng Huang provides all the changes in the series to update the
      igb driver to support advanced receive side filters that direct receive
      packets by flows to different hardware queues. This enables a tight
      control on routing a flow in the platform.  First patch allows for
      receive network flow classification to insert and remove receive filters
      by ethtool.  Second and third patches add the ability to insert and
      remove ethertype and VLAN priority filters by ethtool.
      
      Last patch just fixes an error message to return "Not supported" versus
      "Unknown error 524".
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • igb: fix error code in igb_add_ethtool_nfc_entry() · 54be8132
      Gangfeng Huang authored
      The error "rmgr: Cannot insert RX class rule: Operation not supported" is
      more meaningful than "rmgr: Cannot insert RX class rule: Unknown error 524".
      Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
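The two messages quoted in the igb commit above differ because 524 is a kernel-internal error number with no userspace name, so the C library can only print it as an unknown error. A short Python sketch shows the contrast. Pairing the fix with EOPNOTSUPP is an inference from the "Operation not supported" text, not something the commit message states, and the exact strings printed depend on the platform's C library.

```python
import errno
import os

ENOTSUPP = 524  # kernel-internal value, taken from the message quoted above


def describe(code: int) -> str:
    """Return the userspace name (if any) and the strerror text for an errno value."""
    name = errno.errorcode.get(code, "<no userspace name>")
    return f"{code}: {name}: {os.strerror(code)}"


print(describe(ENOTSUPP))
print(describe(errno.EOPNOTSUPP))
```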
    • igb: support RX flow classification by VLAN priority · 7a277a96
      Gangfeng Huang authored
      This patch allows RX network flow classification to insert
      and remove VLAN priority filters via ethtool.
      
      Example:
      Add a VLAN priority filter:
      $ ethtool -N eth0 flow-type ether vlan 0x6000 vlan-mask 0x1FFF action 2 loc 1
      
      Show all filters:
      $ ethtool -n eth0
      4 RX rings available
      Total 1 rules
      
      Filter: 1
      	Flow Type: Raw Ethernet
      	Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
      	Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
      	Ethertype: 0x0 mask: 0xFFFF
      	VLAN EtherType: 0x0 mask: 0xffff
      	VLAN: 0x6000 mask: 0x1fff
      	User-defined: 0x0 mask: 0xffffffffffffffff
      	Action: Direct to queue 2
      
      Delete the filter by location:
      $ ethtool -N eth0 delete 1
      Signed-off-by: Ruhao Gao <ruhao.gao@ni.com>
      Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
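The filter values in the VLAN priority example above can be read as follows: 0x6000 places the value 3 in the top three (priority) bits of the tag control field, and the mask 0x1FFF covers the remaining 13 bits. The Python sketch below assumes that set mask bits mean "ignore this bit", which is suggested by the all-ones masks shown for the unused fields in the output but is not stated in the commit message. The function names are hypothetical.

```python
def priority_of(tci: int) -> int:
    """Extract the 3-bit priority from a 16-bit 802.1Q tag control field."""
    return (tci >> 13) & 0x7


def tci_matches(tci: int, value: int, ignore_mask: int) -> bool:
    """Compare only the bits that are NOT set in ignore_mask (assumed semantics)."""
    care = ~ignore_mask & 0xFFFF
    return (tci & care) == (value & care)


FILTER_VALUE = 0x6000
FILTER_MASK = 0x1FFF

print(priority_of(FILTER_VALUE))                        # 3
print(tci_matches(0x6064, FILTER_VALUE, FILTER_MASK))   # priority 3, VLAN ID 100
print(tci_matches(0xA064, FILTER_VALUE, FILTER_MASK))   # priority 5, VLAN ID 100
```

Under that assumption the rule steers every frame with priority 3 to queue 2, regardless of its VLAN ID.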
    • igb: support RX flow classification by ethertype · 64c75d41
      Gangfeng Huang authored
      This patch allows RX network flow classification to insert
      and remove ethertype filters via ethtool.
      
      Example:
      Add an ethertype filter:
      $ ethtool -N eth0 flow-type ether proto 0x88F8 action 2
      
      Show all filters:
      $ ethtool -n eth0
      4 RX rings available
      Total 1 rules
      
      Filter: 15
      	Flow Type: Raw Ethernet
      	Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
      	Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
      	Ethertype: 0x88F8 mask: 0x0
      	Action: Direct to queue 2
      
      Delete the filter by location:
      $ ethtool -N eth0 delete 15
      Signed-off-by: Ruhao Gao <ruhao.gao@ni.com>
      Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • igb: add support of RX network flow classification · 0e71def2
      Gangfeng Huang authored
      This patch allows RX network flow classification to insert
      and remove Rx filters via ethtool. The ethtool interface has its own rules
      manager.
      
      Show all filters:
      $ ethtool -n eth0
      4 RX rings available
      Total 2 rules
      Signed-off-by: Ruhao Gao <ruhao.gao@ni.com>
      Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • Merge branch 'qdisc-hash-fixes' · 3e7d2d45
      David S. Miller authored
      Jiri Kosina says:
      
      ====================
      qdisc-hashtable fixes
      
      The following two patches fix all the issues that have been reported
      against the conversion of qdisc linked list to hashtable (currently in
      net-next) so far.
      
      First patch adjusts handling of singleton qdiscs to the new semantics, and
      is rather straightforward.
      
      The second patch, which fixes "cosmetic" issue of duplicate entries in the
      qdisc dump for ingress qdiscs, is a little bit more hairy; I personally
      would love to see all the already existing "if (ingress)"-like hacks go
      away (by, let's say, introducing a general TCQ_F_? flag), but that's way
      out of scope of this patchset (but already on my todo).
      
      Thanks a lot to Daniel Borkmann and David Ahern for reporting the issues
      and testing the patches promptly.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: avoid duplicates in qdisc dump · ea327469
      Jiri Kosina authored
      tc_dump_qdisc() performs dumping of the per-device qdiscs in two phases;
      first, the "standard" dev->qdisc is being dumped. Second, if there is/are
      ingress queue(s), they are being dumped as well.
      
      After conversion of netdevice's qdisc linked-list into hashtable, these
      two sets are no longer disjoint sets/lists, but are both
      "reachable" directly from netdevice's hashtable. As a consequence, the
      "full-depth" dump of the ingress qdiscs results in immediately hitting the
      netdevice hashtable again, and duplicating the dump that has already been
      performed for dev->qdisc.
      What in fact needs to be dumped in case of ingress queue is "just" the
      top-level ingress qdisc, as everything else has been dumped already.
      
      Fix this by extending tc_dump_qdisc_root() in a way that it can be instructed
      whether it should (while performing the "full" per-netdev qdisc dump) perform
      the whole recursion, or just dump "additional" top-level (ingress) qdiscs
      without performing any kind of recursion.
      
      This fixes duplicate dumps such as
      
      	qdisc mq 0: root
      	qdisc pfifo_fast 0: parent :4 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      	qdisc pfifo_fast 0: parent :3 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      	qdisc pfifo_fast 0: parent :2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      	qdisc pfifo_fast 0: parent :1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      	qdisc clsact ffff: parent ffff:fff1
      	qdisc pfifo_fast 0: parent :4 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      	qdisc pfifo_fast 0: parent :3 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      	qdisc pfifo_fast 0: parent :2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      	qdisc pfifo_fast 0: parent :1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
      
      Fixes: 59cc1f61 ("net: sched: convert qdisc linked list to hashtable")
      Reported-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
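The duplicate dump described in the commit above, and the effect of the added "recurse or not" instruction, can be modelled in a few lines of Python. This is a simplified sketch: `dump_qdisc_root` and its `recurse` parameter are hypothetical stand-ins for the extended tc_dump_qdisc_root(), and the per-netdevice hashtable is reduced to a plain list of child qdiscs.

```python
from typing import List


def dump_qdisc_root(root: str, dev_hash: List[str], recurse: bool) -> List[str]:
    """Dump a root qdisc and, only when asked, everything in the device hashtable."""
    dumped = [root]
    if recurse:
        dumped.extend(dev_hash)
    return dumped


dev_hash = ["pfifo_fast :4", "pfifo_fast :3", "pfifo_fast :2", "pfifo_fast :1"]

# Before the fix: both phases walk the same hashtable, so children appear twice.
before = (dump_qdisc_root("mq root", dev_hash, recurse=True)
          + dump_qdisc_root("clsact ffff:", dev_hash, recurse=True))

# After the fix: the ingress phase dumps only its top-level qdisc.
after = (dump_qdisc_root("mq root", dev_hash, recurse=True)
         + dump_qdisc_root("clsact ffff:", dev_hash, recurse=False))

print(len(before), len(after))
```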
    • net: sched: fix handling of singleton qdiscs with qdisc_hash · 69012ae4
      Jiri Kosina authored
      qdisc_match_from_root() is now iterating over per-netdevice qdisc
      hashtable instead of going through a linked-list of qdiscs (independently
      on the actual underlying netdev), which was the case before the switch to
      hashtable for qdiscs.
      
      For singleton qdiscs, there is no underlying netdev associated though, and
      therefore dumping a singleton qdisc will panic, as qdisc_dev(root) will
      always be NULL.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000410
       IP: [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70
       PGD 1aceba067 PUD 1aceb7067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP
      [ ... ]
       task: ffff8801ec996e00 task.stack: ffff8801ec934000
       RIP: 0010:[<ffffffff8167efac>]  [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70
       RSP: 0018:ffff8801ec937ab0  EFLAGS: 00010203
       RAX: 0000000000000408 RBX: ffff88025e612000 RCX: ffffffffffffffd8
       RDX: 0000000000000000 RSI: 00000000ffff0000 RDI: ffffffff81cf8100
       RBP: ffff8801ec937ab0 R08: 000000000001c160 R09: ffff8802668032c0
       R10: ffffffff81cf8100 R11: 0000000000000030 R12: 00000000ffff0000
       R13: ffff88025e612000 R14: ffffffff81cf3140 R15: 0000000000000000
       FS:  00007f24b9af6740(0000) GS:ffff88026f280000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000410 CR3: 00000001aceec000 CR4: 00000000001406e0
       Stack:
        ffff8801ec937ad0 ffffffff81681210 ffff88025dd51a00 00000000fffffff1
        ffff8801ec937b88 ffffffff81681e4e ffffffff81c42bc0 ffff880262431500
        ffffffff81cf3140 ffff88025dd51a10 ffff88025dd51a24 00000000ec937b38
       Call Trace:
        [<ffffffff81681210>] qdisc_lookup+0x40/0x50
        [<ffffffff81681e4e>] tc_modify_qdisc+0x21e/0x550
        [<ffffffff8166ae25>] rtnetlink_rcv_msg+0x95/0x220
        [<ffffffff81209602>] ? __kmalloc_track_caller+0x172/0x230
        [<ffffffff8166ad90>] ? rtnl_newlink+0x870/0x870
        [<ffffffff816897b7>] netlink_rcv_skb+0xa7/0xc0
        [<ffffffff816657c8>] rtnetlink_rcv+0x28/0x30
        [<ffffffff8168919b>] netlink_unicast+0x15b/0x210
        [<ffffffff81689569>] netlink_sendmsg+0x319/0x390
        [<ffffffff816379f8>] sock_sendmsg+0x38/0x50
        [<ffffffff81638296>] ___sys_sendmsg+0x256/0x260
        [<ffffffff811b1275>] ? __pagevec_lru_add_fn+0x135/0x280
        [<ffffffff811b1a90>] ? pagevec_lru_move_fn+0xd0/0xf0
        [<ffffffff811b1140>] ? trace_event_raw_event_mm_lru_insertion+0x180/0x180
        [<ffffffff811b1b85>] ? __lru_cache_add+0x75/0xb0
        [<ffffffff817708a6>] ? _raw_spin_unlock+0x16/0x40
        [<ffffffff811d8dff>] ? handle_mm_fault+0x39f/0x1160
        [<ffffffff81638b15>] __sys_sendmsg+0x45/0x80
        [<ffffffff81638b62>] SyS_sendmsg+0x12/0x20
        [<ffffffff810038e7>] do_syscall_64+0x57/0xb0
      
      Fix this by special-casing singleton qdiscs (those that don't have
      underlying netdevice) and introduce immediate handling of those rather
      than trying to go over an underlying netdevice. We're in the same
      situation in tc_dump_qdisc_root() and tc_dump_tclass_root().
      
      Ultimately, this will have to be slightly reworked so that we are actually
      able to show singleton qdiscs (noop) in the dump properly; but we're not
      currently doing that anyway, so no regression there, and better do this in
      a gradual manner.
      
      Fixes: 59cc1f61 ("net: sched: convert qdisc linked list to hashtable")
      Reported-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Daniel Borkmann <daniel@iogearbox.net>
      Reported-by: David Ahern <dsa@cumulusnetworks.com>
      Tested-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      69012ae4
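The special case described in commit 69012ae4 above can be sketched in user-space C as follows. This is a minimal illustration under stated assumptions, not the kernel code: `struct sketch_qdisc`, `struct sketch_dev`, and `sketch_match_from_root()` are hypothetical stand-ins, and a plain linked list replaces the per-device hashtable.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dev;

/* Hypothetical, simplified qdisc; not the kernel's struct Qdisc. */
struct sketch_qdisc {
	unsigned int handle;
	struct sketch_dev *dev;    /* NULL for a singleton qdisc (e.g. noop) */
	struct sketch_qdisc *next; /* next qdisc registered on the same device */
};

/* Hypothetical device holding the qdiscs attached to it. */
struct sketch_dev {
	struct sketch_qdisc *qdiscs;
};

/* Find the qdisc with the given handle, starting from a root qdisc. */
static inline struct sketch_qdisc *
sketch_match_from_root(struct sketch_qdisc *root, unsigned int handle)
{
	struct sketch_qdisc *q;

	assert(root != NULL);

	/* Singleton: there is no device to walk, so test the root itself. */
	if (root->dev == NULL)
		return root->handle == handle ? root : NULL;

	/* Regular case: walk the qdiscs registered on the underlying device. */
	for (q = root->dev->qdiscs; q != NULL; q = q->next)
		if (q->handle == handle)
			return q;

	return NULL;
}
```

Without the early return, the lookup would dereference the missing device of a singleton qdisc, which is the kind of NULL pointer access the trace above reports.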
    • David S. Miller's avatar
      Merge branch 'tipc-next' · e951f145
      David S. Miller authored
      Jon Maloy says:
      
      ====================
      tipc: bearer and link improvements
      
      The first commit makes it possible to set and check the 'blocked' state
      of a bearer from the generic bearer layer. The second commit is a small
      improvement to the link congestion mechanism.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e951f145
    • Jon Paul Maloy's avatar
      tipc: ensure that link congestion and wakeup use same criteria · 5a0950c2
      Jon Paul Maloy authored
      When an attempt is made to wake up a link after congestion, it uses a
      different, more generous criterion than when it was originally declared
      congested. This has the effect that the link, and the sending process,
      sometimes will be woken up unnecessarily, just to immediately return to
      congestion when it turns out there is not enough space in its send queue
      to host the pending message. This is a waste of CPU cycles.
      
      We now change the function link_prepare_wakeup() to use exactly the same
      criteria as tipc_link_xmit(). However, since we are now excluding the
      window limit from the wakeup calculation, and the current backlog limit
      for the lowest level is too small to house even a single maximum-size
      message, we have to expand this limit. We do this by evaluating an
      alternative, minimum value during the setting of the importance limits.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a0950c2
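The idea in commit 5a0950c2 above, one congestion test shared by the send and wakeup paths plus a floor on the backlog limit, might look roughly like this. It is a simplified user-space sketch: `struct sketch_link`, the `sketch_*` functions, and the `min_limit` parameter are hypothetical and do not reproduce the actual code in link_prepare_wakeup() or tipc_link_xmit().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical importance levels and link state; not the kernel's types. */
enum { IMP_LOW, IMP_MEDIUM, IMP_HIGH, IMP_CRITICAL, IMP_COUNT };

struct sketch_link {
	struct {
		unsigned int len;   /* packets currently queued at this level */
		unsigned int limit; /* backlog limit for this level */
	} backlog[IMP_COUNT];
};

/* Single congestion predicate, used by both paths below. */
static inline bool sketch_link_congested(const struct sketch_link *l, int imp)
{
	assert(imp >= 0 && imp < IMP_COUNT);
	return l->backlog[imp].len >= l->backlog[imp].limit;
}

/* Send path: a message is accepted only if its level is not congested. */
static inline bool sketch_link_may_send(const struct sketch_link *l, int imp)
{
	return !sketch_link_congested(l, imp);
}

/* Wakeup path: wake a sender only if its send would be accepted now. */
static inline bool sketch_link_may_wake(const struct sketch_link *l, int imp)
{
	return !sketch_link_congested(l, imp);
}

/* Set a backlog limit, never going below the given minimum value. */
static inline void sketch_set_limit(struct sketch_link *l, int imp,
				    unsigned int wanted, unsigned int min_limit)
{
	assert(imp >= 0 && imp < IMP_COUNT);
	l->backlog[imp].limit = wanted > min_limit ? wanted : min_limit;
}
```

Because both paths call the same predicate, a sender woken by `sketch_link_may_wake()` cannot be rejected again at once by `sketch_link_may_send()` for the same queue state.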
    • Jon Paul Maloy's avatar
      tipc: make bearer packet filtering generic · 0d051bf9
      Jon Paul Maloy authored
      In commit 5b7066c3 ("tipc: stricter filtering of packets in bearer
      layer") we introduced a method of filtering out messages while a bearer
      is being reset, to prevent links from being re-created and coming back
      into working state while we are still in the process of shutting them
      down.

      This solution works well, but it only works with L2 media, which is
      insufficient with the increasing use of UDP as carrier media.
      
      We now replace this solution with a more generic one, by introducing a
      new flag "up" in the generic struct tipc_bearer. This field will be set
      and reset at the same locations as with the previous solution, while
      the packet filtering is moved to the generic code for the sending side.
      On the receiving side, the filtering is still done in media specific
      code, but now including the UDP bearer.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0d051bf9
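The generic send-side filtering described in commit 0d051bf9 above can be illustrated with a small user-space sketch. The names `struct sketch_bearer`, `sketch_bearer_set_up()`, `sketch_bearer_xmit()`, and `sketch_udp_send()` are hypothetical, and a C11 atomic flag stands in for whatever the kernel uses to store the "up" state in struct tipc_bearer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generic bearer: an "up" flag plus a media-specific sender. */
struct sketch_bearer {
	atomic_bool up;
	int (*send_msg)(const void *buf, size_t len);
};

/* Set or clear the flag, e.g. when the bearer is enabled or being reset. */
static inline void sketch_bearer_set_up(struct sketch_bearer *b, bool up)
{
	atomic_store(&b->up, up);
}

/* Generic send path: filter here, independent of the media type. */
static inline int sketch_bearer_xmit(struct sketch_bearer *b,
				     const void *buf, size_t len)
{
	assert(b->send_msg != NULL);

	if (!atomic_load(&b->up))
		return -1; /* bearer is blocked: drop instead of sending */

	return b->send_msg(buf, len);
}

/* Stand-in for a media-specific sender (L2 or UDP); reports bytes "sent". */
static inline int sketch_udp_send(const void *buf, size_t len)
{
	(void)buf;
	return (int)len;
}
```

Since the check lives in the generic transmit function, every media type gets the same send-side filtering without media-specific code.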
    • David S. Miller's avatar
      Merge branch 'qed-next' · 37bd91d1
      David S. Miller authored
      Sudarsana Reddy Kalluru says:
      
      ====================
      qed*: Add support for additional statistics.
      
      The patch series adds qed/qede support for new statistics.
      Patch (1) adds a couple of statistics for "ethtool -S" display.
      Patch (2) adds support for per-queue statistics to ethtool display.
      Patch (3) adds qed support for NCSI statistics.
      
      Please consider applying this to 'net-next' branch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      37bd91d1
    • Sudarsana Reddy Kalluru's avatar
      qed: Add support for NCSI statistics. · 6c754246
      Sudarsana Reddy Kalluru authored
      The patch adds driver support for sending the NCSI statistics to the
      MFW. This is an asynchronous request from the MFW. Upon receiving it,
      the driver populates the required data and sends it to the MFW.
      Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6c754246
    • Sudarsana Reddy Kalluru's avatar