1. 19 Aug, 2016 12 commits
    • net: sched: fix handling of singleton qdiscs with qdisc_hash · 69012ae4
      Jiri Kosina authored
      qdisc_match_from_root() now iterates over the per-netdevice qdisc
      hashtable instead of walking a linked list of qdiscs (independently
      of the actual underlying netdev), which was the case before the switch
      to a hashtable for qdiscs.
      
      For singleton qdiscs, there is no underlying netdev associated though, and
      therefore dumping a singleton qdisc will panic, as qdisc_dev(root) will
      always be NULL.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000410
       IP: [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70
       PGD 1aceba067 PUD 1aceb7067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP
      [ ... ]
       task: ffff8801ec996e00 task.stack: ffff8801ec934000
       RIP: 0010:[<ffffffff8167efac>]  [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70
       RSP: 0018:ffff8801ec937ab0  EFLAGS: 00010203
       RAX: 0000000000000408 RBX: ffff88025e612000 RCX: ffffffffffffffd8
       RDX: 0000000000000000 RSI: 00000000ffff0000 RDI: ffffffff81cf8100
       RBP: ffff8801ec937ab0 R08: 000000000001c160 R09: ffff8802668032c0
       R10: ffffffff81cf8100 R11: 0000000000000030 R12: 00000000ffff0000
       R13: ffff88025e612000 R14: ffffffff81cf3140 R15: 0000000000000000
       FS:  00007f24b9af6740(0000) GS:ffff88026f280000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000410 CR3: 00000001aceec000 CR4: 00000000001406e0
       Stack:
        ffff8801ec937ad0 ffffffff81681210 ffff88025dd51a00 00000000fffffff1
        ffff8801ec937b88 ffffffff81681e4e ffffffff81c42bc0 ffff880262431500
        ffffffff81cf3140 ffff88025dd51a10 ffff88025dd51a24 00000000ec937b38
       Call Trace:
        [<ffffffff81681210>] qdisc_lookup+0x40/0x50
        [<ffffffff81681e4e>] tc_modify_qdisc+0x21e/0x550
        [<ffffffff8166ae25>] rtnetlink_rcv_msg+0x95/0x220
        [<ffffffff81209602>] ? __kmalloc_track_caller+0x172/0x230
        [<ffffffff8166ad90>] ? rtnl_newlink+0x870/0x870
        [<ffffffff816897b7>] netlink_rcv_skb+0xa7/0xc0
        [<ffffffff816657c8>] rtnetlink_rcv+0x28/0x30
        [<ffffffff8168919b>] netlink_unicast+0x15b/0x210
        [<ffffffff81689569>] netlink_sendmsg+0x319/0x390
        [<ffffffff816379f8>] sock_sendmsg+0x38/0x50
        [<ffffffff81638296>] ___sys_sendmsg+0x256/0x260
        [<ffffffff811b1275>] ? __pagevec_lru_add_fn+0x135/0x280
        [<ffffffff811b1a90>] ? pagevec_lru_move_fn+0xd0/0xf0
        [<ffffffff811b1140>] ? trace_event_raw_event_mm_lru_insertion+0x180/0x180
        [<ffffffff811b1b85>] ? __lru_cache_add+0x75/0xb0
        [<ffffffff817708a6>] ? _raw_spin_unlock+0x16/0x40
        [<ffffffff811d8dff>] ? handle_mm_fault+0x39f/0x1160
        [<ffffffff81638b15>] __sys_sendmsg+0x45/0x80
        [<ffffffff81638b62>] SyS_sendmsg+0x12/0x20
        [<ffffffff810038e7>] do_syscall_64+0x57/0xb0
      
      Fix this by special-casing singleton qdiscs (those that don't have an
      underlying netdevice) and handling them immediately rather than trying
      to go through an underlying netdevice. The same situation applies in
      tc_dump_qdisc_root() and tc_dump_tclass_root().
      
      Ultimately, this will have to be slightly reworked so that we are actually
      able to show singleton qdiscs (noop) in the dump properly; but we're not
      currently doing that anyway, so no regression there, and better do this in
      a gradual manner.
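
The special case described above can be sketched in userspace C. This is only an illustration under simplifying assumptions: `struct qdisc_sketch`, `struct net_device_sketch`, the `hash_next` chain, and `match_from_root_sketch()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct net_device_sketch {
    int ifindex;
};

struct qdisc_sketch {
    unsigned int handle;
    struct net_device_sketch *dev;   /* NULL for singleton qdiscs (e.g. noop) */
    struct qdisc_sketch *hash_next;  /* stand-in for the per-netdev hashtable */
};

/*
 * Look up a handle starting from root. A singleton qdisc has no
 * underlying device, hence no per-device table to walk, so it is
 * compared directly instead of dereferencing a NULL device.
 */
static struct qdisc_sketch *
match_from_root_sketch(struct qdisc_sketch *root, unsigned int handle)
{
    struct qdisc_sketch *q;

    if (root == NULL)
        return NULL;

    if (root->dev == NULL)  /* singleton: handle it immediately */
        return root->handle == handle ? root : NULL;

    for (q = root; q != NULL; q = q->hash_next)
        if (q->handle == handle)
            return q;

    return NULL;
}
```

With the early return in place, a lookup on a device-less root never reaches the per-device walk.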
      
      Fixes: 59cc1f61 ("net: sched: convert qdisc linked list to hashtable")
      Reported-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Daniel Borkmann <daniel@iogearbox.net>
      Reported-by: David Ahern <dsa@cumulusnetworks.com>
      Tested-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tipc-next' · e951f145
      David S. Miller authored
      Jon Maloy says:
      
      ====================
      tipc: bearer and link improvements
      
      The first commit makes it possible to set and check the 'blocked' state
      of a bearer from the generic bearer layer. The second commit is a small
      improvement to the link congestion mechanism.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: ensure that link congestion and wakeup use same criteria · 5a0950c2
      Jon Paul Maloy authored
      When an attempt is made to wake up a link after congestion, it uses
      different, more generous criteria than when it was originally declared
      congested. As a result, the link and the sending process are sometimes
      woken up unnecessarily, only to return to congestion immediately
      when it turns out there is not enough space in the send queue to
      host the pending message. This is a waste of CPU cycles.
      
      We now change the function link_prepare_wakeup() to use exactly the same
      criteria as tipc_link_xmit(). However, since we are now excluding the
      window limit from the wakeup calculation, and the current backlog limit
      for the lowest level is too small to house even a single maximum-size
      message, we have to expand this limit. We do this by evaluating an
      alternative, minimum value during the setting of the importance limits.
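
The idea of using one criterion for both decisions, plus a minimum backlog limit, can be sketched as follows. This is a rough userspace illustration, not the TIPC code: `struct link_sketch`, `IMP_LEVELS`, the per-level scaling, and all function names here are assumptions made for the example.

```c
#include <stdbool.h>

#define IMP_LEVELS 4  /* hypothetical number of importance levels */

struct link_sketch {
    unsigned int backlog_len[IMP_LEVELS];
    unsigned int backlog_limit[IMP_LEVELS];
};

/* The single criterion shared by the send path and the wakeup path. */
static bool backlog_full(const struct link_sketch *l, unsigned int imp)
{
    return l->backlog_len[imp] >= l->backlog_limit[imp];
}

/* Send path: the link is congested when the backlog for this level is full. */
static bool link_is_congested(const struct link_sketch *l, unsigned int imp)
{
    return backlog_full(l, imp);
}

/* Wakeup path: wake a sender only if it would pass the very same test. */
static bool link_may_wake(const struct link_sketch *l, unsigned int imp)
{
    return !backlog_full(l, imp);
}

/* Give every level room for at least one maximum-size message. */
static void link_set_limits(struct link_sketch *l, unsigned int base,
                            unsigned int max_msg_pkts)
{
    unsigned int imp;

    for (imp = 0; imp < IMP_LEVELS; imp++) {
        unsigned int limit = base * (imp + 1);  /* hypothetical scaling */

        l->backlog_limit[imp] = limit > max_msg_pkts ? limit : max_msg_pkts;
    }
}
```

Because both paths call `backlog_full()`, a sender is never woken in a state that the send path would immediately reject.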
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: make bearer packet filtering generic · 0d051bf9
      Jon Paul Maloy authored
      In commit 5b7066c3 ("tipc: stricter filtering of packets in bearer
      layer") we introduced a method of filtering out messages while a bearer
      is being reset, so that links cannot be re-created and come back into
      working state while we are still in the process of shutting them down.
      
      This solution works well, but only with L2 media, which is insufficient
      given the increasing use of UDP as carrier media.
      
      We now replace this solution with a more generic one, by introducing a
      new flag "up" in the generic struct tipc_bearer. This field will be set
      and reset at the same locations as with the previous solution, while
      the packet filtering is moved to the generic code for the sending side.
      On the receiving side, the filtering is still done in media specific
      code, but now including the UDP bearer.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'qed-next' · 37bd91d1
      David S. Miller authored
      Sudarsana Reddy Kalluru says:
      
      ====================
      qed*: Add support for additional statistics.
      
      The patch series adds qed/qede support for new statistics.
      Patch (1) adds a couple of statistics for "ethtool -S" display.
      Patch (2) adds support for per-queue statistics to ethtool display.
      Patch (3) adds qed support for NCSI statistics.
      
      Please consider applying this to 'net-next' branch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Add support for NCSI statistics. · 6c754246
      Sudarsana Reddy Kalluru authored
      The patch adds driver support for sending the NCSI statistics to the
      MFW. This is an asynchronous request from the MFW. Upon receiving it, the
      driver populates the required data and sends it to the MFW.
      Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Add support for capturing additional stats in ethtool-stats display. · 1a5a366f
      Sudarsana Reddy Kalluru authored
      The patch adds driver support for capturing stats ttl0_discard and
      packet_too_big_discard in "ethtool -S" display.
      Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atm: remove redundant null pointer check on dev->name · 0d135e4f
      Colin Ian King authored
      dev->name is a char array of IFNAMSIZ elements, hence can never be
      null, so the null pointer check is redundant. Remove it.
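
A small userspace sketch of why the check is redundant: an array member decays to the address of storage inside the structure, which cannot be NULL for a valid object, so the only meaningful test is on the contents. `struct net_device_sketch` and `dev_name_is_empty()` are hypothetical names used for illustration only.

```c
#include <stddef.h>

#define IFNAMSIZ 16

/* Hypothetical stand-in: the name is stored inline, not behind a pointer. */
struct net_device_sketch {
    char name[IFNAMSIZ];
};

/*
 * `dev->name` is the address of the embedded array, so testing it
 * against NULL is always true for a valid dev. Checking the first
 * character is what actually distinguishes an unset name.
 */
static int dev_name_is_empty(const struct net_device_sketch *dev)
{
    return dev->name[0] == '\0';
}
```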
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Update copyright info · e202d4c6
      Appana Durga Kedareswara Rao authored
      Most of the input for implementing this driver was
      provided by Andrew Lunn.
      
      Update the driver with Andrew's copyright.
      Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: macb: Add support for rx_clk · aead88bd
      shubhrajyoti.datta@xilinx.com authored
      Some platforms, such as ZynqMP UltraScale+, have a
      separate clock gate for the rx clock. Add an optional
      rx_clk so that the clock can be enabled.
      Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · d52bfbda
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2016-08-18
      
      This series contains updates to i40e and i40evf only.
      
      Wei Yongjun updates i40e to use list_move() instead of list_del() &
      list_add() operations.
      
      Anjali fixes an issue where the client->open call was not protected with
      the client instance mutex, which allowed client->close to be called before
      the open call completed.
      
      Catherine makes sure that the VLAN count (and stats) gets reset to 0
      after reset.
      
      Jake provides two patches. The first adds the needed rtnl lock around
      i40evf_set_interrupt_capability() since i40evf_init_task() does not
      hold the rtnl_lock. The second fixes an issue where users could reduce
      the number of channels (queues) below the current flow director
      filter rule targets.
      
      Dave fixes a static analysis warning by eliminating the irrelevant
      check and redundant assignment for the value of enabled_tc.
      
      Avinash fixes a sync issue where the iWARP device open is called
      before the PCI register writes are completed, by ensuring the register
      writes complete before exiting the setup function.
      
      Alan fixes a bug which causes RSS to continue to work after being
      disabled.
      
      Carolyn implements a feature change which allows using ethtool to set
      RSS hash options using less than four parameters if desired.
      
      Dan Carpenter cleans up a stray unlock.
      
      Sridhar exposes the "trust" flag to userspace via ndo_get_vf_config().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 18 Aug, 2016 17 commits
  3. 17 Aug, 2016 11 commits
    • Merge branch 'strparser' · 48433419
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      strp: Stream parser for messages
      
      This patch set introduces a utility for parsing application layer
      protocol messages in a TCP stream. This is a generalization of the
      mechanism implemented in the Kernel Connection Multiplexor (KCM).
      
      This patch set adapts KCM to use the strparser. We expect that kTLS
      can use this mechanism also. RDS would probably be another candidate
      to use a common stream parsing mechanism.
      
      The API includes a context structure, a set of callbacks, utility
      functions, and a data ready function. The callbacks include
      a parse_msg function that is called to perform parsing (e.g.
      BPF parsing in case of KCM), and a rcv_msg function that is called
      when a full message has been completed.
      
      For strparser we specify the return codes from the parser to allow
      the backend to indicate that control of the socket should be
      transferred back to userspace to handle some exceptions in the
      stream. The return values are:
      
         >0 : indicates length of successfully parsed message
         0  : indicates more data must be received to parse the message
         -ESTRPIPE : current message should not be processed by the
            kernel, return control of the socket to userspace which
            can proceed to read the messages itself
         other < 0 : Error in parsing, give control back to userspace
            assuming that synchronization is lost and the stream
            is unrecoverable (application expected to close TCP socket)
      
      There is one issue I haven't been able to fully resolve. If parse_msg
      returns -ESTRPIPE (wants control back to userspace) the parser may
      already have consumed some bytes of the message. There is no way to
      put bytes back into the TCP receive queue and tcp_read_sock does not
      allow an easy way to peek messages. In lieu of a better solution, we
      return ENODATA on the socket to indicate that the data stream is
      unrecoverable (application needs to close socket). This condition
      should only happen if an application layer message header is split
      across two skbuffs and parsing just the first skbuff wasn't sufficient
      to determine that transfer to userspace is needed.
      
      This patch set contains:
      
        - strparser implementation
        - changes to kcm to use strparser
        - strparser.txt documentation
      
      v2:
        - Add copyright notice to C files
        - Remove GPL module license from strparser.c
        - Add report of rxpause
      
      v3:
        - Restore GPL module license
        - Use EXPORT_SYMBOL_GPL
      
      v4:
        - Removed unused function, changed another to be static as suggested
          by davem
        - Reworked data_ready to be called from upper layer, no longer requires
          taking over socket data_ready callback as suggested by Lance Chao
      
      Tested:
        - Ran a KCM thrash test for 24 hours. No behavioral or performance
          differences observed.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • strparser: Documentation · adcce4d5
      Tom Herbert authored
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • kcm: Use stream parser · 9b73896a
      Tom Herbert authored
      Adapt KCM to use the stream parser. This mostly involves removing
      the RX handling and setting up the strparser using the interface.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • strparser: Stream parser for messages · 43a0c675
      Tom Herbert authored
      This patch introduces a utility for parsing application layer protocol
      messages in a TCP stream. This is a generalization of the mechanism
      implemented in the Kernel Connection Multiplexor (KCM).
      
      The API includes a context structure, a set of callbacks, utility
      functions, and a data ready function.
      
      A stream parser instance is defined by a strparser structure that
      is bound to a TCP socket. The function to initialize the structure
      is:
      
      int strp_init(struct strparser *strp, struct sock *csk,
                    struct strp_callbacks *cb);
      
      csk is the TCP socket being bound to and cb are the parser callbacks.
      
      The upper layer calls strp_tcp_data_ready when data is ready on the lower
      socket for strparser to process. This should be called from a data_ready
      callback that is set on the socket:
      
      void strp_tcp_data_ready(struct strparser *strp);
      
      A parser is bound to a TCP socket by setting data_ready function to
      strp_tcp_data_ready so that all receive indications on the socket
      go through the parser. This assumes that sk_user_data is set to
      the strparser structure.
      
      There are four callbacks.
       - parse_msg is called to parse the message (returns length or error).
       - rcv_msg is called when a complete message has been received
       - read_sock_done is called when data_ready function exits
       - abort_parser is called to abort the parser
      
      The input to parse_msg is an skbuff which contains the next message under
      construction. The backend processing of parse_msg will parse the
      application layer protocol headers to determine the length of
      the message in the stream. The possible return values are:
      
         >0 : indicates length of successfully parsed message
         0  : indicates more data must be received to parse the message
         -ESTRPIPE : current message should not be processed by the
            kernel, return control of the socket to userspace which
            can proceed to read the messages itself
         other < 0 : Error in parsing, give control back to userspace
            assuming that synchronization is lost and the stream
            is unrecoverable (application expected to close TCP socket)
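
The return convention listed above can be illustrated with a userspace sketch. It assumes a hypothetical framing (a 2-byte big-endian length header that counts the whole message, with a length of 0 used as a made-up marker for "let userspace handle this") and operates on a plain buffer rather than an skbuff, so `parse_msg_sketch()` is not the real callback signature.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ESTRPIPE
#define ESTRPIPE 86  /* Linux errno value; fallback for other platforms */
#endif

#define HDR_LEN 2    /* hypothetical 2-byte big-endian length header */

/*
 * Returns:
 *   > 0        total message length (header included)
 *   0          header not complete yet, more data is needed
 *   -ESTRPIPE  hand this message (and the socket) back to userspace
 *   other < 0  malformed header, stream considered unrecoverable
 */
static int parse_msg_sketch(const uint8_t *buf, size_t avail)
{
    size_t len;

    if (avail < HDR_LEN)
        return 0;

    len = ((size_t)buf[0] << 8) | buf[1];

    if (len == 0)            /* assumed marker, not part of any real protocol */
        return -ESTRPIPE;
    if (len < HDR_LEN)       /* a message cannot be shorter than its header */
        return -EPROTO;

    return (int)len;
}
```

Note that a positive return only reports the message length; in this sketch the caller is still responsible for waiting until that many bytes have arrived.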
      
      In the case of error return (< 0) strparser will stop the parser
      and report an error to userspace. The application must deal
      with the error. To handle the error the strparser is unbound
      from the TCP socket. If the error indicates that the TCP
      stream is at a recoverable point (ESTRPIPE) then the application
      can read the TCP socket to process the stream. Once the application
      has dealt with the exceptions in the stream, it may again bind the
      socket to a strparser to continue data operations.
      
      Note that ENODATA may be returned to the application. In this case
      parse_msg returned -ESTRPIPE, however strparser was unable to maintain
      synchronization of the stream (i.e. some of the message in question
      was already read by the parser).
      
      strp_pause and strp_unpause are used to provide flow control. For
      instance, if rcv_msg is called but the upper layer can't immediately
      consume the message it can hold the message and pause strparser.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipconfig: Fix more use after free · d2d371ae
      Thierry Reding authored
      While commit 9c706a49 ("net: ipconfig: fix use after free") avoids
      the use after free, the resulting code still ends up calling both the
      ic_setup_if() and ic_setup_routes() after calling ic_close_devs(), and
      access to the device is still required.
      
      Move the call to ic_close_devs() to the very end of the function.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tc_action-fixes' · b96c22c0
      David S. Miller authored
      Cong Wang says:
      
      ====================
      net_sched: tc action fixes and updates
      
      This patchset fixes a few regressions caused by the previous
      code refactor and more. Thanks to Jamal for catching them!
      
      Note, patch 3/7 and 4/7 are not strictly necessary for this patchset,
      I just want to carry them together.
      
      ---
      v4: adjust an indentation for Jamal
          add two more patches
      
      v3: avoid list for fast path, suggested by Jamal
      
      v2: replace flex_array with regular dynamic array
          keep tcf_action_stats_update() in act_api.h
          fix macro typos found by Amir
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: allow flushing tc police actions · b5ac8518
      Roman Mashak authored
      act_police uses its own code to walk the
      action hashtable, which means standalone tc
      police actions cannot be flushed. Switch to
      tcf_generic_walker() like other actions.
      
      (Joint work from Roman and Cong.)
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: unify the init logic for act_police · 0852e455
      WANG Cong authored
      Jamal reported a crash when creating a police action
      with a specific index. The init logic is not correct:
      we should always create one in this case. Unify the
      logic with other tc actions.
      
      Fixes: a03e6fe5 ("act_police: fix a crash during removal")
      Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: convert tcf_exts from list to pointer array · 22dc13c8
      WANG Cong authored
      As pointed out by Jamal, an action could be shared by
      multiple filters, so we can't use list to chain them
      any more after we get rid of the original tc_action.
      Instead, we could just save pointers to these actions
      in tcf_exts, since they are refcount'ed, so convert
      the list to an array of pointers.
      
      The "ugly" part is the action API still accepts list
      as a parameter, I just introduce a helper function to
      convert the array of pointers to a list, instead of
      relying on the C99 feature to iterate the array.
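
A helper of the kind described above might look roughly like this in userspace C. The types and the function name `exts_to_list_sketch()` are hypothetical, and a singly linked list stands in for the kernel's list type; the point is only that the array holds the lasting references while the list is built temporarily for callers that still expect one.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a tc action and a filter's extension block. */
struct action_sketch {
    int id;
    struct action_sketch *next;  /* used only while a temporary list exists */
};

struct exts_sketch {
    struct action_sketch **actions;  /* array of pointers to shared actions */
    size_t nr_actions;
};

/*
 * Chain the actions referenced by the array into a temporary list,
 * preserving array order and skipping empty slots. Returns the head,
 * or NULL if the array holds no actions.
 */
static struct action_sketch *exts_to_list_sketch(const struct exts_sketch *exts)
{
    struct action_sketch *head = NULL;
    struct action_sketch **tail = &head;
    size_t i;

    for (i = 0; i < exts->nr_actions; i++) {
        struct action_sketch *a = exts->actions[i];

        if (a == NULL)
            continue;
        a->next = NULL;
        *tail = a;
        tail = &a->next;
    }

    return head;
}
```

Since each filter keeps its own pointer array, two filters can reference the same action without fighting over a permanent list linkage.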
      
      Fixes: a85a970a ("net_sched: move tc_action into tcf_common")
      Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: move tc offload macros to pkt_cls.h · 2734437e
      WANG Cong authored
      struct tcf_exts belongs to filters and should not be visible
      to plain tc actions.
      
      Cc: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: fix a typo in tc_for_each_action() · 0c23c3e7
      WANG Cong authored
      It is harmless because all users pass 'a' to this macro.
      
      Fixes: 00175aec ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
      Cc: Amir Vadai <amir@vadai.me>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>