Commits · 2db2d9d1ac37222cd9cd9b59bfd7f9cea3bd2457 · nexedi / linux

30 May, 2019 5 commits

net: phy: Guard against the presence of a netdev · 2db2d9d1

Ioana Ciornei authored May 28, 2019

A prerequisite for PHYLIB to work in the absence of a struct net_device
is to not access pointers to it.

Changes are needed in the following areas:

 - Printing: In some places netdev_err was replaced with phydev_err.

 - Incrementing reference count to the parent MDIO bus driver: If there
   is no net device, then the reference count should definitely be
   incremented since there is no chance that it was an Ethernet driver
   who registered the MDIO bus.

 - Sysfs links are not created in case there is no attached_dev.

 - No netif_carrier_off is done if there is no attached_dev.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2db2d9d1

net: phy: Add phy_sysfs_create_links helper function · 53cfca2d

Vladimir Oltean authored May 28, 2019

This is a cosmetic patch that wraps the operation of creating sysfs
links between the netdev->phydev and the phydev->attached_dev.

This is needed to keep the indentation level in check in a follow-up
patch where this function will be guarded against the existence of a
phydev->attached_dev.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

53cfca2d

net: sched: Introduce act_ctinfo action · 24ec483c

Kevin 'ldir' Darbyshire-Bryant authored May 28, 2019

ctinfo is a new tc filter action module.  It is designed to restore
information contained in firewall conntrack marks to other packet fields
and is typically used on packet ingress paths.  At present it has two
independent sub-functions or operating modes, DSCP restoration mode &
skb mark restoration mode.

The DSCP restore mode:

This mode copies DSCP values that have been placed in the firewall
conntrack mark back into the IPv4/v6 diffserv fields of relevant
packets.

The DSCP restoration is intended for use and has been found useful for
restoring ingress classifications based on egress classifications across
links that bleach or otherwise change DSCP, typically home ISP Internet
links.  Restoring DSCP on ingress on the WAN link allows qdiscs such as
but by no means limited to CAKE to shape inbound packets according to
policies that are easier to set & mark on egress.

Ingress classification is traditionally a challenging task since
iptables rules haven't yet run and tc filter/eBPF programs are pre-NAT
lookups, hence are unable to see internal IPv4 addresses as used on the
typical home masquerading gateway.  Thus marking the connection in some
manner on egress for later restoration of classification on ingress is
easier to implement.

Parameters related to DSCP restore mode:

dscpmask - a 32 bit mask of 6 contiguous bits and indicate bits of the
conntrack mark field contain the DSCP value to be restored.

statemask - a 32 bit mask of (usually) 1 bit length, outside the area
specified by dscpmask.  This represents a conditional operation flag
whereby the DSCP is only restored if the flag is set.  This is useful to
implement a 'one shot' iptables based classification where the
'complicated' iptables rules are only run once to classify the
connection on initial (egress) packet and subsequent packets are all
marked/restored with the same DSCP.  A mask of zero disables the
conditional behaviour ie. the conntrack mark DSCP bits are always
restored to the ip diffserv field (assuming the conntrack entry is found
& the skb is an ipv4/ipv6 type)

e.g. dscpmask 0xfc000000 statemask 0x01000000

|----0xFC----conntrack mark----000000---|
| Bits 31-26 | bit 25 | bit24 |~~~ Bit 0|
| DSCP       | unused | flag  |unused   |
|-----------------------0x01---000000---|
      |                   |
      |                   |
      ---|             Conditional flag
         v             only restore if set
|-ip diffserv-|
| 6 bits      |
|-------------|

The skb mark restore mode (cpmark):

This mode copies the firewall conntrack mark to the skb's mark field.
It is completely the functional equivalent of the existing act_connmark
action with the additional feature of being able to apply a mask to the
restored value.

Parameters related to skb mark restore mode:

mask - a 32 bit mask applied to the firewall conntrack mark to mask out
bits unwanted for restoration.  This can be useful where the conntrack
mark is being used for different purposes by different applications.  If
not specified and by default the whole mark field is copied (i.e.
default mask of 0xffffffff)

e.g. mask 0x00ffffff to mask out the top 8 bits being used by the
aforementioned DSCP restore mode.

|----0x00----conntrack mark----ffffff---|
| Bits 31-24 |                          |
| DSCP & flag|      some value here     |
|---------------------------------------|
			|
			|
			v
|------------skb mark-------------------|
|            |                          |
|  zeroed    |                          |
|---------------------------------------|

Overall parameters:

zone - conntrack zone

control - action related control (reclassify | pipe | drop | continue |
ok | goto chain <CHAIN_INDEX>)
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24ec483c

r8169: remove 1000/Half from supported modes · a6851c61

Heiner Kallweit authored May 28, 2019

MAC on the GBit versions supports 1000/Full only, however the PHY
partially claims to support 1000/Half. So let's explicitly remove
this mode.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6851c61

net: mscc: ocelot: Implement port policers via tc command · 2c1d029a

Joergen Andreasen authored May 28, 2019

Hardware offload of matchall classifier and police action are now
supported via the tc command.
Supported police parameters are: rate and burst.

Example:

Add:
tc qdisc add dev eth3 handle ffff: ingress
tc filter add dev eth3 parent ffff: prio 1 handle 2	\
	matchall skip_sw				\
	action police rate 100Mbit burst 10000

Show:
tc -s -d qdisc show dev eth3
tc -s -d filter show dev eth3 ingress

Delete:
tc filter del dev eth3 parent ffff: prio 1
tc qdisc del dev eth3 handle ffff: ingress
Signed-off-by: Joergen Andreasen <joergen.andreasen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2c1d029a

29 May, 2019 35 commits

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 7da33a8f

David S. Miller authored May 29, 2019

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-05-29

This series contains updates to ice driver only.

Bruce cleans up white space issues and fixes complaints about using
bitop assignments using operands of different sizes.

Anirudh cleans up code that is no longer needed now that the firmware
supports the functionality.  Adds support for ethtool selftestto the ice
driver, which includes testing link, interrupts, eeprom, registers and
packet loopback.  Also, cleaned up duplicate code.

Tony implements support for toggling receive VLAN filter via ethtool.

Brett bumps up the minimum receive descriptor count per queue to resolve
dropped packets.  Refactored the interrupt tracking for the ice driver
to resolve issues seen with the co-existence of features and SR-IOV, so
instead of having a hardware IRQ tracker and a software IRQ tracker,
simply use one tracker.  Also adds a helper function to trigger software
interrupts.

Mitch changes how Malicious Driver Detection (MDD) events are handled,
to ensure all VFs checked for MDD events and just log the event instead
of disabling the VF, which was preventing proper release of resources if
the VF is rebooted or the VF driver reloaded.

Dave cleans up a redundant call to register LLDP MIB change events.

Dan adds support to retrieve the current setting of firmware logging
from the hardware to properly initialize the hardware structure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

7da33a8f

net: stmmac: Fix build error without CONFIG_INET · a3e2f6ad

YueHaibing authored May 28, 2019

Fix gcc build error while CONFIG_INET is not set

drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.o: In function `__stmmac_test_loopback':
stmmac_selftests.c:(.text+0x8ec): undefined reference to `ip_send_check'
stmmac_selftests.c:(.text+0xacc): undefined reference to `udp4_hwcsum'

Add CONFIG_INET dependency to fix this.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 091810db ("net: stmmac: Introduce selftests support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3e2f6ad

rhashtable: Add rht_ptr_rcu and improve rht_ptr · 279758f8

Herbert Xu authored May 28, 2019

This patch moves common code between rht_ptr and rht_ptr_exclusive
into __rht_ptr. It also adds a new helper rht_ptr_rcu exclusively
for the RCU case. This way rht_ptr becomes a lock-only construct
so we can use the lighter rcu_dereference_protected primitive.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

279758f8

net: stmmac: use dev_info() before netdev is registered · af649352

Jisheng Zhang authored May 28, 2019

Before the netdev is registered, calling netdev_info() will emit
something as "(unnamed net device) (uninitialized)", looks confusing.

Before this patch:
[    3.155028] stmmaceth f7b60000.ethernet (unnamed net_device) (uninitialized): device MAC address 52:1a:55:18:9e:9d

After this patch:
[    3.155028] stmmaceth f7b60000.ethernet: device MAC address 52:1a:55:18:9e:9d
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af649352

qed: fix spelling mistake "inculde" -> "include" · 1b3855ab

Colin Ian King authored May 28, 2019

There is a spelling mistake in a DP_INFO message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1b3855ab

ice: Add a helper to trigger software interrupt · e89e899f

Brett Creeley authored Apr 16, 2019

Add a new function ice_trigger_sw_intr to trigger interrupts.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e89e899f

ice: Configure RSS LUT key only if RSS is enabled · 3a9e32bb

Md Fahad Iqbal Polash authored Apr 16, 2019

Call ice_vsi_cfg_rss_lut_key only if RSS is enabled.
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

3a9e32bb

ice: Add ice_get_fw_log_cfg to init FW logging · 11fe1b3a

Dan Nowlin authored Apr 16, 2019

In order to initialize the current status of the FW logging,
this patch adds ice_get_fw_log_cfg. The function retrieves
the current setting of the FW logging from HW and updates the
ice_hw structure accordingly.
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

11fe1b3a

ice: Minor cleanup in ice_switch.h · 1eb11036

Anirudh Venkataramanan authored Apr 16, 2019

Remove duplicate define for ICE_INVAL_Q_HANDLE. Move defines to the
top of the file.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

1eb11036

ice: Remove redundant and premature event config · 91aed40d

Dave Ertman authored Apr 16, 2019

In the path for re-enabling FW LLDP engine, there is
a call to register for LLDP MIB change events.  This
call is redundant, in that the call to ice_pf_dcb_cfg
will already register the driver for these events.  Also,
the call as it stands now is too early in the flow before
before DCB is configured.

Remove the redundant call.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

91aed40d

ice: Change message level · 4cc82aaa

Mitch Williams authored Apr 16, 2019

Change the message level of the MTU change log message from debug to
info.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

4cc82aaa

ice: Check all VFs for MDD activity, don't disable · 23c01122

Mitch Williams authored Apr 16, 2019

Don't use the mdd_detected variable as an exit condition for this loop;
the first VF to NOT have an MDD event will cause the loop to terminate.

Instead just look at all of the VFs, but don't disable them. This
prevents proper release of resources if the VFs are rebooted or the VF
driver reloaded. Instead, just log a message and call out repeat
offenders.

To make it clear what we are doing, use a differently-named variable in
the loop.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

23c01122

ice: Refactor interrupt tracking · cbe66bfe

Brett Creeley authored Apr 16, 2019

Currently we have two MSI-x (IRQ) trackers, one for OS requested MSI-x
entries (sw_irq_tracker) and one for hardware MSI-x vectors
(hw_irq_tracker). Generally the sw_irq_tracker has less entries than the
hw_irq_tracker because the hw_irq_tracker has entries equal to the max
allowed MSI-x per PF and the sw_irq_tracker is mainly the minimum (non
SR-IOV portion of the vectors, kernel granted IRQs). All of the non
SR-IOV portions of the driver (i.e. LAN queues, RDMA queues, OICR, etc.)
take at least one of each type of tracker resource. SR-IOV only grabs
entries from the hw_irq_tracker. There are a few issues with this approach
that can be seen when doing any kind of device reconfiguration (i.e.
ethtool -L, SR-IOV, etc.). One of them being, any time the driver creates
an ice_q_vector and associates it to a LAN queue pair it will grab and
use one entry from the hw_irq_tracker and one from the sw_irq_tracker.
If the indices on these does not match it will cause a Tx timeout, which
will cause a reset and then the indices will match up again and traffic
will resume. The mismatched indices come from the trackers not being the
same size and/or the search_hint in the two trackers not being equal.
Another reason for the refactor is the co-existence of features with
SR-IOV. If SR-IOV is enabled and the interrupts are taken from the end
of the sw_irq_tracker then other features can no longer use this space
because the hardware has now given the remaining interrupts to SR-IOV.

This patch reworks how we track MSI-x vectors by removing the
hw_irq_tracker completely and instead MSI-x resources needed for SR-IOV
are determined all at once instead of per VF. This can be done because
when creating VFs we know how many are wanted and how many MSI-x vectors
each VF needs. This also allows us to start using MSI-x resources from
the end of the PF's allowed MSI-x vectors so we are less likely to use
entries needed for other features (i.e. RDMA, L2 Offload, etc).

This patch also reworks the ice_res_tracker structure by removing the
search_hint and adding a new member - "end". Instead of having a
search_hint we will always search from 0. The new member, "end", will be
used to manipulate the end of the ice_res_tracker (specifically
sw_irq_tracker) during runtime based on MSI-x vectors needed by SR-IOV.
In the normal case, the end of ice_res_tracker will be equal to the
ice_res_tracker's num_entries.

The sriov_base_vector member was added to the PF structure. It is used
to represent the starting MSI-x index of all the needed MSI-x vectors
for all SR-IOV VFs. Depending on how many MSI-x are needed, SR-IOV may
have to take resources from the sw_irq_tracker. This is done by setting
the sw_irq_tracker->end equal to the pf->sriov_base_vector. When all
SR-IOV VFs are removed then the sw_irq_tracker->end is reset back to
sw_irq_tracker->num_entries. The sriov_base_vector, along with the VF's
number of MSI-x (pf->num_vf_msix), vf_id, and the base MSI-x index on
the PF (pf->hw.func_caps.common_cap.msix_vector_first_id), is used to
calculate the first HW absolute MSI-x index for each VF, which is used
to write to the VPINT_ALLOC[_PCI] and GLINT_VECT2FUNC registers to
program the VFs MSI-x PCI configuration bits. Also, the sriov_base_vector
is used along with VF's num_vf_msix, vf_id, and q_vector->v_idx to
determine the MSI-x register index (used for writing to GLINT_DYN_CTL)
within the PF's space.

Interrupt changes removed any references to hw_base_vector, hw_oicr_idx,
and hw_irq_tracker. Only sw_base_vector, sw_oicr_idx, and sw_irq_tracker
variables remain. Change all of these by removing the "sw_" prefix to
help avoid confusion with these variables and their use.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

cbe66bfe

ice: Add handler for ethtool selftest · 0e674aeb

Anirudh Venkataramanan authored Apr 16, 2019

This patch adds a handler for ethtool selftest. Selftest includes
testing link, interrupts, eeprom, registers and packet loopback.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

0e674aeb

ice: Don't call ice_cfg_itr() for SR-IOV · 4b6f3eca

Brett Creeley authored Apr 16, 2019

ice_cfg_itr() sets the ITR granularity and default ITR values for the
PF's interrupt vectors. For VF's this will be done in the AVF driver
flow. Fix this by not calling ice_cfg_itr() for SR-IOV.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

4b6f3eca

ice: Set minimum default Rx descriptor count to 512 · 1aec6e1b

Brett Creeley authored Apr 16, 2019

Currently we set the default number of Rx descriptors per
queue to the system's page size divided by the number of bytes per
descriptor. For 4K page size systems this is resulting in 128 Rx
descriptors per queue. This is causing more dropped packets than desired
in the default configuration. Fix this by setting the minimum default
Rx descriptor count per queue to 512.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

1aec6e1b

ice: Resolve static analysis warning · e65e9e15

Bruce Allan authored Apr 16, 2019

Some static analysis tools can complain when doing a bitop assignment using
operands of different sizes. Fix that.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e65e9e15

ice: Implement toggling ethtool rx-vlan-filter · 3171948e

Tony Nguyen authored Apr 16, 2019

Implement the toggling of rx-vlan-filter; enable|disable VLAN
pruning based on on|off, respectively.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

3171948e

ice: Remove direct write for GLLAN_RCTL_0 · 588d511f

Anirudh Venkataramanan authored Apr 16, 2019

Clear PXE mode AQ call (opcode 0x0110) is now supported in FW. So
remove the direct register write to GLLAN_RCTL_0.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

588d511f

ice: Fix LINE_SPACING style issue · 95f8e8b9

Bruce Allan authored Apr 16, 2019

Fix a checkpatch "LINE_SPACING: Please don't use multiple blank lines"
issue that has snuck in to the code.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

95f8e8b9

Merge branch 'qed-Fix-inifinite-spinning-of-PTP-poll-thread' · 1167187f

David S. Miller authored May 29, 2019

Sudarsana Reddy Kalluru says:

====================
qed*: Fix inifinite spinning of PTP poll thread.

The patch series addresses an error scenario in the PTP Tx implementation.

Please consider applying it to net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1167187f

qede: Handle infinite driver spinning for Tx timestamp. · 9adebac3

Sudarsana Reddy Kalluru authored May 27, 2019

In PTP Tx implementation, driver kept scheduling a poll thread until the
timestamp is available. In the error scenarios (e.g. app requesting the
timestamp for non-ptp packet), this thread kept waiting for the timestamp
forever. This patch add changes to report such scenario as an error and
terminate the thread. Added a timeout of 2 seconds i.e., max time to wait
for Tx timestamp. Added a stat value ptp_skip_txts for reporting the number
of packets for which Tx timestamping is skipped.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9adebac3

qed: Reduce the severity of ptp debug message. · 24c6203b

Sudarsana Reddy Kalluru authored May 27, 2019

PTP Tx implementation continuously polls for the availability of timestamp.
Reducing the severity of a debug message in this path to avoid filling up
the syslog buffer with this message, especially in the error scenarios.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24c6203b

macvlan: Replace strncpy() by strscpy() · 36f18439

Gustavo A. R. Silva authored May 27, 2019

The strncpy() function is being deprecated. Replace it by the safer
strscpy() and fix the following Coverity warning:

"Calling strncpy with a maximum size argument of 16 bytes on destination
array ifrr.ifr_ifrn.ifrn_name of size 16 bytes might leave the destination
string unterminated."

Notice that, unlike strncpy(), strscpy() always null-terminates the
destination string.

Addresses-Coverity-ID: 1445537 ("Buffer not null terminated")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

36f18439

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · be1b5b78

David S. Miller authored May 28, 2019

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2019-05-28

This series contains updates to e1000e, igb and igc.

Feng adds additional information on a warning message when a read of a
hardware register fails.

Gustavo A. R. Silva fixes up two "fall through" code comments so that
the checkers can actually determine that we did comment that the case
statement is falling through to the next case.

Sasha does some cleanup on the igc driver by removing duplicate
white space and removed a unneeded workaround for igc.  Adds support for
flow control to the igc driver.

Konstantin Khlebnikov reverts a previous fix which was causing a false
positive for a hardware hang.  Provides a fix so that when link is lost
the packets in the transmit queue are flushed and wakes the transmit
queue when the NIC is ready to send packets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

be1b5b78

Merge branch 'net-API-and-initial-implementation-for-nexthop-objects' · c38e57ae

David S. Miller authored May 28, 2019

David Ahern says:

====================
net: API and initial implementation for nexthop objects

This set contains the API and initial implementation for nexthops as
standalone objects.

Patch 1 contains the UAPI and updates to selinux struct.

Patch 2 contains the barebones code for nexthop commands, rbtree
maintenance and notifications.

Patch 3 then adds support for IPv4 gateways along with handling of
netdev events.

Patch 4 adds support for IPv6 gateways.

Patch 5 has the implementation of the encap attributes.

Patch 6 adds support for nexthop groups.

At the end of this set, nexthop objects can be created and deleted and
userspace can monitor nexthop events, but ipv4 and ipv6 routes can not
use them yet. Once the nexthop struct is defined, follow on sets add it
to fib{6}_info and handle it within the respective code before routes
can be inserted using them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c38e57ae

nexthop: Add support for nexthop groups · 430a0491

David Ahern authored May 24, 2019

Allow the creation of nexthop groups which reference other nexthop
objects to create multipath routes:

                      +--------------+
   +------------+   +--------------+ |
   | nh  nh_grp --->| nh_grp_entry |-+
   +------------+   +---------|----+
     ^                |       |    +------------+
     +----------------+       +--->| nh, weight |
        nh_parent                  +------------+

A group entry points to a nexthop with a weight for that hop within the
group. The nexthop has a list_head, grp_list, for tracking which groups
it is a member of and the group entry has a reference back to the parent.
The grp_list is used when a nexthop is deleted - to efficiently remove
it from groups using it.

If a nexthop group spec is given, no other attributes can be set. Each
nexthop id in a group spec must already exist.

Similar to single nexthops, the specification of a nexthop group can be
updated so that data is managed with rcu locking.

Add path selection function to account for multiple paths and add
ipv{4,6}_good_nh helpers to know that if a neighbor entry exists it is
in a good state.

Update NETDEV event handling to rebalance multipath nexthop groups if
a nexthop is deleted due to a link event (down or unregister).

When a nexthop is removed any groups using it are updated. Groups using a
nexthop a tracked via a grp_list.

Nexthop dumps can be limited to groups only by adding NHA_GROUPS to the
request.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

430a0491

nexthop: Add support for lwt encaps · b513bd03

David Ahern authored May 24, 2019

Add support for NHA_ENCAP and NHA_ENCAP_TYPE. Leverages the existing code
for lwtunnel within fib_nh_common, so the only change needed is handling
the attributes in the nexthop code.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b513bd03

nexthop: Add support for IPv6 gateways · 53010f99

David Ahern authored May 24, 2019

Handle IPv6 gateway in a nexthop spec. If nh_family is set to AF_INET6,
NHA_GATEWAY is expected to be an IPv6 address. Add ipv6 option to gw in
nh_config to hold the address, add fib6_nh to nh_info to leverage the
ipv6 initialization and cleanup code. Update nh_fill_node to dump the v6
address.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

53010f99

nexthop: Add support for IPv4 nexthops · 597cfe4f

David Ahern authored May 24, 2019

Add support for IPv4 nexthops. If nh_family is set to AF_INET, then
NHA_GATEWAY is expected to be an IPv4 address.

Register for netdev events to be notified of admin up/down changes as
well as deletes. A hash table is used to track nexthop per devices to
quickly convert device events to the affected nexthops.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

597cfe4f

net: Initial nexthop code · ab84be7e

David Ahern authored May 24, 2019

Barebones start point for nexthops. Implementation for RTM commands,
notifications, management of rbtree for holding nexthops by id, and
kernel side data structures for nexthops and nexthop config.

Nexthops are maintained in an rbtree sorted by id. Similar to routes,
nexthops are configured per namespace using netns_nexthop struct added
to struct net.

Nexthop notifications are sent when a nexthop is added or deleted,
but NOT if the delete is due to a device event or network namespace
teardown (which also involves device events). Applications are
expected to use the device down event to flush nexthops and any
routes used by the nexthops.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab84be7e

net: nexthop uapi · 65ee00a9

David Ahern authored May 24, 2019

New UAPI for nexthops as standalone objects:
- defines netlink ancillary header, struct nhmsg
- RTM commands for nexthop objects, RTM_*NEXTHOP,
- RTNLGRP for nexthop notifications, RTNLGRP_NEXTHOP,
- Attributes for creating nexthops, NHA_*
- Attribute for route specs to specify a nexthop by id, RTA_NH_ID.

The nexthop attributes and semantics follow the route and RTA ones for
device, gateway and lwt encap. Unique to nexthop objects are a blackhole
and a group which contains references to other nexthop objects. With the
exception of blackhole and group, nexthop objects MUST contain a device.
Gateway and encap are optional. Nexthop groups can only reference other
pre-existing nexthops by id. If the NHA_ID attribute is present that id
is used for the nexthop. If not specified, one is auto assigned.

Dump requests can include attributes:
- NHA_GROUPS to return only nexthop groups,
- NHA_MASTER to limit dumps to nexthops with devices enslaved to the
  given master (e.g., VRF)
- NHA_OIF to limit dumps to nexthops using given device

nlmsg_route_perms in selinux code is updated for the new RTM comands.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

65ee00a9

Merge branch 'hns3-next' · 602e0f29

David S. Miller authored May 28, 2019

Huazhong Tan says:

====================
code optimizations & bugfixes for HNS3 driver

This patch-set includes code optimizations and bugfixes for the HNS3
ethernet controller driver.

[patch 1/12] fixes a compile warning reported by kbuild test robot.

[patch 2/12] fixes HNS3_RXD_GRO_SIZE_M macro definition error.

[patch 3/12] adds a debugfs command to dump firmware information.

[patch 4/12 - 10/12] adds some code optimizaions and cleanups for
reset and driver unloading.

[patch 11/12 - 12/12] adds two bugfixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

602e0f29

net: hns3: fix a memory leak issue for hclge_map_unmap_ring_to_vf_vector · 49f971bd

Huazhong Tan authored May 28, 2019

When hclge_bind_ring_with_vector() fails,
hclge_map_unmap_ring_to_vf_vector() returns the error
directly, so nobody will free the memory allocated by
hclge_get_ring_chain_from_mbx().

So hclge_free_vector_ring_chain() should be called no matter
hclge_bind_ring_with_vector() fails or not.

Fixes: 84e095d6 ("net: hns3: Change PF to add ring-vect binding & resetQ to mailbox")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

49f971bd

net: hns3: adjust hns3_uninit_phy()'s location in the hns3_client_uninit() · 0d2f68c7

Huazhong Tan authored May 28, 2019

hns3_uninit_phy() should be called before checking
HNS3_NIC_STATE_INITED flags, otherwise when this checking fails,
there is nobody to call hns3_uninit_phy().

Fixes: c8a8045b ("net: hns3: Fix NULL deref when unloading driver")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0d2f68c7