Commits · 0c197f16f7bc5ddb43073690a80fb15998ad61e4 · nexedi / linux

24 Apr, 2020 22 commits

mac80211: agg-tx: add an option to defer ADDBA transmit · 0c197f16

Mordechay Goodstein authored Mar 26, 2020

Driver tells mac80211 to sends ADDBA with SSN (starting sequence number)
from the head of the queue, while the transmission of all the frames in the
queue may take a while, which causes the peer to time out. In order to
fix this scenario, add an option to defer ADDBA transmit until queue
is drained.
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.0f27423fec75.If67daab123a27c1cbddef000d6a3f212aa6309ef@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

0c197f16

mac80211: agg-tx: refactor sending addba · 31d8bb4e

Mordechay Goodstein authored Mar 26, 2020

We move the actual arming the timer and sending ADDBA to a function
for the use in different places calling the same logic.
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.58a337eb90a1.I75934e6464535fbf43969acc796bc886291e79a5@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

31d8bb4e

mac80211: Skip entries with HE membership selector · 4826e721

Ilan Peer authored Mar 26, 2020

When parsing supported rates IE.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.ed3e66f8c197.I93aad0e5ddb7ce79f05f8153922acb9aa5076d38@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

4826e721

cfg80211: Parse HE membership selector · 2a392596

Ilan Peer authored Mar 26, 2020

This extends the support for drivers that rebuilds IEs in the
FW (same as with HT/VHT).
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.20feaabfb484.I886252639604c8e3e84b8ef97962f1b0e4beec81@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

2a392596

mac80211: Don't destroy auth data in case of anti-clogging · a4055e74

Andrei Otcheretianski authored Mar 26, 2020

SAE AP may reject authentication with WLAN_STATUS_ANTI_CLOG_REQUIRED.
As the user space will immediately continue the authentication flow,
there is no need to destroy the authentication data in this case.
This saves unneeded station removal and releasing the channel.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.7483996157a8.I8040a842874aaf6d209df3fc8a2acb97a0bf508b@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

a4055e74

mac80211: add twt_protected flag to the bss_conf structure · d46b4ab8

Shaul Triebitz authored Mar 26, 2020

Add a flag to the BSS conf whether the BSS and STA support protected TWT.
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.1dcb2d16fa74.I74d7c007dad2601d2e39f54612fe6554dd5ab386@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

d46b4ab8

mac80211: implement Operating Mode Notification extended NSS support · 9166cc49

Johannes Berg authored Mar 26, 2020

Somehow we missed this for a long time, but similar to the extended
NSS support in VHT capabilities, we need to have this in Operating
Mode notification.

Implement it by
 * parsing the 160/80+80 bit there and setting the bandwidth
   appropriately
 * having callers of ieee80211_get_vht_max_nss() pass in the current
   max NSS value as received in the operating mode notification in
   order to modify it appropriately depending on the extended NSS
   bits.

This updates all drivers that use it, i.e. only iwlwifi/mvm.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.098483728cfa.I4e8c25d3288441759c2793247197229f0696a37d@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

9166cc49

mac80211: Process multicast RX registration for Action frames · 873b1cf6

Jouni Malinen authored Apr 21, 2020

Convert a user space registration for processing multicast Action frames
(NL80211_CMD_REGISTER_FRAME with NL80211_ATTR_RECEIVE_MULTICAST) to a
new enum ieee80211_filter_flags bit FIF_MCAST_ACTION so that drivers can
update their RX filter parameters appropriately, if needed.
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Link: https://lore.kernel.org/r/20200421144815.19175-1-jouni@codeaurora.org
[rename variables to rx_mcast_action_reg indicating action frames only]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

873b1cf6

nl80211: allow client-only BIGTK support · 155d7c73

Johannes Berg authored Apr 20, 2020

The current NL80211_EXT_FEATURE_BEACON_PROTECTION feature flag
requires both AP and client support, add a new one called
NL80211_EXT_FEATURE_BEACON_PROTECTION_CLIENT that enables only
support in client (and P2P-client) modes.

Link: https://lore.kernel.org/r/20200420140559.6ba704053a5a.Ifeb869fb0b48e52fe0cb9c15572b93ac8a924f8d@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

155d7c73

cfg80211: support multicast RX registration · 9dba48a6

Johannes Berg authored Apr 17, 2020

For DPP, there's a need to receive multicast action frames,
but many drivers need a special filter configuration for this.

Support announcing from userspace in the management registration
that multicast RX is required, with an extended feature flag if
the driver handles this.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Link: https://lore.kernel.org/r/20200417124013.c46238801048.Ib041d437ce0bff28a0c6d5dc915f68f1d8591002@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

9dba48a6

cfg80211: change internal management frame registration API · 6cd536fe

Johannes Berg authored Apr 17, 2020

Almost all drivers below cfg80211 get the API wrong (except for
cfg80211) and are unable to cope with multiple registrations for
the same frame type, which is valid due to the match filter.
This seems to indicate the API is wrong, and we should maintain
the full information in cfg80211 instead of the drivers.

Change the API to no longer inform the driver about individual
registrations and unregistrations, but rather every time about
the entire state of the entire wiphy and single wdev, whenever
it may have changed. This also simplifies the code in cfg80211
as it no longer has to track exactly what was unregistered and
can free things immediately.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Link: https://lore.kernel.org/r/20200417124300.f47f3828afc8.I7f81ef59c2c5a340d7075fb3c6d0e08e8aeffe07@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

6cd536fe

mac80211: Report beacon protection failures to user space · 9eaf183a

Jouni Malinen authored Apr 01, 2020

Report received Beacon frames that do not have a valid MME MIC when
beacon protection is enabled. This covers both the cases of no MME in
the received frame and invalid MIC in the MME.
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Link: https://lore.kernel.org/r/20200401142548.6990-2-jouni@codeaurora.orgSigned-off-by: Johannes Berg <johannes.berg@intel.com>

9eaf183a

cfg80211: Unprotected Beacon frame RX indication · 4d797fce

Jouni Malinen authored Apr 01, 2020

Extend cfg80211_rx_unprot_mlme_mgmt() to cover indication of unprotected
Beacon frames in addition to the previously used Deauthentication and
Disassociation frames. The Beacon frame case is quite similar, but has
couple of exceptions: this is used both with fully unprotected and also
incorrectly protected frames and there is a rate limit on the events to
avoid unnecessary flooding netlink events in case something goes wrong.
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Link: https://lore.kernel.org/r/20200401142548.6990-1-jouni@codeaurora.org
[add missing kernel-doc]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

4d797fce

mac80211: fix drv_config_iface_filter() behaviour · 90e8f58d

Johannes Berg authored Apr 17, 2020

There are two bugs with this, first, it shouldn't be called
on an interface that's down, and secondly, it should then be
called when the interface comes up.

Note that the currently only user (iwlwifi) doesn't seem to
care about either of these scenarios.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200417111830.401d82c7a0bf.I5dc7d718816460c2d8d89c7af6c215f9e2b3078f@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

90e8f58d

mac80211: mlme: remove duplicate AID bookkeeping · 1db364c8

Johannes Berg authored Apr 17, 2020

Maintain the connection AID only in sdata->vif.bss_conf.aid, not
also in sdata->u.mgd.aid.

Keep setting that where we set ifmgd->aid before, which has the
side effect of exposing the AID to the driver before the station
entry (AP) is marked associated, in case it needs it then.
Requested-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/20200417123802.085d4a322b0c.I2e7a2ceceea8c6880219f9e9ee4d4ac985fd295a@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

1db364c8

mac80211_hwsim: notify wmediumd of used MAC addresses · 5cc58a9e

Johannes Berg authored Mar 23, 2020

Currently, wmediumd requires each used MAC address to be configured
as a station in the virtual air, but that doesn't make sense as any
station could have multiple MAC addresses, and even have randomized
ones in scanning, etc.

Add some code here to tell wmediumd of used MAC addresses, binding
them to the hardware address. Combined with a wmediumd patch that
makes it track the addresses this allows configuring just the radio
address (42:00:00:00:nn:00 unless the radio was manually created)
in wmediumd as a station, and all addresses that the station uses
are added/removed dynamically.

Tested with random scan, which without this and the corresponding
wmediumd change doesn't get anything through as the sender doesn't
exist as far as wmediumd is concerned (it's random).

Link: https://lore.kernel.org/r/20200323162358.b397b1a1acef.Ice0536e34e5d96c51f97c374ea8af9551347c7e8@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

5cc58a9e

Merge branch 'ovs-meter-tables' · 18021360

David S. Miller authored Apr 23, 2020

Tonghao Zhang says:

====================
openvswitch: expand meter tables and fix bug

The patch set expand or shrink the meter table when necessary.
and other patches fix bug or improve codes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

18021360

net: openvswitch: use u64 for meter bucket · e5735887

Tonghao Zhang authored Apr 24, 2020

When setting the meter rate to 4+Gbps, there is an
overflow, the meters don't work as expected.

Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e5735887

net: openvswitch: make EINVAL return value more obvious · c7735008

Tonghao Zhang authored Apr 24, 2020

Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7735008

net: openvswitch: remove the unnecessary check · a8e38738

Tonghao Zhang authored Apr 24, 2020

Before invoking the ovs_meter_cmd_reply_stats, "meter"
was checked, so don't check it agin in that function.

Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8e38738

net: openvswitch: set max limitation to meters · eb58eebc

Tonghao Zhang authored Apr 24, 2020

Don't allow user to create meter unlimitedly, which may cause
to consume a large amount of kernel memory. The max number
supported is decided by physical memory and 20K meters as default.

Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb58eebc

net: openvswitch: expand the meters supported number · c7c4c44c

Tonghao Zhang authored Apr 24, 2020

In kernel datapath of Open vSwitch, there are only 1024
buckets of meter in one datapath. If installing more than
1024 (e.g. 8192) meters, it may lead to the performance drop.
But in some case, for example, Open vSwitch used as edge
gateway, there should be 20K at least, where meters used for
IP address bandwidth limitation.

[Open vSwitch userspace datapath has this issue too.]

For more scalable meter, this patch use meter array instead of
hash tables, and expand/shrink the array when necessary. So we
can install more meters than before in the datapath.
Introducing the struct *dp_meter_instance, it's easy to
expand meter though changing the *ti point in the struct
*dp_meter_table.

Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Andy Zhou <azhou@ovn.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7c4c44c

23 Apr, 2020 18 commits

net: phy: bcm54140: fix less than zero comparison on an unsigned · efcd549d

Colin Ian King authored Apr 23, 2020

Currently the unsigned variable tmp is being checked for an negative
error return from the call to bcm_phy_read_rdb and this can never
be true since tmp is unsigned.  Fix this by making tmp a plain int.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 4406d36d ("net: phy: bcm54140: add hwmon support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

efcd549d

qed: Make ll2_cbs static · 8ffe2df6

Zou Wei authored Apr 23, 2020

Fix the following sparse warning:

drivers/net/ethernet/qlogic/qed/qed_ll2.c:2334:20: warning: symbol 'll2_cbs'
was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ffe2df6

net: sched : Remove unnecessary cast in kfree · 3c9143d9

Xu Wang authored Apr 23, 2020

Remove unnecassary casts in the argument to kfree.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c9143d9

Merge branch 'net-ethernet-ti-cpts-add-irq-and-HW_TS_PUSH-events' · 92a8da46

David S. Miller authored Apr 23, 2020

Grygorii Strashko says:

====================
net: ethernet: ti: cpts: add irq and HW_TS_PUSH events

This is re-spin of patches to add CPSW IRQ and HW_TS_PUSH events support I've
sent long time ago [1]. In this series, I've tried to restructure and split changes,
and also add few additional optimizations comparing to initial RFC submission [1].

The HW_TS_PUSH events intended to serve for different timesync purposes on of
which is to add PPS generation function, which can be implemented as below:

                     +-----------------+
                     | Control         |
                     | application     |
            +------->+                 +----------+
            |        |                 |          |
            |        |                 |          |
            |        +-----------------+          |
            |                                     |
            |                                     |
            | PTP_EXTTS_REQUEST                   |
            |                                     |
            |                                     |
 +----------------------------------------------------------------+
            |                                     |    Kernel
    +-------+----------+                  +-------v--------+
    |  \dev\ptpX       |                  | /sys/class/pwm/|
    |                  |                  |                |
    +-------^----------+                  +-------+--------+
            |                                     |
            |                                     |
            |                             +-------v-------------------+
    +-------+----------+                  |                           |
    | CPTS driver      |                  |pwm/pwm-omap-dmtimer.c     |
    |                  |                  +---------------------------+
    +-------^----------+                  |clocksource/timer_ti_dm.c  |
            |                             +-------+-------------------+
            |HWx_TS_PUSH evt                      |
 +----------------------------------------------------------------+
            |                                     |         HW
    +-------+----------+                  +-------v--------+
    | CPTS             |                  | DMTimer        |
    |                  |                  |                |
    |      HWx_TS_PUSH X<-----------------+                |
    |                  +                  |                |
    +------------------+                  +-------+--------+
                                                  |
                                                  X timer4

As per my knowledge there is at least one public implemented above PPS generation
schema from Tusori Tibor [2] based on initial HW_TS_PUSH enable submission[1].
And now there is work done by Lokesh Vutla <lokeshvutla@ti.com> published to
enable PWM enable/improve PWM adjustment from user space [3][4][5].

Main changes comparing to initial submission:
- TX timestamp processing deferred to ptp worker only
- both CPTS IRQ and polling events processing supported to make it work for
  Keystone 2 also
- switch to use new .gettimex64() interface
- no DT updates as number of HWx_TS_PUSH inputs is static per HW

Testing on am571x-idk/omap2plus_defconfig/+CONFIG_PREEMPT=y:
1) testing HW_TS_PUSH
 - enable pwm in DT
	pwm16: dmtimer-pwm {
		compatible = "ti,omap-dmtimer-pwm";
		ti,timers = <&timer16>;
		#pwm-cells = <3>;
	};
 - configure and start pwm
  echo 0 > /sys/class/pwm/pwmchip0/export
  echo 1000000000 > /sys/class/pwm/pwmchip0/pwm0/period
  echo 500000000 > /sys/class/pwm/pwmchip0/pwm0/duty_cycle
  echo 1 > /sys/class/pwm/pwmchip0/pwm0/enable
 - test HWx_TS_PUSH using Kernel selftest testptp application
  ./tools/testing/selftests/ptp/testptp -d /dev/ptp0 -e 1000 -i 3

2) testing phc2sys
phc2sys[1616.791]: eth0 rms 408190379792180864 max 1580914543017209856 freq +864 +/- 4635 delay 645 +/- 29
phc2sys[1646.795]: eth0 rms 41 max 108 freq +0 +/- 36 delay 656 +/- 29
phc2sys[1676.800]: eth0 rms 43 max 83 freq +2 +/- 38 delay 650 +/- 0
phc2sys[1706.804]: eth0 rms 39 max 87 freq +4 +/- 34 delay 672 +/- 55
phc2sys[1736.808]: eth0 rms 35 max 66 freq +1 +/- 30 delay 667 +/- 49
phc2sys[1766.813]: eth0 rms 38 max 79 freq +2 +/- 33 delay 656 +/- 29
phc2sys[1796.817]: eth0 rms 45 max 98 freq +1 +/- 39 delay 656 +/- 29
phc2sys[1826.821]: eth0 rms 40 max 87 freq +5 +/- 35 delay 650 +/- 0
phc2sys[1856.826]: eth0 rms 29 max 76 freq -0 +/- 25 delay 656 +/- 29
phc2sys[1886.830]: eth0 rms 40 max 97 freq +4 +/- 35 delay 667 +/- 49
phc2sys[1916.834]: eth0 rms 42 max 94 freq +2 +/- 36 delay 661 +/- 41
phc2sys[1946.839]: eth0 rms 40 max 91 freq +2 +/- 35 delay 661 +/- 41
phc2sys[1976.843]: eth0 rms 46 max 88 freq -0 +/- 40 delay 667 +/- 49
phc2sys[2006.847]: eth0 rms 49 max 97 freq +2 +/- 43 delay 650 +/- 0

3) testing ptp4l
- 1G connection
ptp4l[862.891]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[923.894]: rms 1019697354682 max 5768279314068 freq +26053 +/- 72 delay 488 +/- 1
ptp4l[987.896]: rms 13 max 26 freq +26005 +/- 29 delay 488 +/- 1
ptp4l[1051.899]: rms 14 max 50 freq +25895 +/- 21 delay 488 +/- 1
ptp4l[1115.901]: rms 11 max 27 freq +25878 +/- 17 delay 488 +/- 1
ptp4l[1179.904]: rms 10 max 27 freq +25857 +/- 12 delay 488 +/- 1
ptp4l[1243.906]: rms 14 max 37 freq +25851 +/- 15 delay 488 +/- 1
ptp4l[1307.909]: rms 12 max 33 freq +25835 +/- 15 delay 488 +/- 1
ptp4l[1371.911]: rms 11 max 27 freq +25832 +/- 14 delay 488 +/- 1
ptp4l[1435.914]: rms 11 max 26 freq +25823 +/- 11 delay 488 +/- 1
ptp4l[1499.916]: rms 10 max 29 freq +25829 +/- 11 delay 489 +/- 1
ptp4l[1563.919]: rms 11 max 27 freq +25827 +/- 12 delay 488 +/- 1

- 10M connection
ptp4l[51.955]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[112.957]: rms 279468848453933920 max 1580914542977391360 freq +25390 +/- 3207 delay 8222 +/- 36
ptp4l[176.960]: rms 254 max 522 freq +25809 +/- 219 delay 8271 +/- 30
ptp4l[240.962]: rms 271 max 684 freq +25868 +/- 234 delay 8249 +/- 22
ptp4l[304.965]: rms 263 max 556 freq +25894 +/- 227 delay 8225 +/- 47
ptp4l[368.967]: rms 238 max 648 freq +25908 +/- 204 delay 8234 +/- 40
ptp4l[432.970]: rms 274 max 658 freq +25932 +/- 237 delay 8241 +/- 22
ptp4l[496.972]: rms 247 max 557 freq +25943 +/- 213 delay 8223 +/- 26
ptp4l[560.974]: rms 291 max 756 freq +25968 +/- 251 delay 8244 +/- 41
ptp4l[624.977]: rms 249 max 697 freq +25975 +/- 216 delay 8258 +/- 22

Changes in v5:
 - fixed build issue

Changes in v4:
 - fixed comments from Richard Cochran
 - dropped patch "net: ethernet: ti: cpts: move rx timestamp processing to ptp
   worker only"
 - added "Acked-by" from Richard Cochran <richardcochran@gmail.com>
 - dependencies resolved, patch merged

Changes in v3:
 - fixed rebase mess
 - fixed build issues

Changes in v2 (broken):
 - fixed (formatting) comments from David Miller <davem@davemloft.net>

v4: https://patchwork.ozlabs.org/project/netdev/cover/20200422201254.15232-1-grygorii.strashko@ti.com/
v3: https://patchwork.ozlabs.org/project/netdev/cover/20200320194244.4703-1-grygorii.strashko@ti.com/
v2: https://patchwork.ozlabs.org/cover/1258339/
v1: https://patchwork.ozlabs.org/cover/1254708/

[1] https://lore.kernel.org/patchwork/cover/799251/
[2] https://usermanual.wiki/Document/SetupGuide.632280828.pdf
    https://github.com/t-tibor/msc_thesis
[3] https://patchwork.kernel.org/cover/11421329/
[4] https://patchwork.kernel.org/cover/11433197/
[5] https://sourceforge.net/p/linuxptp/mailman/message/36943248/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

92a8da46

net: ethernet: ti: cpsw: enable cpts irq · 84ea9c0a

Grygorii Strashko authored Apr 23, 2020

The CPSW misc IRQ need be enabled for CPTS event_pend IRQs processing. This
patch adds corresponding support to CPSW driver.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

84ea9c0a

net: ethernet: ti: cpts: add support for HW_TS_PUSH events · b78aba49

Grygorii Strashko authored Apr 23, 2020

Hence CPTS IRQ support is in place the W_TS_PUSH events can be added.
PWM capable DmTimers can be used to generete input signals for CPTS on TI
AM335x/AM437x/DRA7 SoCs to be timestamped:
AM335x/AM437x: timer4 - timer7
DRA7/AM57xx: timer13 - timer16
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b78aba49

net: ethernet: ti: cpts: add irq support · 85624412

Grygorii Strashko authored Apr 23, 2020

Add CPTS IRQ support, but do not enable it. By default, the CPTS driver
will continue working using polling mode which is required for CPTS to
continue working on platforms other than CPSW, like Keystone 2.

The CPTS IRQ support is required to enable support for HW_TS_PUSH events.
The CPSW CPTS IRQ and HW_TS_PUSH events support will be enabled in follow
up patches.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85624412

net: ethernet: ti: cpts: rework locking · ba107428

Grygorii Strashko authored Apr 23, 2020

Now spinlock is used to synchronize everything which is not required. Add
mutex and use to sync access to PTP interface and PTP worker and use
spinlock only to sync FIFO/events processing.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba107428

net: ethernet: ti: cpts: move tx timestamp processing to ptp worker only · c8f8e47e

Grygorii Strashko authored Apr 23, 2020

Now the tx timestamp processing happens from different contexts - softirq
and thread/PTP worker. Enabling IRQ will add one more hard_irq context.
This makes over all defered TX timestamp processing and locking
overcomplicated. Move tx timestamp processing to PTP worker always instead.

napi_rx->cpts_tx_timestamp
 if ptp_packet then
    push to txq
    ptp_schedule_worker()

do_aux_work->cpts_overflow_check
 cpts_process_events()
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8f8e47e

net: ethernet: ti: cpts: optimize packet to event matching · 3bfd41b5

Grygorii Strashko authored Apr 23, 2020

Now the CPTS driver performs packet (skb) parsing every time when it needs
to match packet to CPTS event (including ptp_classify_raw() calls).

This patch optimizes matching process by parsing packet only once upon
arrival and stores PTP specific data in skb->cb using the same fromat as in
CPTS HW event. As result, all future matching reduces to comparing two u32
values.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3bfd41b5

net: ethernet: ti: cpts: switch to use new .gettimex64() interface · 856e59ab

Grygorii Strashko authored Apr 23, 2020

The CPTS HW latches and saves CPTS counter value in CPTS fifo immediately
after writing to CPSW_CPTS_PUSH.TS_PUSH (bit 0), so the total time that the
driver needs to read the CPTS timestamp is the time required CPSW_CPTS_PUSH
write to actually reach HW.

Hence switch CPTS driver to implement new .gettimex64() callback for more
precise measurement of the offset between a PHC and the system clock which
is measured as time between
  write(CPSW_CPTS_PUSH)
  read(CPSW_CPTS_PUSH)
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

856e59ab

net: ethernet: ti: cpts: move tc mult update in cpts_fifo_read() · 0d6df3e6

Grygorii Strashko authored Apr 23, 2020

Now CPTS driver .adjfreq() generates request to read CPTS current time
(CPTS_EV_PUSH) with intention to process all pending event using previous
frequency adjustment values before switching to the new ones. So
CPTS_EV_PUSH works as a marker to switch to the new frequency adjustment
values. Current code assumes that all job is done in .adjfreq(), but after
enabling IRQ this will not be true any more.

Hence save new frequency adjustment values (mult) and perform actual freq
adjustment in cpts_fifo_read() immediately after CPTS_EV_PUSH is received.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0d6df3e6

net: ethernet: ti: cpts: separate hw counter read from timecounter · e66dccce

Grygorii Strashko authored Apr 23, 2020

Now CPTS HW time reading code is implemented in timecounter->cyclecounter
.read() callback and performs following operations:
timecounter_read() ->cc.read() -> cpts_systim_read()
 - request current CPTS HW time CPTS_TS_PUSH.TS_PUSH = 1
 - poll CPTS FIFO for CPTS_EV_PUSH event with current HW timestamp

This approach need to be changed for the future switch to PTP PHC
.gettimex64() callback, which require to separate requesting current CPTS
HW time and processing CPTS FIFO. And for the follow up patch, which
improves .adjfreq() implementation.

This patch moves code accessing CPTS HW out of timecounter code as
following:
- convert HW timestamp of every CPTS event to PTP time (us) and store it as
part struct cpts_event;
- add CPTS context field to store current CPTS HW time (counter) value and
update it on CPTS_EV_PUSH reception;
- move code accessing CPTS HW out of timecounter code and use current CPTS
HW time (counter) from CPTS context instead;
- ensure timecounter->cycle_last is updated on CPTS_EV_PUSH reception.

After this change CPTS timecounter will only perform timekeeper role
without actually accessing CPTS HW.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e66dccce

net: ethernet: ti: cpts: use dev_yy() api for logs · 79d6e755

Grygorii Strashko authored Apr 23, 2020

Use dev_yy() API instead of pr_yy() for log outputs.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

79d6e755

Merge branch 'net-napi-addition-of-napi_defer_hard_irqs' · 4c532b14

David S. Miller authored Apr 23, 2020

Eric Dumazet says:

====================
net: napi: addition of napi_defer_hard_irqs

This patch series augments gro_glush_timeout feature with napi_defer_hard_irqs

As extensively described in first patch changelog, this can suppresss
the chit-chat traffic between NIC and host to signal interrupts and re-arming
them, since this can be an issue on high speed NIC with many queues.

The last patch in this series converts mlx4 TX completion to
napi_complete_done(), to enable this new mechanism.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4c532b14

net/mlx4_en: use napi_complete_done() in TX completion · cf4058db

Eric Dumazet authored Apr 22, 2020

In order to benefit from the new napi_defer_hard_irqs feature,
we need to use napi_complete_done() variant in this driver.

RX path is already using it, this patch implements TX completion side.

mlx4_en_process_tx_cq() now returns the amount of retired packets,
instead of a boolean, so that mlx4_en_poll_tx_cq() can pass
this value to napi_complete_done().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cf4058db

net: napi: use READ_ONCE()/WRITE_ONCE() · 7e417a66

Eric Dumazet authored Apr 22, 2020

gro_flush_timeout and napi_defer_hard_irqs can be read
from napi_complete_done() while other cpus write the value,
whithout explicit synchronization.

Use READ_ONCE()/WRITE_ONCE() to annotate the races.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e417a66

net: napi: add hard irqs deferral feature · 6f8b12d6

Eric Dumazet authored Apr 22, 2020

Back in commit 3b47d303 ("net: gro: add a per device gro flush timer")
we added the ability to arm one high resolution timer, that we used
to keep not-complete packets in GRO engine a bit longer, hoping that further
frames might be added to them.

Since then, we added the napi_complete_done() interface, and commit
364b6055 ("net: busy-poll: return busypolling status to drivers")
allowed drivers to avoid re-arming NIC interrupts if we made a promise
that their NAPI poll() handler would be called in the near future.

This infrastructure can be leveraged, thanks to a new device parameter,
which allows to arm the napi hrtimer, instead of re-arming the device
hard IRQ.

We have noticed that on some servers with 32 RX queues or more, the chit-chat
between the NIC and the host caused by IRQ delivery and re-arming could hurt
throughput by ~20% on 100Gbit NIC.

In contrast, hrtimers are using local (percpu) resources and might have lower
cost.

The new tunable, named napi_defer_hard_irqs, is placed in the same hierarchy
than gro_flush_timeout (/sys/class/net/ethX/)

By default, both gro_flush_timeout and napi_defer_hard_irqs are zero.

This patch does not change the prior behavior of gro_flush_timeout
if used alone : NIC hard irqs should be rearmed as before.

One concrete usage can be :

echo 20000 >/sys/class/net/eth1/gro_flush_timeout
echo 10 >/sys/class/net/eth1/napi_defer_hard_irqs

If at least one packet is retired, then we will reset napi counter
to 10 (napi_defer_hard_irqs), ensuring at least 10 periodic scans
of the queue.

On busy queues, this should avoid NIC hard IRQ, while before this patch IRQ
avoidance was only possible if napi->poll() was exhausting its budget
and not call napi_complete_done().

This feature also can be used to work around some non-optimal NIC irq
coalescing strategies.

Having the ability to insert XX usec delays between each napi->poll()
can increase cache efficiency, since we increase batch sizes.

It also keeps serving cpus not idle too long, reducing tail latencies.
Co-developed-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f8b12d6