- 26 Mar, 2021 40 commits
-
-
David S. Miller authored
Taehee Yoo says:

====================
mld: change context from atomic to sleepable

This patchset changes the context of the MLD module. Before this patchset, MLD functions ran in atomic context, so they could not use sleepable functions and flags.

There are several reasons why MLD functions ran in atomic context:
1. They use the timer API. Timer expiration functions are executed in atomic context.
2. They use atomic locks. MLD functions use rwlock and spinlock to protect their own resources.

So, in order to switch context, this patchset converts resources to use RCU and removes the atomic locks and the timer API.

1. The first patch converts the timer API to delayed work. The timer API is used for delaying some work. The MLD protocol has a delay mechanism, which is used for replying to a query. If a listener receives a query from a router, it should send a response after some delay. But because the timer expiry function is executed in atomic context, this patch converts the timer API to delayed work.
2. The fourth patch deletes inet6_dev->mc_lock. The mc_lock has protected the inet6_dev->mc_tomb pointer. But this pointer is already protected by RTNL and it isn't used by the datapath. So the lock isn't needed, and removing it deletes many atomic context critical sections.
3. The fifth patch converts ip6_sf_socklist to RCU. ip6_sf_socklist has been protected by ipv6_mc_socklist->sflock (rwlock). But it is already protected by RTNL, so if it is converted to use RCU in order to be used in the datapath, the sflock is no longer needed. So its control path context can be switched to sleepable.
4. The sixth patch converts ip6_sf_list to RCU. The reason for this patch is the same as for the previous patch.
5. The seventh patch converts ifmcaddr6 to RCU. The reason for this patch is the same as for the previous patch.
6. Add new workqueues for processing query/report events. With this patch, query and report events are processed by a workqueue, so the context is sleepable, not atomic. This logic acquires RTNL while it runs.
7. Add new mc_lock. The purpose of this lock is to protect per-interface MLD data. Per-interface MLD data is usually used by the query/report event handlers. So the query/report event workers need only this lock instead of RTNL. Therefore, it could reduce the bottleneck.

Changelog:
v2 -> v3:
1. Do not use msecs_to_jiffies(). (by Cong Wang)
2. Do not add unnecessary rtnl_lock() and rtnl_unlock(). (by Cong Wang)
3. Fix sparse warnings caused by rcu annotation. (by kernel test robot)
 - Remove some rcu_assign_pointer() calls, which were used for non-rcu pointers.
 - Add union for rcu pointer.
 - Use rcu API in mld_clear_zeros().
 - Remove leftover rcu_read_unlock().
 - Use rcu API for tomb resources.
4. Withdraw the previous 2nd and 3rd patches.
 - "separate two flags from ifmcaddr6->mca_flags"
 - "add a new delayed_work, mc_delrec_work"
5. Add 6th and 7th patch.

v1 -> v2:
1. Withdraw unnecessary refactoring patches. (by Cong Wang, Eric Dumazet, David Ahern)
 a) convert from array to list.
 b) function rename.
2. Separate one big patch into several small patches.
3. Do not rename 'ifmcaddr6->mca_lock'. In the v1 patch, this variable was changed to 'ifmcaddr6->mca_work_lock'. But this is actually not needed.
4. Do not use atomic_t for 'ifmcaddr6->mca_sfcount' and 'ipv6_mc_socklist->sf_count'.
5. Do not add mld_check_leave_group() function.
6. Do not add ip6_mc_del_src_bulk() function.
7. Do not add ip6_mc_add_src_bulk() function.
8. Do not use rcu_read_lock() in the qeth_l3_add_mcast_rtnl(). (by Julian Wiedmann)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
The purpose of this lock is to avoid a bottleneck in the query/report event handler logic. After the previous patches, almost all MLD data is protected by RTNL, so the query and report event handlers, which are data path logic, acquire RTNL too. If many query and report events are received, they hold RTNL for a long time, which turns RTNL into a control-plane bottleneck. To avoid this bottleneck, mc_lock is added. mc_lock protects only per-interface MLD data, and per-interface MLD data is what the query/report event handler logic uses. So rtnl_lock is no longer needed in the query/report event handler logic, and mc_lock removes the bottleneck. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
When query/report packets are received, the MLD module processes them. But they are processed in BH context, so the handlers cannot use sleepable functions. To switch context, two workqueues are added to process query and report events. In struct inet6_dev, mc_{query | report}_queue are added, so the queues are per-interface, and mc_{query | report}_work are the work structures. When a query or report event is received, the skb is queued to the proper queue and the worker function is scheduled immediately. The workqueues and queues are protected by a spinlock, mc_{query | report}_lock, and the worker functions are protected by RTNL. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
The ifmcaddr6 has been protected by inet6_dev->lock (rwlock), so its critical section is atomic context. To switch this context, the locking must change. The ifmcaddr6 is actually already protected by RTNL, so if it is converted to use RCU, its control path context can be switched to sleepable. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
The ip6_sf_list has been protected by mca_lock (spin_lock), so its critical section is atomic context. To switch this context, the locking must change. The ip6_sf_list is actually already protected by RTNL, so if it is converted to use RCU, its control path context can be switched to sleepable. But this patch doesn't remove mca_lock yet, because ifmcaddr6 isn't converted to RCU yet. So the conversion to sleepable context is not complete. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
The sflist has been protected by rwlock, so its critical section is atomic context. To switch this context, the locking must change. The sflist is actually already protected by RTNL, so if it is converted to use RCU, its control path context can be switched to sleepable. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
The purpose of mc_lock is to protect inet6_dev->mc_tomb. But mc_tomb is already protected by RTNL, and all functions that manipulate mc_tomb are called under RTNL. So mc_lock is not needed. Furthermore, it is a spinlock, so its critical section is atomic. To reduce the amount of atomic context, it should be removed. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
mcast.c has several timers for delaying work. A timer's expire handler runs in atomic context, so it can't use sleepable things such as GFP_KERNEL, mutex, etc. In order to use sleepable APIs, this patch converts the timers to delayed work. But some critical sections are used by both process and BH context, so they still use spin_lock_bh() and rwlock. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jakub Kicinski says: ==================== ethtool: fec: ioctl kdoc touch ups A few touch ups from v1 review. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
kdoc does not have good support for documenting defines, and we can't abuse the enum documentation because it generates warnings. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Dan points out we need to use the mask not the bit (which is 0). Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 42ce127d ("ethtool: fec: sanitize ethtool_fecparam->fec") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
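
The difference between a bit index and its mask can be sketched in plain C. This is a minimal user-space illustration, not the kernel code: the `demo_` and `DEMO_` names are hypothetical and only mirror the pattern in which a `_BIT` enumerator holds the bit position and the mask is derived by shifting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the kernel's definitions. */
enum demo_fec_bits {
	DEMO_FEC_NONE_BIT,	/* bit index 0 */
	DEMO_FEC_AUTO_BIT,	/* bit index 1 */
	DEMO_FEC_OFF_BIT,	/* bit index 2 */
};

#define DEMO_FEC_NONE	(1u << DEMO_FEC_NONE_BIT)	/* mask 0x1 */
#define DEMO_FEC_AUTO	(1u << DEMO_FEC_AUTO_BIT)	/* mask 0x2 */
#define DEMO_FEC_OFF	(1u << DEMO_FEC_OFF_BIT)	/* mask 0x4 */

/* The bug class depends on the first bit index being zero. */
static_assert(DEMO_FEC_NONE_BIT == 0, "NONE must be bit index 0");

/* Buggy check: ANDing with the bit index (0) can never be true. */
static bool demo_has_none_buggy(uint32_t fec)
{
	return fec & DEMO_FEC_NONE_BIT;
}

/* Fixed check: AND with the mask derived from the bit index. */
static bool demo_has_none_fixed(uint32_t fec)
{
	return fec & DEMO_FEC_NONE;
}
```

With these definitions, `demo_has_none_buggy(DEMO_FEC_NONE)` is false while `demo_has_none_fixed(DEMO_FEC_NONE)` is true, which is the kind of silent miss the fix addresses.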
-
Jakub Kicinski authored
struct ethtool_fecparam::reserved can't be used in SET, because ethtool user space doesn't zero-initialize the structure. Make this clear. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Mat Martineau says: ==================== MPTCP: Cleanup and address advertisement fixes This patch series contains cleanup and fixes we have been testing in the MPTCP tree. MPTCP uses TCP option headers to advertise additional address information after an initial connection is established. The main fixes here deal with making those advertisements more reliable and improving the way subflows are created after an advertisement is received. Patches 1, 2, 4, 10, and 12 are for various cleanup or refactoring. Patch 3 skips an extra connection attempt if there's already a subflow connection for the newly received advertisement. Patches 5, 6, and 7 make sure that the next address is advertised when there are multiple addresses to share, the advertisement has been retried, and the peer has not echoed the advertisement. Self tests are updated. Patches 8 and 9 fix a problem similar to 5/6/7, but covers a case where the failure was due to a subflow connection not completing. Patches 11 and 13 send a bare ack to revoke an advertisement rather than waiting for other activity to trigger a packet send. This mirrors the way acks are sent for new advertisements. Self test is included. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
This patch adds testcases for signalling multiple valid and invalid addresses for both signal_address_tests and remove_tests. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
Since mptcp_pm_nl_add_addr_send_ack is now used for both ADD_ADDR and RM_ADDR cases, rename it to mptcp_pm_nl_addr_send_ack. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
This patch changes the conditions for sending an ACK: in addition to ADD_ADDR, send an ACK packet for RM_ADDR too. In mptcp_pm_remove_addr, invoke mptcp_pm_nl_add_addr_send_ack to send the ACK packet. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
msk->pm.addr_signal is cleared in mptcp_pm_add_addr_signal, no need to clear it in mptcp_pm_nl_add_addr_send_ack again. Drop it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
When an invalid address was announced, the subflow couldn't be created for this address. Therefore mptcp_pm_nl_subflow_established couldn't be invoked. Then the next addresses in the local address list didn't have a chance to be announced. This patch invokes the new function mptcp_pm_add_addr_echoed when the address is echoed. In it, use mptcp_lookup_anno_list_by_saddr to check whether this address is in the anno_list. If it is, PM schedules the status MPTCP_PM_SUBFLOW_ESTABLISHED to invoke mptcp_pm_create_subflow_or_signal_addr to deal with the next address in the local address list. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
This patch exported the static function lookup_anno_list_by_saddr, and renamed it to mptcp_lookup_anno_list_by_saddr. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
This patch added the timeout testcases for multiple addresses, valid and invalid. These testcases need to transmit 8 ADD_ADDRs, so add a new speed level 'least' that passes 10 to mptcp_connect to slow down the transmitting process. The original speed level 'slow' still uses 50. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
In some testcases, we need to slow down the transmitting process. This patch added a new argument named cfg_do_w for cfg_remove to allow the caller to pass an argument to cfg_remove. In do_rnd_write, use this cfg_do_w to control the transmitting speed. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
This patch called mptcp_pm_subflow_established to move to the next address when an ADD_ADDR has been retransmitted the maximum number of times. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
This patch drops the unused parameter subflow in mptcp_pm_subflow_established(). Fixes: 926bdeab ("mptcp: Implement path manager interface commands") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
This patch added a new helper named lookup_subflow_by_daddr to find whether the destination address is in the msk's conn_list. In mptcp_pm_nl_add_addr_received, use lookup_subflow_by_daddr to check whether the announced address is already connected. If it is, skip connecting this address and send out the echo. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
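
The lookup-then-skip flow described here can be sketched as a small user-space model. Everything below is hypothetical: the `demo_` types stand in for the msk's conn_list and address structures, and the sketch assumes that only the IP address is compared, which may differ from what the real helper does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the address and subflow list entries. */
struct demo_addr {
	uint32_t ip;
	uint16_t port;
};

struct demo_subflow {
	struct demo_addr daddr;		/* destination of this subflow */
	struct demo_subflow *next;	/* next entry in the connection list */
};

enum demo_action {
	DEMO_ECHO_ONLY,		/* already connected: just echo */
	DEMO_CONNECT_AND_ECHO,	/* new address: connect, then echo */
};

/* Return true if any subflow in the list already targets daddr.
 * Assumption: matching on the IP address alone is sufficient. */
static bool demo_lookup_subflow_by_daddr(const struct demo_subflow *list,
					 const struct demo_addr *daddr)
{
	assert(daddr != NULL);
	for (; list; list = list->next)
		if (list->daddr.ip == daddr->ip)
			return true;
	return false;
}

/* Decide what to do when an address advertisement is received. */
static enum demo_action demo_add_addr_received(const struct demo_subflow *list,
					       const struct demo_addr *addr)
{
	if (demo_lookup_subflow_by_daddr(list, addr))
		return DEMO_ECHO_ONLY;
	return DEMO_CONNECT_AND_ECHO;
}
```

The point of the design is that the echo is sent in both branches, so the peer always learns its advertisement arrived; only the extra connection attempt is skipped.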
-
Geliang Tang authored
Drop the redundant argument 'port' from mptcp_pm_announce_addr, use the port field of another argument 'addr' instead. Fixes: 0f5c9e3f ("mptcp: add port parameter for mptcp_pm_announce_addr") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
After the previous patch we can easily avoid invoking the workqueue to perform the retransmission, if the msk socket lock is held at rtx timer expiration. This also simplifies the relevant code. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Alex Elder says: ==================== net: ipa: rework resource programming This series reworks the way IPA resources are defined and programmed. It is a little long--and I apologize for that--but I think the patches are best taken together as a single unit. The IPA hardware operates with a set of distinct "resources." Each hardware instance has a fixed number of each resource type available. Available resources are divided into smaller pools, with each pool shared by endpoints in a "resource group." Each endpoint is thus assigned to a resource group that determines which pools supply resources the IPA hardware uses to handle the endpoint's processing. The exact set of resources used can differ for each version of IPA. Except for IPA v3.0 and v3.1, there are 5 source and 2 destination resource types, but there's no reason to assume this won't change. The number of resource groups used *does* typically change based on the hardware version. For example, some versions target reduced functionality and support fewer resource groups. With that as background... The net result of this series is to improve the flexibility with which IPA resources and resource groups are defined, permitting each version of IPA to define its own set of resources and groups. Along the way it isolates the resource-related code, and fixes a few bugs related to resource handling. The first patch moves resource-related code to a new C file (and header). It generates a checkpatch warning about updating MAINTAINERS, which can be ignored. The second patch fixes a bug, but the bug does not affect SDM845 or SC7180. The third patch defines an enumerated type whose members provide symbolic names for resource groups. The fourth defines some resource limits for SDM845 that were not previously being programmed. That platform "works" without this, but to be correct, these limits should really be programmed. 
The fifth patch uses a single enumerated type to define both source and destination resource type IDs, and the sixth uses those IDs to index the resource limit arrays. The seventh moves the definition of that enumerated type into the platform data files, allowing each platform to define its own set of resource types. The eighth and ninth are fairly trivial changes. One replaces two "max" symbols having the same value with a single symbol. And the other replaces two distinct but otherwise identical structure types with a single common one. The 10th is a small preparatory patch for the 11th, passing a different argument to a function that programs resource values. The 11th allows the actual number of source and destination resource groups for a platform to be specified in its configuration data. That way the number is based on the actual number of groups defined. This removes the need for a sort of clunky pair of functions that defined that information previously. Finally, the last patch just increases the number of resource groups that can be defined to 8. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
IPA versions 3.0 and 3.1 support up to 8 resource groups. There is some interest in supporting these older versions of the hardware, so update the resource configuration code to program resource limits for these groups if specified. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
The arrays of source and destination resource limits defined in configuration data are of a fixed size--which is the maximum number of resource groups supported for any platform. Most platforms will use fewer than that many groups. Add new members to the ipa_rsrc_group_id enumerated type whose values give the number of source and destination resource groups defined for the platform. (This type is defined for each platform in its data file.) Add a new field to the resource configuration data that indicates how many of the source and destination resource groups are actually used for the platform, and initialize it with the count value. This allows us to determine the number of groups defined for the platform without exposing the ipa_rsrc_group_id enumerated type. As a result, we no longer need ipa_resource_group_src_count() and ipa_resource_group_dst_count(), because each platform now defines its supported number of resource groups. So get rid of those two functions. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
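
The "count member" idea can be sketched as follows. This is an illustrative user-space model under stated assumptions: all `demo_` and `DEMO_` names, the group names, and the field names are hypothetical, and the real platform data files may be organized differently.

```c
#include <assert.h>

#define DEMO_GROUP_MAX	8	/* fixed array size shared by all platforms */

/* Hypothetical per-platform group IDs. Source and destination numbering
 * both start at 0, and each list ends with a member that equals the
 * number of groups defined before it. */
enum demo_rsrc_group_id {
	DEMO_GROUP_SRC_A = 0,
	DEMO_GROUP_SRC_B,
	DEMO_GROUP_SRC_COUNT,	/* == 2: number of source groups */

	DEMO_GROUP_DST_A = 0,
	DEMO_GROUP_DST_B,
	DEMO_GROUP_DST_C,
	DEMO_GROUP_DST_COUNT,	/* == 3: number of destination groups */
};

static_assert(DEMO_GROUP_SRC_COUNT <= DEMO_GROUP_MAX, "too many source groups");
static_assert(DEMO_GROUP_DST_COUNT <= DEMO_GROUP_MAX, "too many destination groups");

/* Configuration data carries plain counts, so generic code never needs
 * to see the platform-specific enumerated type. */
struct demo_resource_data {
	unsigned int src_group_count;
	unsigned int dst_group_count;
};

static const struct demo_resource_data demo_platform_data = {
	.src_group_count = DEMO_GROUP_SRC_COUNT,
	.dst_group_count = DEMO_GROUP_DST_COUNT,
};
```

Adding or removing a group in the enumerated type changes the count automatically, so the data and the count cannot drift apart.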
-
Alex Elder authored
Pass the resource data pointer to ipa_resource_config_src() and ipa_resource_config_dst() to be used for configuring resource limits. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
The ipa_resource_src and ipa_resource_dst structures are identical in form, so just replace them with a single structure. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Replace IPA_RESOURCE_GROUP_SRC_MAX and IPA_RESOURCE_GROUP_DST_MAX with a single symbol, IPA_RESOURCE_GROUP_MAX. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Most platforms have the same set of source and destination resource types. But some older platforms have some additional ones, and it's possible different resources will be used in the future. Move the definition of the ipa_resource_type enumerated type so it is defined for each platform in its configuration data file. This permits each to have a distinct set of resources. Shorten the data files slightly, by putting the min and max limit values on the same line. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Remove the type field from the ipa_resource_src and ipa_resource_dst structures, and instead use that value as the index into the arrays of source and destination resources. Change ipa_resource_config_src() and ipa_resource_config_dst() so the resource type is passed in as an argument. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
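
Indexing by resource type instead of storing a type field can be sketched with designated initializers. The names and limit values below are made up for illustration; only the indexing pattern is the point.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_GROUP_MAX	4

/* Hypothetical source resource types; the value doubles as array index. */
enum demo_resource_type {
	DEMO_SRC_PKT_CONTEXTS = 0,
	DEMO_SRC_DESCRIPTOR_LISTS,
	DEMO_SRC_TYPE_COUNT,
};

struct demo_limits {
	unsigned int min;
	unsigned int max;
};

/* No "type" member: the position in the array identifies the type. */
struct demo_resource {
	struct demo_limits limits[DEMO_GROUP_MAX];
};

static const struct demo_resource demo_src[DEMO_SRC_TYPE_COUNT] = {
	[DEMO_SRC_PKT_CONTEXTS] = {
		.limits = { { .min = 1, .max = 255 }, { .min = 1, .max = 255 } },
	},
	[DEMO_SRC_DESCRIPTOR_LISTS] = {
		.limits = { { .min = 10, .max = 10 }, { .min = 14, .max = 14 } },
	},
};

/* The resource type is passed in as an argument and used as the index. */
static const struct demo_limits *
demo_src_limits(enum demo_resource_type type, unsigned int group)
{
	assert(type < DEMO_SRC_TYPE_COUNT && group < DEMO_GROUP_MAX);
	return &demo_src[type].limits[group];
}
```

Groups that are not listed in an initializer are zero-filled, so an unspecified group reads back as limits of 0.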
-
Alex Elder authored
Combine the ipa_resource_type_src and ipa_resource_type_dst enumerated types into a single enumerated type, ipa_resource_type. Assign value 0 to the first element for the source and destination types, so their numeric values are preserved. Add some additional commentary where these are defined, stating explicitly that code assumes the first source and first destination member must have numeric value 0. Fix the kerneldoc comments for the ipa_gsi_endpoint_data structure. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Currently, the SDM845 configuration data defines resource limits for the first two resource groups (for both source and destination resource types). The hardware supports additional resource groups, and we should program the resource limits for those groups as well. Even the "unused" destination resource group (number 2) should have non-zero limits programmed in some cases, to ensure correct operation. Add these missing resource group limit definitions to the SDM845 configuration data. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Define a new ipa_resource_group_id enumerated type, whose members have numeric values that match the resource group number used when programming the hardware. Each platform supports a different number of source and destination resource groups, so define the type separately for each platform in its configuration data file. Use these new symbolic values when specifying the resource group an endpoint is associated with. And use them to index the limits arrays for source and destination resources, making it clearer how these values are used. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
If the number of resource groups supported by the hardware is less than a certain number, we return early in ipa_resource_config_src() and ipa_resource_config_dst() (to avoid programming resource limits for non-existent groups). Unfortunately, these checks are off by one. Fix this problem in the four places it occurs. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
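
One way an off-by-one of this kind can arise is sketched below. The model assumes, hypothetically, that each limit register covers a pair of consecutive groups (0-1, 2-3, 4-5); the function names and the exact thresholds are illustrative and are not taken from the driver.

```c
#include <assert.h>

/* Buggy model: returns how many pair registers get programmed.
 * The early-return thresholds are one too small. */
static unsigned int demo_regs_programmed_buggy(unsigned int group_count)
{
	unsigned int regs = 0;

	assert(group_count >= 1);
	regs++;			/* register for groups 0 and 1 */
	if (group_count < 2)	/* wrong: group 2 exists only if count >= 3 */
		return regs;
	regs++;			/* register for groups 2 and 3 */
	if (group_count < 4)	/* wrong: group 4 exists only if count >= 5 */
		return regs;
	regs++;			/* register for groups 4 and 5 */
	return regs;
}

/* Fixed model: stop unless the first group of the next pair exists. */
static unsigned int demo_regs_programmed_fixed(unsigned int group_count)
{
	unsigned int regs = 0;

	assert(group_count >= 1);
	regs++;			/* register for groups 0 and 1 */
	if (group_count < 3)
		return regs;
	regs++;			/* register for groups 2 and 3 */
	if (group_count < 5)
		return regs;
	regs++;			/* register for groups 4 and 5 */
	return regs;
}
```

In this model a platform with exactly two groups programs two registers in the buggy version but only one in the fixed version, which is the "non-existent groups" case the commit message describes.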
-
Alex Elder authored
Separate the IPA resource-related code into a new source file, "ipa_resource.c", and matching header file "ipa_resource.h". Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Huazhong Tan says: ==================== net: hns3: add some cleanups This series includes some cleanups for the HNS3 ethernet driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-