- 23 Nov, 2019 5 commits
-
-
Jakub Kicinski authored
Rahul Lakkireddy says: ==================== This series of patches add UDP Segmentation Offload (USO) supported by Chelsio T5/T6 NICs. Patch 1 updates the current Scatter Gather List (SGL) DMA unmap logic for USO requests. Patch 2 adds USO support for NIC and MQPRIO QoS offload Tx path. Patch 3 adds missing stats for MQPRIO QoS offload Tx path. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Rahul Lakkireddy authored
Export necessary stats for traffic flowing through MQPRIO QoS offload Tx path. v2: - No change. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Rahul Lakkireddy authored
Implement and export UDP segmentation offload (USO) support for both NIC and MQPRIO QoS offload Tx path. Update appropriate logic in Tx to parse GSO info in skb and configure FW_ETH_TX_EO_WR request needed to perform USO. v2: - Remove inline keyword from write_eo_udp_wr() in sge.c. Let the compiler decide. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Rahul Lakkireddy authored
The FW_ETH_TX_EO_WR used for sending UDP Segmentation Offload (USO) requests expects the headers to be part of the descriptor and the payload to be part of the SGL containing the DMA mapped addresses. Hence, the DMA address in the first entry of the SGL can start after the packet headers. Currently, unmap_sgl() tries to unmap from this wrong offset, instead of the originally mapped DMA address. So, use existing unmap_skb() instead, which takes originally saved DMA addresses as input. Update all necessary Tx paths to save the original DMA addresses, so that unmap_skb() can unmap them properly. v2: - No change. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski authored
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock from commit c8183f54 ("s390/qeth: fix potential deadlock on workqueue flush"), removed the code which was removed by commit 9897d583 ("s390/qeth: consolidate some duplicated HW cmd code"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
- 22 Nov, 2019 35 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Validate tunnel options length in act_tunnel_key, from Xin Long. 2) Fix DMA sync bug in gve driver, from Adi Suresh. 3) TSO kills performance on some r8169 chips due to HW issues, disable by default in that case, from Corinna Vinschen. 4) Fix clock disable mismatch in fec driver, from Chubong Yuan. 5) Fix interrupt status bits define in hns3 driver, from Huazhong Tan. 6) Fix workqueue deadlocks in qeth driver, from Julian Wiedmann. 7) Don't napi_disable() twice in r8152 driver, from Hayes Wang. 8) Fix SKB extension memory leak, from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) r8152: avoid to call napi_disable twice MAINTAINERS: Add myself as maintainer of virtio-vsock udp: drop skb extensions before marking skb stateless net: rtnetlink: prevent underflows in do_setvfinfo() can: m_can_platform: remove unnecessary m_can_class_resume() call can: m_can_platform: set net_device structure as driver data hv_netvsc: Fix send_table offset in case of a host bug hv_netvsc: Fix offset usage in netvsc_send_table() net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN sfc: Only cancel the PPS workqueue if it exists nfc: port100: handle command failure cleanly net-sysfs: fix netdev_queue_add_kobject() breakage r8152: Re-order napi_disable in rtl8152_close net: qca_spi: Move reset_count to struct qcaspi net: qca_spi: fix receive buffer size check net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode Revert "net/ibmvnic: Fix EOI when running in XIVE mode" net/mlxfw: Verify FSM error code translation doesn't exceed array size net/mlx5: Update the list of the PCI supported devices net/mlx5: Fix auto group size calculation ...
-
Marc Dionne authored
By default s_maxbytes is set to MAX_NON_LFS, which limits the usable file size to 2GB, enforced by the vfs. Commit b9b1f8d5 ("AFS: write support fixes") added support for the 64-bit fetch and store server operations, but did not change this value. As a result, attempts to write past the 2G mark result in EFBIG errors: $ dd if=/dev/zero of=foo bs=1M count=1 seek=2048 dd: error writing 'foo': File too large Set s_maxbytes to MAX_LFS_FILESIZE. Fixes: b9b1f8d5 ("AFS: write support fixes") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Marc Dionne authored
Servers sending callback breaks to the YFS_CM_SERVICE service may send up to YFSCBMAX (1024) fids in a single RPC. Anything over AFSCBMAX (50) will cause the assert in afs_break_callbacks to trigger. Remove the assert, as the count has already been checked against the appropriate max values in afs_deliver_cb_callback and afs_deliver_yfs_cb_callback. Fixes: 35dbfba3 ("afs: Implement the YFS cache manager service") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Chen Wandun authored
Fix following sparse warnings: drivers/net/dsa/ocelot/felix.c:351:6: warning: symbol 'felix_txtstamp' was not declared. Should it be static? Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hayes Wang authored
Call napi_disable() twice would cause dead lock. There are three situations may result in the issue. 1. rtl8152_pre_reset() and set_carrier() are run at the same time. 2. Call rtl8152_set_tunable() after rtl8152_close(). 3. Call rtl8152_set_ringparam() after rtl8152_close(). For #1, use the same solution as commit 84811412 ("r8152: Re-order napi_disable in rtl8152_close"). For #2 and #3, add checking the flag of IFF_UP and using napi_disable/napi_enable during mutex. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton: "Three fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/ksm.c: don't WARN if page is still mapped in remove_stable_node() mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span() Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
-
Andrea Mayer authored
End.DT6 behavior makes use of seg6_lookup_nexthop() function which drops all packets that are destined to be locally processed. However, DT* should be able to deliver decapsulated packets that are destined to local addresses. Function seg6_lookup_nexthop() is also used by DX6, so in order to maintain compatibility I created another routing helper function which is called seg6_lookup_any_nexthop(). This function is able to take into account both packets that have to be processed locally and the ones that are destined to be forwarded directly to another machine. Hence, seg6_lookup_any_nexthop() is used in DT6 rather than seg6_lookup_nexthop() to allow local delivery. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
In commit a82055af ("netfilter: nft_payload: add VLAN offload support"), VLAN fields in struct flow_dissector_key_vlan were unionized with the intention of introducing another field that covered the whole TCI header. However without a wrapping struct the subfields end up sharing the same bits. As a result, "tc filter add ... flower vlan_id 14" specifies not only vlan_id, but also vlan_priority. Fix by wrapping the individual VLAN fields in a struct. Fixes: a82055af ("netfilter: nft_payload: add VLAN offload support") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'linux-can-fixes-for-5.4-20191122' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2019-11-22 this is a pull request of 2 patches for net/master, if possible for the current release cycle. Otherwise these patches should hit v5.4 via the stable tree. Both patches of this pull request target the m_can driver. Pankaj Sharma fixes the fallout in the m_can_platform part, which appeared with the introduction of the m_can platform framework. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'mac80211-next-for-net-next-2019-11-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== The interesting new thing here is AQL, the Airtime Queue Limit patchset from Kan Yan (Google) and Toke Høiland-Jørgensen (Redhat). The effect is intended to eventually be similar to BQL, but byte queue limits are not useful in wifi where the actual throughput can vary by around 4 orders of magnitude. There are more details in the patches themselves. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefano Garzarella authored
Since I'm actively working on vsock and virtio/vhost transports, Stefan suggested to help him to maintain it. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tuong Lien authored
It is observed that TIPC service binding order will not be kept in the publication event report to user if the service is subscribed after the bindings. For example, services are bound by application in the following order: Server: bound port A to {18888,66,66} scope 2 Server: bound port A to {18888,33,33} scope 2 Now, if a client subscribes to the service range (e.g. {18888, 0-100}), it will get the 'TIPC_PUBLISHED' events in that binding order only when the subscription is started before the bindings. Otherwise, if started after the bindings, the events will arrive in the opposite order: Client: received event for published {18888,33,33} Client: received event for published {18888,66,66} For the latter case, it is clear that the bindings have existed in the name table already, so when reported, the events' order will follow the order of the rbtree binding nodes (- a node with lesser 'lower'/'upper' range value will be first). This is correct as we provide the tracking on a specific service status (available or not), not the relationship between multiple services. However, some users expect to see the same order of arriving events irrespective of when the subscription is issued. This turns out to be easy to fix. We now add functionality to ensure that publication events always are issued in the same temporal order as the corresponding bindings were performed. v2: replace the unnecessary macro - 'publication_after()' with inline function. v3: reuse 'time_after32()' instead of reinventing the same exact code. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hoang Le authored
When setting up a cluster with non-replicast/replicast capability supported. This capability will be disabled for broadcast send link in order to be backwards compatible. However, when these non-support nodes left and be removed out the cluster. We don't update this capability on broadcast send link. Then, some of features that based on this capability will also disabling as unexpected. In this commit, we make sure the broadcast send link capabilities will be re-calculated as soon as a node removed/rejoined a cluster. Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free assumes all skb head state has been dropped already. This will leak the extension memory in case the skb has extensions other than the ipsec secpath, e.g. bridge nf data. To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have extensions or if the extension space can be free'd. Fixes: 895b5c9f ("netfilter: drop bridge nf reset from nf_reset") Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: Byron Stanoszek <gandalf@winds.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
The "ivm->vf" variable is a u32, but the problem is that a number of drivers cast it to an int and then forget to check for negatives. An example of this is in the cxgb4 driver. drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 2890 static int cxgb4_mgmt_get_vf_config(struct net_device *dev, 2891 int vf, struct ifla_vf_info *ivi) ^^^^^^ 2892 { 2893 struct port_info *pi = netdev_priv(dev); 2894 struct adapter *adap = pi->adapter; 2895 struct vf_info *vfinfo; 2896 2897 if (vf >= adap->num_vfs) ^^^^^^^^^^^^^^^^^^^ 2898 return -EINVAL; 2899 vfinfo = &adap->vfinfo[vf]; ^^^^^^^^^^^^^^^^^^^^^^^^^^ There are 48 functions affected. drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8435 hclge_set_vf_vlan_filter() warn: can 'vfid' underflow 's32min-2147483646' drivers/net/ethernet/freescale/enetc/enetc_pf.c:377 enetc_pf_set_vf_mac() warn: can 'vf' underflow 's32min-2147483646' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2899 cxgb4_mgmt_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2960 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3019 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3038 cxgb4_mgmt_set_vf_vlan() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3086 cxgb4_mgmt_set_vf_link_state() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb/cxgb2.c:791 get_eeprom() warn: can 'i' underflow 's32min-(-4),0,4-s32max' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:82 bnxt_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:164 bnxt_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:186 bnxt_get_vf_config() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:228 bnxt_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:264 bnxt_set_vf_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:293 bnxt_set_vf_bw() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:333 bnxt_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2281 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2285 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2286 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2292 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2297 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1832 qlcnic_sriov_set_vf_mac() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1864 qlcnic_sriov_set_vf_tx_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1937 qlcnic_sriov_set_vf_vlan() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2005 qlcnic_sriov_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2036 qlcnic_sriov_set_vf_spoofchk() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/emulex/benet/be_main.c:1914 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1915 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1922 be_set_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1951 be_clear_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2063 be_set_vf_tx_rate() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2091 be_set_vf_link_state() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:2609 ice_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3050 ice_get_vf_cfg() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3103 ice_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3181 ice_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3237 ice_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3286 ice_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3919 i40e_validate_vf() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3957 i40e_ndo_set_vf_mac() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4104 i40e_ndo_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4263 i40e_ndo_set_vf_bw() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4309 i40e_ndo_get_vf_config() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4371 i40e_ndo_set_vf_link_state() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4504 i40e_ndo_set_vf_trust() warn: can 'vf_id' underflow 's32min-2147483646' Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull power management regression fix from Rafael Wysocki: "Fix problems with switching cpufreq drivers on some x86 systems with ACPI (and with changing the operation modes of the intel_pstate driver on those systems) introduced by recent changes related to the management of frequency limits in cpufreq" * tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: QoS: Invalidate frequency QoS requests after removal
-
git://anongit.freedesktop.org/drm/drmLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Two sets of fixes in here, one for amdgpu, and one for i915. The amdgpu ones are pretty small, i915's CI system seems to have a few problems in the last week or so, there is one major regression fix for fb_mmap, but there are a bunch of other issues fixed in there as well, oops, screen flashes and rcu related. amdgpu: - Remove experimental flag for navi14 - Fix confusing power message failures on older VI parts - Hang fix for gfxoff when using the read register interface - Two stability regression fixes for Raven i915: - Fix kernel oops on dumb_create ioctl on no crtc situation - Fix bad ugly colored flash on VLV/CHV related to gamma LUT update - Fix unity of the frequencies reported on PMU - Fix kernel oops on set_page_dirty using better locks around it - Protect the request pointer with RCU to prevent it being freed while we might need still - Make pool objects read-only - Restore physical addresses for fb_map to avoid corrupted page table" * tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm: drm/i915/fbdev: Restore physical addresses for fb_mmap() Revert "drm/amd/display: enable S/G for RAVEN chip" drm/amdgpu: disable gfxoff on original raven drm/amdgpu: disable gfxoff when using register read interface drm/amd/powerplay: correct fine grained dpm force level setting drm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs drm/amdgpu: remove experimental flag for Navi14 drm/i915: make pool objects read-only drm/i915: Protect request peeking with RCU drm/i915/userptr: Try to acquire the page lock around set_page_dirty() drm/i915/pmu: "Frequency" is reported as accumulated cycles drm/i915: Preload LUTs if the hw isn't currently using them drm/i915: Don't oops in dumb_create ioctl if we have no crtcs
-
Andrey Ryabinin authored
It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in remove_stable_node() when it races with __mmput() and squeezes in between ksm_exit() and exit_mmap(). WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150 Call Trace: remove_all_stable_nodes+0x12b/0x330 run_store+0x4ef/0x7b0 kernfs_fop_write+0x200/0x420 vfs_write+0x154/0x450 ksys_write+0xf9/0x1d0 do_syscall_64+0x99/0x510 entry_SYSCALL_64_after_hwframe+0x49/0xbe Remove the warning as there is nothing scary going on. Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com Fixes: cbf86cfe ("ksm: remove old stable nodes more thoroughly") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Hildenbrand authored
Let's limit shrinking to !ZONE_DEVICE so we can fix the current code. We should never try to touch the memmap of offline sections where we could have uninitialized memmaps and could trigger BUGs when calling page_to_nid() on poisoned pages. There is no reliable way to distinguish an uninitialized memmap from an initialized memmap that belongs to ZONE_DEVICE, as we don't have anything like SECTION_IS_ONLINE we can use similar to pfn_to_online_section() for !ZONE_DEVICE memory. E.g., set_zone_contiguous() similarly relies on pfn_to_online_section() and will therefore never set a ZONE_DEVICE zone consecutive. Stopping to shrink the ZONE_DEVICE therefore results in no observable changes, besides /proc/zoneinfo indicating different boundaries - something we can totally live with. Before commit d0dc12e8 ("mm/memory_hotplug: optimize memory hotplug"), the memmap was initialized with 0 and the node with the right value. So the zone might be wrong but not garbage. After that commit, both the zone and the node will be garbage when touching uninitialized memmaps. Toshiki reported a BUG (race between delayed initialization of ZONE_DEVICE memmaps without holding the memory hotplug lock and concurrent zone shrinking). https://lkml.org/lkml/2019/11/14/1040 "Iteration of create and destroy namespace causes the panic as below: kernel BUG at mm/page_alloc.c:535! CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0 Call Trace: memmap_init_zone_device+0x165/0x17c memremap_pages+0x4c1/0x540 devm_memremap_pages+0x1d/0x60 pmem_attach_disk+0x16b/0x600 [nd_pmem] nvdimm_bus_probe+0x69/0x1c0 really_probe+0x1c2/0x3e0 driver_probe_device+0xb4/0x100 device_driver_attach+0x4f/0x60 bind_store+0xc9/0x110 kernfs_fop_write+0x116/0x190 vfs_write+0xa5/0x1a0 ksys_write+0x59/0xd0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 While creating a namespace and initializing memmap, if you destroy the namespace and shrink the zone, it will initialize the memmap outside the zone and trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page) in set_pfnblock_flags_mask()." This BUG is also mitigated by this commit, where we for now stop to shrink the ZONE_DEVICE zone until we can do it in a safe and clean way. Link: http://lkml.kernel.org/r/20191006085646.5768-5-david@redhat.com Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e8] Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Damian Tometzki <damian.tometzki@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jun Yao <yaojun8558363@gmail.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Pankaj Gupta <pagupta@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qian Cai <cai@lca.pw> Cc: Rich Felker <dalias@libc.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> [4.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joseph Qi authored
This reverts commit 56e94ea1. Commit 56e94ea1 ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()") introduces a regression that fail to create directory with mount option user_xattr and acl. Actually the reported NULL pointer dereference case can be correctly handled by loc->xl_ops->xlo_add_entry(), so revert it. Link: http://lkml.kernel.org/r/1573624916-83825-1-git-send-email-joseph.qi@linux.alibaba.com Fixes: 56e94ea1 ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Thomas Voegtle <tv@lio96.de> Acked-by: Changwei Ge <gechangwei@live.cn> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Pankaj Sharma authored
The function m_can_runtime_resume() is getting recursively called from m_can_class_resume(). This results in a lock up. We need not call m_can_class_resume() during m_can_runtime_resume(). Fixes: f524f829 ("can: m_can: Create a m_can platform framework") Signed-off-by: Pankaj Sharma <pankj.sharma@samsung.com> Signed-off-by: Sriram Dash <sriram.dash@samsung.com> Acked-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Pankaj Sharma authored
The current code is failing during clock prepare enable because of not getting proper clock from platform device. [ 0.852089] Call trace: [ 0.854516] 0xffff0000fa22a668 [ 0.857638] clk_prepare+0x20/0x34 [ 0.861019] m_can_runtime_resume+0x2c/0xe4 [ 0.865180] pm_generic_runtime_resume+0x28/0x38 [ 0.869770] __rpm_callback+0x16c/0x1bc [ 0.873583] rpm_callback+0x24/0x78 [ 0.877050] rpm_resume+0x428/0x560 [ 0.880517] __pm_runtime_resume+0x7c/0xa8 [ 0.884593] m_can_clk_start.isra.9.part.10+0x1c/0xa8 [ 0.889618] m_can_class_register+0x138/0x370 [ 0.893950] m_can_plat_probe+0x120/0x170 [ 0.897939] platform_drv_probe+0x4c/0xa0 [ 0.901924] really_probe+0xd8/0x31c [ 0.905477] driver_probe_device+0x58/0xe8 [ 0.909551] device_driver_attach+0x68/0x70 [ 0.913711] __driver_attach+0x9c/0xf8 [ 0.917437] bus_for_each_dev+0x50/0xa0 [ 0.921251] driver_attach+0x20/0x28 [ 0.924804] bus_add_driver+0x148/0x1fc [ 0.928617] driver_register+0x6c/0x124 [ 0.932431] __platform_driver_register+0x48/0x50 [ 0.937113] m_can_plat_driver_init+0x18/0x20 [ 0.941446] do_one_initcall+0x4c/0x19c [ 0.945259] kernel_init_freeable+0x1d0/0x280 [ 0.949591] kernel_init+0x10/0x100 [ 0.953057] ret_from_fork+0x10/0x18 [ 0.956614] Code: 00000000 00000000 00000000 00000000 (fa22a668) [ 0.962681] ---[ end trace 881f71bd609de763 ]--- [ 0.967301] Kernel panic - not syncing: Attempted to kill init! A device driver for CAN controller hardware registers itself with the Linux network layer as a network device. So, the driver data for m_can should ideally be of type net_device. Fixes: f524f829 ("can: m_can: Create a m_can platform framework") Signed-off-by: Pankaj Sharma <pankj.sharma@samsung.com> Signed-off-by: Sriram Dash <sriram.dash@samsung.com> Acked-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Toke Høiland-Jørgensen authored
The previous commit added the ability to throttle stations when they queue too much airtime in the hardware. This commit enables the functionality by calculating the expected airtime usage of each packet that is dequeued from the TXQs in mac80211, and accounting that as pending airtime. The estimated airtime for each skb is stored in the tx_info, so we can subtract the same amount from the running total when the skb is freed or recycled. The throttling mechanism relies on this accounting to be accurate (i.e., that we are not freeing skbs without subtracting any airtime they were accounted for), so we put the subtraction into ieee80211_report_used_skb(). As an optimisation, we also subtract the airtime on regular TX completion, zeroing out the value stored in the packet afterwards, to avoid having to do an expensive lookup of the station from the packet data on every packet. This patch does *not* include any mechanism to wake a throttled TXQ again, on the assumption that this will happen anyway as a side effect of whatever freed the skb (most commonly a TX completion). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20191119060610.76681-5-kyan@google.comSigned-off-by: Johannes Berg <johannes.berg@intel.com>
-
Kan Yan authored
In order for the Fq_CoDel algorithm integrated in mac80211 layer to operate effectively to control excessive queueing latency, the CoDel algorithm requires an accurate measure of how long packets stays in the queue, AKA sojourn time. The sojourn time measured at the mac80211 layer doesn't include queueing latency in the lower layer (firmware/hardware) and CoDel expects lower layer to have a short queue. However, most 802.11ac chipsets offload tasks such TX aggregation to firmware or hardware, thus have a deep lower layer queue. Without a mechanism to control the lower layer queue size, packets only stay in mac80211 layer transiently before being sent to firmware queue. As a result, the sojourn time measured by CoDel in the mac80211 layer is almost always lower than the CoDel latency target, hence CoDel does little to control the latency, even when the lower layer queue causes excessive latency. The Byte Queue Limits (BQL) mechanism is commonly used to address the similar issue with wired network interface. However, this method cannot be applied directly to the wireless network interface. "Bytes" is not a suitable measure of queue depth in the wireless network, as the data rate can vary dramatically from station to station in the same network, from a few Mbps to over Gbps. This patch implements an Airtime-based Queue Limit (AQL) to make CoDel work effectively with wireless drivers that utilized firmware/hardware offloading. AQL allows each txq to release just enough packets to the lower layer to form 1-2 large aggregations to keep hardware fully utilized and retains the rest of the frames in mac80211 layer to be controlled by the CoDel algorithm. Signed-off-by: Kan Yan <kyan@google.com> [ Toke: Keep API to set pending airtime internal, fix nits in commit msg ] Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20191119060610.76681-4-kyan@google.comSigned-off-by: Johannes Berg <johannes.berg@intel.com>
-
Toke Høiland-Jørgensen authored
Felix recently added code to calculate airtime of packets to the mt76 driver. Import this into mac80211 so we can use it for airtime queue limit calculations. The airtime.c file is copied verbatim from the mt76 driver, and adjusted to be usable in mac80211. This involves: - Switching to mac80211 data structures. - Adding support for 160 MHz channels and HE mode. - Moving the symbol and duration calculations around a bit to avoid rounding with the higher rates and longer symbol times used for HE rates. The per-rate TX rate calculation is also split out to its own function so it can be used directly for the AQL calculations later. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20191119060610.76681-3-kyan@google.com [fix HE_GROUP_IDX() to use 3 * bw, since there are 3 _gi values] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Taehee Yoo authored
When virt_wifi interface is created, virt_wifi_newlink() is called and it calls register_netdevice(). if register_netdevice() fails, it internally would call ->priv_destructor(), which is virt_wifi_net_device_destructor() and it frees netdev. but virt_wifi_newlink() still use netdev. So, use-after-free would occur in virt_wifi_newlink(). Test commands: ip link add dummy0 type dummy modprobe bonding ip link add bonding_masters link dummy0 type virt_wifi Splat looks like: [ 202.220554] BUG: KASAN: use-after-free in virt_wifi_newlink+0x88b/0x9a0 [virt_wifi] [ 202.221659] Read of size 8 at addr ffff888061629cb8 by task ip/852 [ 202.222896] CPU: 1 PID: 852 Comm: ip Not tainted 5.4.0-rc5 #3 [ 202.223765] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 202.225073] Call Trace: [ 202.225532] dump_stack+0x7c/0xbb [ 202.226869] print_address_description.constprop.5+0x1be/0x360 [ 202.229362] __kasan_report+0x12a/0x16f [ 202.230714] kasan_report+0xe/0x20 [ 202.232595] virt_wifi_newlink+0x88b/0x9a0 [virt_wifi] [ 202.233370] __rtnl_newlink+0xb9f/0x11b0 [ 202.244909] rtnl_newlink+0x65/0x90 [ ... ] Cc: stable@vger.kernel.org Fixes: c7cdba31 ("mac80211-next: rtnetlink wifi simulation device") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20191121122645.9355-1-ap420073@gmail.com [trim stack dump a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Thomas Pedersen authored
Commit 7b6ddeaf ("mac80211: use QoS NDP for AP probing") let STAs send QoS Null frames as PS triggers if the AP was a QoS STA. However, the mac80211 PS stack relies on an interface flag IEEE80211_STA_NULLFUNC_ACKED for determining trigger frame ACK, which was not being set for acked non-QoS Null frames. The effect is an inability to trigger hardware sleep via IEEE80211_CONF_PS since the QoS Null frame was seemingly never acked. This bug only applies to drivers which set both IEEE80211_HW_REPORTS_TX_ACK_STATUS and IEEE80211_HW_PS_NULLFUNC_STACK. Detect the acked QoS Null frame to restore STA power save. Fixes: 7b6ddeaf ("mac80211: use QoS NDP for AP probing") Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20191119053538.25979-4-thomas@adapt-ip.comSigned-off-by: Johannes Berg <johannes.berg@intel.com>
-
Thomas Pedersen authored
This is useful during testing to eg. check the currently configured HW power save state. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20191119053538.25979-3-thomas@adapt-ip.comSigned-off-by: Johannes Berg <johannes.berg@intel.com>
-
Toke Høiland-Jørgensen authored
In ieee80211_tx_status() we don't have an sdata struct when looking up the destination sta. Instead, we just do a lookup by the vif addr that is the source of the packet being completed. Factor this out into a new sta_info getter helper, since we need to use it for accounting AQL as well. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20191112130835.382062-1-toke@redhat.com [remove internal rcu_read_lock(), document instead] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Johannes Berg authored
Add a note with a use-case for the monitor-to-dev injection mechanism in mac80211, reported by Ben Greear. Change-Id: I6456997ef9bc40b24ede860b6ef2fed5af49cf44 Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
David S. Miller authored
Haiyang Zhang says: ==================== hv_netvsc: Fix send indirection table offset Fix send indirection table offset issues related to guest and host bugs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
If negotiated NVSP version <= NVSP_PROTOCOL_VERSION_6, the offset may be wrong (too small) due to a host bug. This can cause missing the end of the send indirection table, and add multiple zero entries from leading zeros before the data region. This bug adds extra burden on channel 0. So fix the offset by computing it from the data structure sizes. This will ensure netvsc driver runs normally on unfixed hosts, and future fixed hosts. Fixes: 5b54dac8 ("hyperv: Add support for virtual Receive Side Scaling (vRSS)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
To reach the data region, the existing code adds offset in struct nvsp_5_send_indirect_table on the beginning of this struct. But the offset should be based on the beginning of its container, struct nvsp_message. This bug causes the first table entry missing, and adds an extra zero from the zero pad after the data region. This can put extra burden on the channel 0. So, correct the offset usage. Also add a boundary check to ensure not reading beyond data region. Fixes: 5b54dac8 ("hyperv: Add support for virtual Receive Side Scaling (vRSS)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mao Wenan authored
While using ARCH=mips CROSS_COMPILE=mips-linux-gnu- command to compile, make C=2 drivers/net/ethernet/freescale/enetc/enetc.o one warning can be found: drivers/net/ethernet/freescale/enetc/enetc.c:1439:5: warning: symbol 'enetc_setup_tc_mqprio' was not declared. Should it be static? This patch make symbol enetc_setup_tc_mqprio static. Fixes: 34c6adf1 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload") Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maciej Żenczykowski authored
NET_RAW is less dangerous, so more likely to be available to a process, so check it first to prevent some spurious logging. This matches IP_TRANSPARENT which checks NET_RAW first. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-