- 11 Jul, 2019 36 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2019-07-11 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. For -stable v4.15 ('net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn') For -stable v5.1 ('net/mlx5e: Fix port tunnel GRE entropy control') ('net/mlx5e: Rx, Fix checksum calculation for new hardware') ('net/mlx5e: Fix return value from timeout recover function') ('net/mlx5e: Fix error flow in tx reporter diagnose') For -stable v5.2 ('net/mlx5: E-Switch, Fix default encap mode') Conflict note: This pull request will produce a small conflict when merged with net-next. In drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c Take the hunk from net and replace: esw_offloads_steering_init(esw, vf_nvports, total_nvports); with: esw_offloads_steering_init(esw); ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Saeed Mahameed says: ==================== Mellanox, mlx5 build fixes I know net-next is closed but these patches are fixing some compiler build and warnings issues people have been complaining about. I hope it is not too late, but in case it is a lot of trouble for you, I guess they can wait. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
Fix the following compiler warning: In function ‘esw_vport_add_ingress_acl_modify_metadata’: the frame size of 1084 bytes is larger than 1024 bytes [-Wframe-larger-than=] Since the structure is never written to, we can statically allocate it to avoid the stack usage. Fixes: 7445cfb1 ("net/mlx5: E-Switch, Tag packet with vport number in VF vports and uplink ingress ACLs") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
In mlx5e_setup_tc "priv" variable is not being used if CONFIG_MLX5_ESWITCH is off, one way to fix this is to actually use it. mlx5e_setup_tc_mqprio also needs the "priv" variable and it extracts it on its own. We can simply pass priv to mlx5e_setup_tc_mqprio instead of netdev and avoid extracting the priv var, which will also resolve the compiler warning. Fixes: 4e95bc26 ("net: flow_offload: add flow_block_cb_setup_simple()") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> CC: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
In the cited patch below, the Kconfig flags combination of: CONFIG_MLX5_FPGA is not set CONFIG_MLX5_TLS=y CONFIG_MLX5_EN_TLS=y leads to the compilation error: ./include/linux/mlx5/device.h:61:39: error: invalid application of sizeof to incomplete type struct mlx5_ifc_tls_flow_bits. Fix it. Fixes: 90687e1a9a50 ("net/mlx5: Kconfig, Better organize compilation flags") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> CC: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
fl_create() should call static_branch_deferred_inc() only in case of success. Also we should not call fl_free() in error path, as this could cause a static key imbalance. jump label: negative count! WARNING: CPU: 0 PID: 15907 at kernel/jump_label.c:221 static_key_slow_try_dec kernel/jump_label.c:221 [inline] WARNING: CPU: 0 PID: 15907 at kernel/jump_label.c:221 static_key_slow_try_dec+0x1ab/0x1d0 kernel/jump_label.c:206 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 15907 Comm: syz-executor.2 Not tainted 5.2.0-rc6+ #62 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2cb/0x744 kernel/panic.c:219 __warn.cold+0x20/0x4d kernel/panic.c:576 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:179 [inline] fixup_bug arch/x86/kernel/traps.c:174 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986 RIP: 0010:static_key_slow_try_dec kernel/jump_label.c:221 [inline] RIP: 0010:static_key_slow_try_dec+0x1ab/0x1d0 kernel/jump_label.c:206 Code: c0 e8 e9 3e e5 ff 83 fb 01 0f 85 32 ff ff ff e8 5b 3d e5 ff 45 31 ff eb a0 e8 51 3d e5 ff 48 c7 c7 40 99 92 87 e8 13 75 b7 ff <0f> 0b eb 8b 4c 89 e7 e8 a9 c0 1e 00 e9 de fe ff ff e8 bf 6d b7 ff RSP: 0018:ffff88805f9c7450 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 0000000000000000 RDX: 000000000000e3e1 RSI: ffffffff815adb06 RDI: ffffed100bf38e7c RBP: ffff88805f9c74e0 R08: ffff88806acf0700 R09: ffffed1015d060a9 R10: ffffed1015d060a8 R11: ffff8880ae830547 R12: ffffffff89832ce0 R13: ffff88805f9c74b8 R14: 1ffff1100bf38e8b R15: 00000000ffffff01 __static_key_slow_dec_deferred+0x65/0x110 kernel/jump_label.c:272 fl_free+0xa9/0xe0 net/ipv6/ip6_flowlabel.c:121 fl_create+0x6af/0x9f0 net/ipv6/ip6_flowlabel.c:457 ipv6_flowlabel_opt+0x80e/0x2730 net/ipv6/ip6_flowlabel.c:624 do_ipv6_setsockopt.isra.0+0x2119/0x4100 net/ipv6/ipv6_sockglue.c:825 ipv6_setsockopt+0xf6/0x170 net/ipv6/ipv6_sockglue.c:944 tcp_setsockopt net/ipv4/tcp.c:3131 [inline] tcp_setsockopt+0x8f/0xe0 net/ipv4/tcp.c:3125 sock_common_setsockopt+0x94/0xd0 net/core/sock.c:3130 __sys_setsockopt+0x253/0x4b0 net/socket.c:2080 __do_sys_setsockopt net/socket.c:2096 [inline] __se_sys_setsockopt net/socket.c:2093 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:2093 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4597c9 Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2670556c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004597c9 RDX: 0000000000000020 RSI: 0000000000000029 RDI: 0000000000000003 RBP: 000000000075bfc8 R08: 000000000000fdf7 R09: 0000000000000000 R10: 0000000020000000 R11: 0000000000000246 R12: 00007f26705576d4 R13: 00000000004cec00 R14: 00000000004dd520 R15: 00000000ffffffff Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: 59c820b2 ("ipv6: elide flowlabel check if no exclusive leases exist") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Willem forgot to change one of the calls to fl6_sock_lookup(), which can now return an error or NULL. syzbot reported : kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 31763 Comm: syz-executor.0 Not tainted 5.2.0-rc6+ #63 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:ip6_datagram_dst_update+0x559/0xc30 net/ipv6/datagram.c:83 Code: 00 00 e8 ea 29 3f fb 4d 85 f6 0f 84 96 04 00 00 e8 dc 29 3f fb 49 8d 7e 20 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 16 06 00 00 4d 8b 6e 20 e8 b4 29 3f fb 4c 89 ee RSP: 0018:ffff88809ba97ae0 EFLAGS: 00010207 RAX: dffffc0000000000 RBX: ffff8880a81254b0 RCX: ffffc90008118000 RDX: 0000000000000003 RSI: ffffffff86319a84 RDI: 000000000000001e RBP: ffff88809ba97c10 R08: ffff888065e9e700 R09: ffffed1015d26c80 R10: ffffed1015d26c7f R11: ffff8880ae9363fb R12: ffff8880a8124f40 R13: 0000000000000001 R14: fffffffffffffffe R15: ffff88809ba97b40 FS: 00007f38e606a700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000202c0140 CR3: 00000000a026a000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __ip6_datagram_connect+0x5e9/0x1390 net/ipv6/datagram.c:246 ip6_datagram_connect+0x30/0x50 net/ipv6/datagram.c:269 ip6_datagram_connect_v6_only+0x69/0x90 net/ipv6/datagram.c:281 inet_dgram_connect+0x14a/0x2d0 net/ipv4/af_inet.c:571 __sys_connect+0x264/0x330 net/socket.c:1824 __do_sys_connect net/socket.c:1835 [inline] __se_sys_connect net/socket.c:1832 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1832 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4597c9 Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f38e6069c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004597c9 RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003 RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f38e606a6d4 R13: 00000000004bfd07 R14: 00000000004d1838 R15: 00000000ffffffff Modules linked in: RIP: 0010:ip6_datagram_dst_update+0x559/0xc30 net/ipv6/datagram.c:83 Code: 00 00 e8 ea 29 3f fb 4d 85 f6 0f 84 96 04 00 00 e8 dc 29 3f fb 49 8d 7e 20 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 16 06 00 00 4d 8b 6e 20 e8 b4 29 3f fb 4c 89 ee Fixes: 59c820b2 ("ipv6: elide flowlabel check if no exclusive leases exist") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
In 323a53c4 ("ipv6: tcp: enable flowlabel reflection in some RST packets") and 50a8accf ("ipv6: tcp: send consistent flowlabel in TIME_WAIT state") we took care of IPv6 flowlabel reflections for two cases. This patch takes care of the remaining case, when the RST packet is sent on behalf of a 'full' socket. In Marek use case, this was a socket in TCP_CLOSE state. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Marek Majkowski <marek@cloudflare.com> Tested-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
yangxingwu authored
The length of AH header is computed manually as (hp->hdrlen+2)<<2. However, in include/linux/ipv6.h, a macro named ipv6_authlen is already defined for exactly the same job. This commit replaces the manual computation code with the macro. Signed-off-by: yangxingwu <xingwu.yang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
Switching from ->priv_destructor to dellink() has an unexpected consequence: existing RCU readers, that is, hsr_port_get_hsr() callers, may still be able to read the port list. Instead of checking the return value of each hsr_port_get_hsr(), we can just move it to ->ndo_uninit() which is called after device unregister and synchronize_net(), and we still have RTNL lock there. Fixes: b9a1e627 ("hsr: implement dellink to clean up resources") Fixes: edf070a0 ("hsr: fix a NULL pointer deref in hsr_dev_xmit()") Reported-by: syzbot+097ef84cdc95843fbaa8@syzkaller.appspotmail.com Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Arguments are supposed to be ordered high then low. Fixes: 293e4365 ("stmmac: change descriptor layout") Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Arguments are supposed to be ordered high then low. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petar Penkov authored
Rules matching on loopback iif do not need early flow dissection as the packet originates from the host. Stop counting such rules in fib_rule_requires_fldissect Signed-off-by: Petar Penkov <ppenkov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Aya Levin authored
Check return value from mlx5e_attach_netdev, add error path on failure. Fixes: 48935bbb ("net/mlx5e: IPoIB, Add netdevice profile skeleton") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Aya Levin authored
Fix tx reporter's diagnose callback. Propagate error when failing to gather diagnostics information or failing to print diagnostic data per queue. Fixes: de8650a8 ("net/mlx5e: Add tx reporter support") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Aya Levin authored
Fix timeout recover function to return a meaningful return value. When an interrupt was not sent by the FW, return IO error instead of 'true'. Fixes: c7981bea ("net/mlx5e: Fix return status of TX reporter timeout recover") Signed-off-by: Aya Levin <ayal@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Saeed Mahameed authored
CQE checksum full mode in new HW, provides a full checksum of rx frame. Covering bytes starting from eth protocol up to last byte in the received frame (frame_size - ETH_HLEN), as expected by the stack. Fixing up skb->csum by the driver is not required in such case. This fix is to avoid wrong checksum calculation in drivers which already support the new hardware with the new checksum mode. Fixes: 85327a9c ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Eli Britstein authored
GRE entropy calculation is a single bit per card, and not per port. Force disable GRE entropy calculation upon the first GRE encap rule, and release the force at the last GRE encap rule removal. This is done per port. Fixes: 97417f61 ("net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation") Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Maor Gottlieb authored
Encap mode is related to switchdev mode only. Move the init of the encap mode to eswitch_offloads. Before this change, we reported that eswitch supports encap, even tough the device was in non SRIOV mode. Fixes: 7768d197 ('net/mlx5: E-Switch, Add control for encapsulation') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI fix from Rafael Wysocki: "Revert a recent ACPICA commit causing systems to hang at boot time" * tag 'acpi-5.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPICA: Update table load object initialization"
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds authored
Pull networking updates from David Miller: "Some highlights from this development cycle: 1) Big refactoring of ipv6 route and neigh handling to support nexthop objects configurable as units from userspace. From David Ahern. 2) Convert explored_states in BPF verifier into a hash table, significantly decreased state held for programs with bpf2bpf calls, from Alexei Starovoitov. 3) Implement bpf_send_signal() helper, from Yonghong Song. 4) Various classifier enhancements to mvpp2 driver, from Maxime Chevallier. 5) Add aRFS support to hns3 driver, from Jian Shen. 6) Fix use after free in inet frags by allocating fqdirs dynamically and reworking how rhashtable dismantle occurs, from Eric Dumazet. 7) Add act_ctinfo packet classifier action, from Kevin Darbyshire-Bryant. 8) Add TFO key backup infrastructure, from Jason Baron. 9) Remove several old and unused ISDN drivers, from Arnd Bergmann. 10) Add devlink notifications for flash update status to mlxsw driver, from Jiri Pirko. 11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski. 12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes. 13) Various enhancements to ipv6 flow label handling, from Eric Dumazet and Willem de Bruijn. 14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van der Merwe, and others. 15) Various improvements to axienet driver including converting it to phylink, from Robert Hancock. 16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean. 17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana Radulescu. 18) Add devlink health reporting to mlx5, from Moshe Shemesh. 19) Convert stmmac over to phylink, from Jose Abreu. 20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from Shalom Toledo. 21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera. 22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel. 23) Track spill/fill of constants in BPF verifier, from Alexei Starovoitov. 24) Support bounded loops in BPF, from Alexei Starovoitov. 25) Various page_pool API fixes and improvements, from Jesper Dangaard Brouer. 26) Just like ipv4, support ref-countless ipv6 route handling. From Wei Wang. 27) Support VLAN offloading in aquantia driver, from Igor Russkikh. 28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy. 29) Add flower GRE encap/decap support to nfp driver, from Pieter Jansen van Vuuren. 30) Protect against stack overflow when using act_mirred, from John Hurley. 31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen. 32) Use page_pool API in netsec driver, Ilias Apalodimas. 33) Add Google gve network driver, from Catherine Sullivan. 34) More indirect call avoidance, from Paolo Abeni. 35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan. 36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek. 37) Add MPLS manipulation actions to TC, from John Hurley. 38) Add sending a packet to connection tracking from TC actions, and then allow flower classifier matching on conntrack state. From Paul Blakey. 39) Netfilter hw offload support, from Pablo Neira Ayuso" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits) net/mlx5e: Return in default case statement in tx_post_resync_params mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync(). net: dsa: add support for BRIDGE_MROUTER attribute pkt_sched: Include const.h net: netsec: remove static declaration for netsec_set_tx_de() net: netsec: remove superfluous if statement netfilter: nf_tables: add hardware offload support net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload net: flow_offload: add flow_block_cb_is_busy() and use it net: sched: remove tcf block API drivers: net: use flow block API net: sched: use flow block API net: flow_offload: add flow_block_cb_{priv, incref, decref}() net: flow_offload: add list handling functions net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free() net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_* net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND net: flow_offload: add flow_block_cb_setup_simple() net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linuxLinus Torvalds authored
Pull clone3 system call from Christian Brauner: "This adds the clone3 syscall which is an extensible successor to clone after we snagged the last flag with CLONE_PIDFD during the 5.2 merge window for clone(). It cleanly supports all of the flags from clone() and thus all legacy workloads. There are few user visible differences between clone3 and clone. First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse this flag. Second, the CSIGNAL flag is deprecated and will cause EINVAL to be reported. It is superseeded by a dedicated "exit_signal" argument in struct clone_args thus freeing up even more flags. And third, clone3 gives CLONE_PIDFD a dedicated return argument in struct clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr argument. The clone3 uapi is designed to be easy to handle on 32- and 64 bit: /* uapi */ struct clone_args { __aligned_u64 flags; __aligned_u64 pidfd; __aligned_u64 child_tid; __aligned_u64 parent_tid; __aligned_u64 exit_signal; __aligned_u64 stack; __aligned_u64 stack_size; __aligned_u64 tls; }; and a separate kernel struct is used that uses proper kernel typing: /* kernel internal */ struct kernel_clone_args { u64 flags; int __user *pidfd; int __user *child_tid; int __user *parent_tid; int exit_signal; unsigned long stack; unsigned long stack_size; unsigned long tls; }; The system call comes with a size argument which enables the kernel to detect what version of clone_args userspace is passing in. clone3 validates that any additional bytes a given kernel does not know about are set to zero and that the size never exceeds a page. A nice feature is that this patchset allowed us to cleanup and simplify various core kernel codepaths in kernel/fork.c by making the internal _do_fork() function take struct kernel_clone_args even for legacy clone(). This patch also unblocks the time namespace patchset which wants to introduce a new CLONE_TIMENS flag. Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and xtensa. These were the architectures that did not require special massaging. Other architectures treat fork-like system calls individually and after some back and forth neither Arnd nor I felt confident that we dared to add clone3 unconditionally to all architectures. We agreed to leave this up to individual architecture maintainers. This is why there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3 which any architecture can set once it has implemented support for clone3. The patch also adds a cond_syscall(clone3) for architectures such as nios2 or h8300 that generate their syscall table by simply including asm-generic/unistd.h. The hope is to get rid of __ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon" * tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: arch: handle arches who do not yet define clone3 arch: wire-up clone3() syscall fork: add clone3
-
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linuxLinus Torvalds authored
Pull pidfd updates from Christian Brauner: "This adds two main features. - First, it adds polling support for pidfds. This allows process managers to know when a (non-parent) process dies in a race-free way. The notification mechanism used follows the same logic that is currently used when the parent of a task is notified of a child's death. With this patchset it is possible to put pidfds in an {e}poll loop and get reliable notifications for process (i.e. thread-group) exit. - The second feature compliments the first one by making it possible to retrieve pollable pidfds for processes that were not created using CLONE_PIDFD. A lot of processes get created with traditional PID-based calls such as fork() or clone() (without CLONE_PIDFD). For these processes a caller can currently not create a pollable pidfd. This is a problem for Android's low memory killer (LMK) and service managers such as systemd. Both patchsets are accompanied by selftests. It's perhaps worth noting that the work done so far and the work done in this branch for pidfd_open() and polling support do already see some adoption: - Android is in the process of backporting this work to all their LTS kernels [1] - Service managers make use of pidfd_send_signal but will need to wait until we enable waiting on pidfds for full adoption. - And projects I maintain make use of both pidfd_send_signal and CLONE_PIDFD [2] and will use polling support and pidfd_open() too" [1] https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22 https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22 https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22 [2] https://github.com/lxc/lxc/blob/aab6e3eb73c343231cdde775db938994fc6f2803/src/lxc/start.c#L1753 * tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add pidfd_open() tests arch: wire-up pidfd_open() pid: add pidfd_open() pidfd: add polling selftests pidfd: add polling support
-
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68kLinus Torvalds authored
Pull m68k fix from Geert Uytterhoeven: "Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire. This is a fix for an issue detected in next, to avoid introducing build failures when merging Christoph's dma-mapping tree later" * tag 'm68k-for-v5.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire
-
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommuLinus Torvalds authored
Pull m68nommu updates from Greg Ungerer: "A series of cleanups for the FLAT format binary loader, binfmt_flat, from Christoph. The end goal is to support no-MMU on RISC-V, and the last patch enables that" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: riscv: add binfmt_flat support binfmt_flat: don't offset the data start binfmt_flat: move the MAX_SHARED_LIBS definition to binfmt_flat.c binfmt_flat: remove the persistent argument from flat_get_addr_from_rp binfmt_flat: provide an asm-generic/flat.h binfmt_flat: make support for old format binaries optional binfmt_flat: add a ARCH_HAS_BINFMT_FLAT option binfmt_flat: add endianess annotations binfmt_flat: use fixed size type for the on-disk format binfmt_flat: consolidate two version of flat_v2_reloc_t binfmt_flat: remove the unused OLD_FLAT_FLAG_RAM definition binfmt_flat: remove the uapi <linux/flat.h> header binfmt_flat: replace flat_argvp_envp_on_stack with a Kconfig variable binfmt_flat: remove flat_old_ram_flag binfmt_flat: provide a default version of flat_get_relocate_addr binfmt_flat: remove flat_set_persistent binfmt_flat: remove flat_reloc_valid
-
git://linux-nfs.org/~bfields/linuxLinus Torvalds authored
Pull nfsd updates from Bruce Fields: "Highlights: - Add a new /proc/fs/nfsd/clients/ directory which exposes some long-requested information about NFSv4 clients (like open files) and allows forced revocation of client state. - Replace the global duplicate reply cache by a cache per network namespace; previously, a request in one network namespace could incorrectly match an entry from another, though we haven't seen this in production. This is the last remaining container bug that I'm aware of; at this point you should be able to run separate nfsd's in each network namespace, each with their own set of exports, and everything should work. - Cleanup and modify lock code to show the pid of lockd as the owner of NLM locks. This is the correct version of the bugfix originally attempted in b8eee0e9 ("lockd: Show pid of lockd for remote locks")" * tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux: (34 commits) nfsd: Make __get_nfsdfs_client() static nfsd: Make two functions static nfsd: Fix misuse of strlcpy sunrpc/cache: remove the exporting of cache_seq_next nfsd: decode implementation id nfsd: create xdr_netobj_dup helper nfsd: allow forced expiration of NFSv4 clients nfsd: create get_nfsdfs_clp helper nfsd4: show layout stateids nfsd: show lock and deleg stateids nfsd4: add file to display list of client's opens nfsd: add more information to client info file nfsd: escape high characters in binary data nfsd: copy client's address including port number to cl_addr nfsd4: add a client info file nfsd: make client/ directory names small ints nfsd: add nfsd/clients directory nfsd4: use reference count to free client nfsd: rename cl_refcount nfsd: persist nfsd filesystem across mounts ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2Linus Torvalds authored
Pull gfs2 updates from Andreas Gruenbacher: "Some relatively minor changes for gfs2: - An initial batch of obvious cleanups and fixes from Bob's recovery patch queue. - Two iomap conversion patches and some cleanups from Christoph Hellwig. - A cosmetic cleanup from Kefeng Wang (Huawei). - Another minor fix and cleanup by me" * tag 'gfs2-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Remove unused gfs2_iomap_alloc argument gfs2: don't use buffer_heads in gfs2_allocate_page_backing gfs2: use iomap_bmap instead of generic_block_bmap gfs2: mark stuffed_readpage static gfs2: merge gfs2_writepage_common into gfs2_writepage gfs2: merge gfs2_writeback_aops and gfs2_ordered_aops gfs2: remove the unused gfs2_stuffed_write_end function gfs2: use page_offset in gfs2_page_mkwrite gfs2: replace more printk with calls to fs_info and friends gfs2: dump fsid when dumping glock problems gfs2: simplify gfs2_freeze by removing case gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN gfs2: Warn when a journal replay overwrites a rgrp with buffers gfs2: log which portion of the journal is replayed gfs2: eliminate tr_num_revoke_rm gfs2: kthread and remount improvements gfs2: Use IS_ERR_OR_NULL gfs2: Clean up freeing struct gfs2_sbd
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext4 updates from Ted Ts'o: "Many bug fixes and cleanups, and an optimization for case-insensitive lookups" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix coverity warning on error path of filename setup ext4: replace ktype default_attrs with default_groups ext4: rename htree_inline_dir_to_tree() to ext4_inlinedir_to_tree() ext4: refactor initialize_dirent_tail() ext4: rename "dirent_csum" functions to use "dirblock" ext4: allow directory holes jbd2: drop declaration of journal_sync_buffer() ext4: use jbd2_inode dirty range scoping jbd2: introduce jbd2_inode dirty range scoping mm: add filemap_fdatawait_range_keep_errors() ext4: remove redundant assignment to node ext4: optimize case-insensitive lookups ext4: make __ext4_get_inode_loc plug ext4: clean up kerneldoc warnigns when building with W=1 ext4: only set project inherit bit for directory ext4: enforce the immutable flag on open files ext4: don't allow any modifications to an immutable file jbd2: fix typo in comment of journal_submit_inode_data_buffers jbd2: fix some print format mistakes ext4: gracefully handle ext4_break_layouts() failure during truncate
-
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fsLinus Torvalds authored
Pull afs updates from David Howells: "A set of minor changes for AFS: - Remove an unnecessary check in afs_unlink() - Add a tracepoint for tracking callback management - Add a tracepoint for afs_server object usage - Use struct_size() - Add mappings for AFS UAE abort codes to Linux error codes, using symbolic names rather than hex numbers in the .c file" * tag 'afs-next-20190628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Add support for the UAE error table fs/afs: use struct_size() in kzalloc() afs: Trace afs_server usage afs: Add some callback management tracepoints afs: afs_unlink() doesn't need to check dentry->d_inode
-
git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds authored
Pull fscrypt updates from Eric Biggers: - Preparations for supporting encryption on ext4 filesystems where the filesystem block size is smaller than PAGE_SIZE. - Don't allow setting encryption policies on dead directories. - Various cleanups. * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: document testing with xfstests fscrypt: remove selection of CONFIG_CRYPTO_SHA256 fscrypt: remove unnecessary includes of ratelimit.h fscrypt: don't set policy for a dead directory ext4: encrypt only up to last block in ext4_bio_write_page() ext4: decrypt only the needed block in __ext4_block_zero_page_range() ext4: decrypt only the needed blocks in ext4_block_write_begin() ext4: clear BH_Uptodate flag on decryption error fscrypt: decrypt only the needed blocks in __fscrypt_decrypt_bio() fscrypt: support decrypting multiple filesystem blocks per page fscrypt: introduce fscrypt_decrypt_block_inplace() fscrypt: handle blocksize < PAGE_SIZE in fscrypt_zeroout_range() fscrypt: support encrypting multiple filesystem blocks per page fscrypt: introduce fscrypt_encrypt_block_inplace() fscrypt: clean up some BUG_ON()s in block encryption/decryption fscrypt: rename fscrypt_do_page_crypto() to fscrypt_crypt_block() fscrypt: remove the "write" part of struct fscrypt_ctx fscrypt: simplify bounce page handling
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds authored
Pull copy_file_range updates from Darrick Wong: "This fixes numerous parameter checking problems and inconsistent behaviors in the new(ish) copy_file_range system call. Now the system call will actually check its range parameters correctly; refuse to copy into files for which the caller does not have sufficient privileges; update mtime and strip setuid like file writes are supposed to do; and allows copying up to the EOF of the source file instead of failing the call like we used to. Summary: - Create a generic copy_file_range handler and make individual filesystems responsible for calling it (i.e. no more assuming that do_splice_direct will work or is appropriate) - Refactor copy_file_range and remap_range parameter checking where they are the same - Install missing copy_file_range parameter checking(!) - Remove suid/sgid and update mtime like any other file write - Change the behavior so that a copy range crossing the source file's eof will result in a short copy to the source file's eof instead of EINVAL - Permit filesystems to decide if they want to handle cross-superblock copy_file_range in their local handlers" * tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: fuse: copy_file_range needs to strip setuid bits and update timestamps vfs: allow copy_file_range to copy across devices xfs: use file_modified() helper vfs: introduce file_modified() helper vfs: add missing checks to copy_file_range vfs: remove redundant checks from generic_remap_checks() vfs: introduce generic_file_rw_checks() vfs: no fallback for ->copy_file_range vfs: introduce generic_copy_file_range()
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds authored
Pull iomap updates from Darrick Wong: "There are a few fixes for gfs2 but otherwise it's pretty quiet so far. - Only mark inode dirty at the end of writing to a file (instead of once for every page written). - Fix for an accounting error in the page_done callback" * tag 'iomap-5.3-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: fix page_done callback for short writes fs: fold __generic_write_end back into generic_write_end iomap: don't mark the inode dirty in iomap_write_end
-
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fsLinus Torvalds authored
Pull ext2, udf and quota updates from Jan Kara: - some ext2 fixes and cleanups - a fix of udf bug when extending files - a fix of quota Q_XGETQSTAT[V] handling * tag 'for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix incorrect final NOT_ALLOCATED (hole) extent length ext2: Use kmemdup rather than duplicating its implementation quota: honor quota type in Q_XGETQSTAT[V] calls ext2: Always brelse bh on failure in ext2_iget() ext2: add missing brelse() in ext2_iget() ext2: Fix a typo in ext2_getattr argument ext2: fix a typo in comment ext2: add missing brelse() in ext2_new_inode() ext2: optimize ext2_xattr_get() ext2: introduce new helper for xattr entry comparison ext2: merge xattr next entry check to ext2_xattr_entry_valid() ext2: code cleanup for ext2_preread_inode() ext2: code cleanup by using test_opt() and clear_opt() doc: ext2: update description of quota options for ext2 ext2: Strengthen xattr block checks ext2: Merge loops in ext2_xattr_set() ext2: introduce helper for xattr entry validation ext2: introduce helper for xattr header validation quota: add dqi_dirty_list description to comment of Dquot List Management
-
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fsLinus Torvalds authored
Pull fsnotify updates from Jan Kara: "This contains cleanups of the fsnotify name removal hook and also a patch to disable fanotify permission events for 'proc' filesystem" * tag 'fsnotify_for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: get rid of fsnotify_nameremove() fsnotify: move fsnotify_nameremove() hook out of d_delete() configfs: call fsnotify_rmdir() hook debugfs: call fsnotify_{unlink,rmdir}() hooks debugfs: simplify __debugfs_remove_file() devpts: call fsnotify_unlink() hook tracefs: call fsnotify_{unlink,rmdir}() hooks rpc_pipefs: call fsnotify_{unlink,rmdir}() hooks btrfs: call fsnotify_rmdir() hook fsnotify: add empty fsnotify_{unlink,rmdir}() hooks fanotify: Disallow permission events for proc filesystem
-
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linuxLinus Torvalds authored
Pull file locking updates from Jeff Layton: "Just a couple of small lease-related patches this cycle. One from Ira to add a new tracepoint that fires during lease conflict checks, and another patch from Amir to reduce false positives when checking for lease conflicts" * tag 'locks-v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: locks: eliminate false positive conflicts for write lease locks: Add trace_leases_conflict
-
Linus Torvalds authored
Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" This reverts merge 0f75ef6a (and thus effectively commits 7a1ade84 ("keys: Provide KEYCTL_GRANT_PERMISSION") 2e12256b ("keys: Replace uid/gid/perm permissions checking with an ACL") that the merge brought in). It turns out that it breaks booting with an encrypted volume, and Eric biggers reports that it also breaks the fscrypt tests [1] and loading of in-kernel X.509 certificates [2]. The root cause of all the breakage is likely the same, but David Howells is off email so rather than try to work it out it's getting reverted in order to not impact the rest of the merge window. [1] https://lore.kernel.org/lkml/20190710011559.GA7973@sol.localdomain/ [2] https://lore.kernel.org/lkml/20190710013225.GB7973@sol.localdomain/ Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/Reported-by: Eric Biggers <ebiggers@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 Jul, 2019 4 commits
-
-
Rafael J. Wysocki authored
Revert commit c522ad06 ("ACPICA: Update table load object initialization") as it causes systems to hang on attempts to load OEM ACPI tables. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Steven J. Magnani authored
In some cases, using the 'truncate' command to extend a UDF file results in a mismatch between the length of the file's extents (specifically, due to incorrect length of the final NOT_ALLOCATED extent) and the information (file) length. The discrepancy can prevent other operating systems (i.e., Windows 10) from opening the file. Two particular errors have been observed when extending a file: 1. The final extent is larger than it should be, having been rounded up to a multiple of the block size. B. The final extent is not shorter than it should be, due to not having been updated when the file's information length was increased. [JK: simplified udf_do_extend_final_block(), fixed up some types] Fixes: 2c948b3f ("udf: Avoid IO in udf_clear_inode") CC: stable@vger.kernel.org Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Link: https://lore.kernel.org/r/1561948775-5878-1-git-send-email-steve@digidescorp.comSigned-off-by: Jan Kara <jack@suse.cz>
-
Nathan Chancellor authored
clang warns: drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:251:2: warning: variable 'rec_seq_sz' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] default: ^~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:255:46: note: uninitialized use occurs here skip_static_post = !memcmp(rec_seq, &rn_be, rec_seq_sz); ^~~~~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:239:16: note: initialize the variable 'rec_seq_sz' to silence this warning u16 rec_seq_sz; ^ = 0 1 warning generated. This case statement was clearly designed to be one that should not be hit during runtime because of the WARN_ON statement so just return early to prevent copying uninitialized memory up into rn_be. Fixes: d2ead1f3 ("net/mlx5e: Add kTLS TX HW offload support") Link: https://github.com/ClangBuiltLinux/linux/issues/590Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Return value was changes to 'int' from void but this return statement was not updated, or it slipped in via a merge. Fixes: b5d9a834 ("net/tls: don't clear TX resync flag on error") Signed-off-by: David S. Miller <davem@davemloft.net>
-