- 02 Sep, 2016 1 commit
-
-
Michal Hocko authored
There have been several reports about premature OOM killer invocation in the 4.7 kernel, where an order-2 allocation request (for the kernel stack) invoked the OOM killer even during basic workloads (light IO or even kernel compile on some filesystems). In all reported cases the memory is fragmented and there are no order-2+ pages available. There is usually a large amount of slab memory (usually dentries/inodes) and further debugging has shown that there are way too many unmovable blocks which are skipped during the compaction. Multiple reporters have confirmed that the current linux-next which includes [1] and [2] helped and OOMs are not reproducible anymore. A simpler fix for the late rc and stable is to simply ignore the compaction feedback and retry as long as there is reclaim progress and we are not getting OOM for order-0 pages. We already do that for CONFIG_COMPACTION=n so let's reuse the same code when compaction is enabled as well. [1] http://lkml.kernel.org/r/20160810091226.6709-1-vbabka@suse.cz [2] http://lkml.kernel.org/r/f7a9ea9d-bb88-bfd6-e340-3a933559305a@suse.cz Fixes: 0a0337e0 ("mm, oom: rework oom detection") Link: http://lkml.kernel.org/r/20160823074339.GB23577@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Tested-by: Olaf Hering <olaf@aepfle.de> Tested-by: Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Cc: Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [4.7.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 31 Aug, 2016 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds authored
Pull crypto fixes from Herbert Xu: "This fixes the following issues: - Kconfig problem that prevented mxc-rnga from being enabled - bogus key sizes in qat aes-xts - buggy aes-xts code in vmx" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: vmx - fix null dereference in p8_aes_xts_crypt crypto: qat - fix aes-xts key sizes hwrng: mxc-rnga - Fix Kconfig dependency
-
Linus Torvalds authored
We used to delay switching to the new credentials until after we had mapped the executable (and possible elf interpreter). That was kind of odd to begin with, since the new executable will actually then _run_ with the new creds, but whatever. The bigger problem was that we also want to make sure that we turn off prof events and tracing before we start mapping the new executable state. So while this is a cleanup, it's also a fix for a possible information leak. Reported-by: Robert Święcki <robert@swiecki.net> Tested-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 Aug, 2016 11 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Linus Torvalds authored
Pull seccomp fix from Kees Cook: "Fix fatal signal delivery after ptrace reordering" * tag 'seccomp-v4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Fix tracer exit notifications during fatal signals
-
Kees Cook authored
This fixes a ptrace vs fatal pending signals bug as manifested in seccomp now that seccomp was reordered to happen after ptrace. The short version is that seccomp should not attempt to call do_exit() while fatal signals are pending under a tracer. The existing code was trying to be as defensively paranoid as possible, but it now ends up confusing ptrace. Instead, the syscall can just be skipped (which solves the original concern that the do_exit() was addressing) and normal signal handling, tracer notification, and process death can happen. Paraphrasing from the original bug report: If a tracee task is in a PTRACE_EVENT_SECCOMP trap, or has been resumed after such a trap but not yet been scheduled, and another task in the thread-group calls exit_group(), then the tracee task exits without the ptracer receiving a PTRACE_EVENT_EXIT notification. Test case here: https://gist.github.com/khuey/3c43ac247c72cef8c956ca73281c9be7 The bug happens because when __seccomp_filter() detects fatal_signal_pending(), it calls do_exit() without dequeuing the fatal signal. When do_exit() sends the PTRACE_EVENT_EXIT notification and that task is descheduled, __schedule() notices that there is a fatal signal pending and changes its state from TASK_TRACED to TASK_RUNNING. That prevents the ptracer's waitpid() from returning the ptrace event. A more detailed analysis is here: https://github.com/mozilla/rr/issues/1762#issuecomment-237396255. Reported-by: Robert O'Callahan <robert@ocallahan.org> Reported-by: Kyle Huey <khuey@kylehuey.com> Tested-by: Kyle Huey <khuey@kylehuey.com> Fixes: 93e35efb ("x86/ptrace: run seccomp after ptrace") Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Linus Torvalds authored
Pull MD fixes from Shaohua Li: "This includes several bug fixes: - Alexey Obitotskiy fixed a hang for faulty raid5 array with external management - Song Liu fixed two raid5 journal related bugs - Tomasz Majchrzak fixed a bad block recording issue and an accounting issue for raid10 - ZhengYuan Liu fixed an accounting issue for raid5 - I fixed a potential race condition and memory leak with DIF/DIX enabled - other trivial fixes" * tag 'md/4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: raid5: avoid unnecessary bio data set raid5: fix memory leak of bio integrity data raid10: record correct address of bad block md-cluster: fix error return code in join() r5cache: set MD_JOURNAL_CLEAN correctly md: don't print the same repeated messages about delayed sync operation md: remove obsolete ret in md_start_sync md: do not count journal as spare in GET_ARRAY_INFO md: Prevent IO hold during accessing to faulty raid5 array MD: hold mddev lock to change bitmap location raid5: fix incorrectly counter of conf->empty_inactive_list_nr raid10: increment write counter after bio is split
-
git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds authored
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable patches: - Fix a refcount leak in nfs_callback_up_net - Fix an Oopsable condition when the flexfile pNFS driver connection to the DS fails - Fix an Oopsable condition in NFSv4.1 server callback races - Ensure pNFS clients stop doing I/O to the DS if their lease has expired, as required by the NFSv4.1 protocol Bugfixes: - Fix potential looping in the NFSv4.x migration code - Patch series to close callback races for OPEN, LAYOUTGET and LAYOUTRETURN - Silence WARN_ON when NFSv4.1 over RDMA is in use - Fix a LAYOUTCOMMIT race in the pNFS/blocks client - Fix pNFS timeout issues when the DS fails" * tag 'nfs-for-4.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4.x: Fix a refcount leak in nfs_callback_up_net NFS4: Avoid migration loops pNFS/flexfiles: Fix an Oopsable condition when connection to the DS fails NFSv4.1: Remove obsolete and incorrrect assignment in nfs4_callback_sequence NFSv4.1: Close callback races for OPEN, LAYOUTGET and LAYOUTRETURN NFSv4.1: Defer bumping the slot sequence number until we free the slot NFSv4.1: Delay callback processing when there are referring triples NFSv4.1: Fix Oopsable condition in server callback races SUNRPC: Silence WARN_ON when NFSv4.1 over RDMA is in use pnfs/blocklayout: update last_write_offset atomically with extents pNFS: The client must not do I/O to the DS if it's lease has expired pNFS: Handle NFS4ERR_OLD_STATEID correctly in LAYOUTSTAT calls pNFS/flexfiles: Set reasonable default retrans values for the data channel NFS: Allow the mount option retrans=0 pNFS/flexfiles: Fix layoutstat periodic reporting
-
Josh Poimboeuf authored
There are three usercopy warnings which are currently being silenced for gcc 4.6 and newer: 1) "copy_from_user() buffer size is too small" compile warning/error This is a static warning which happens when object size and copy size are both const, and copy size > object size. I didn't see any false positives for this one. So the function warning attribute seems to be working fine here. Note this scenario is always a bug and so I think it should be changed to *always* be an error, regardless of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS. 2) "copy_from_user() buffer size is not provably correct" compile warning This is another static warning which happens when I enable __compiletime_object_size() for new compilers (and CONFIG_DEBUG_STRICT_USER_COPY_CHECKS). It happens when object size is const, but copy size is *not*. In this case there's no way to compare the two at build time, so it gives the warning. (Note the warning is a byproduct of the fact that gcc has no way of knowing whether the overflow function will be called, so the call isn't dead code and the warning attribute is activated.) So this warning seems to only indicate "this is an unusual pattern, maybe you should check it out" rather than "this is a bug". I get 102(!) of these warnings with allyesconfig and the __compiletime_object_size() gcc check removed. I don't know if there are any real bugs hiding in there, but from looking at a small sample, I didn't see any. According to Kees, it does sometimes find real bugs. But the false positive rate seems high. 3) "Buffer overflow detected" runtime warning This is a runtime warning where object size is const, and copy size > object size. All three warnings (both static and runtime) were completely disabled for gcc 4.6 with the following commit: 2fb0815c ("gcc4: disable __compiletime_object_size for GCC 4.6+") That commit mistakenly assumed that the false positives were caused by a gcc bug in __compiletime_object_size(). 
But in fact, __compiletime_object_size() seems to be working fine. The false positives were instead triggered by #2 above. (Though I don't have an explanation for why the warnings supposedly only started showing up in gcc 4.6.) So remove warning #2 to get rid of all the false positives, and re-enable warnings #1 and #3 by reverting the above commit. Furthermore, since #1 is a real bug which is detected at compile time, upgrade it to always be an error. Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer needed. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Nilay Vaish <nilayvaish@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Linus Torvalds authored
Pull libata fixes from Tejun Heo: "Two libata driver specific fixes for v4.8-rc4. Nothing too scary" * 'for-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: pata_ninja32: Avoid corrupting status flags ahci: disable correct irq for dummy ports
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Linus Torvalds authored
Pull cgroup fixes from Tejun Heo: "Two fixes for cgroup. - There still was a hole in enforcing cpuset rules, fixed by Li. - The recent switch to global percpu_rwsem for threadgroup locking revealed a couple of issues in how percpu_rwsem is implemented and used by cgroup. Balbir found that the read locking section was too wide, unnecessarily including operations which can often depend on IOs. With percpu_rwsem updates (coming through a different tree) and reduction of read locking section, all the reported locking latency issues, including the android one, are resolved. It looks like we can keep global percpu_rwsem locking for now. If there actually are cases which can't be resolved, we can go back to more complex per-signal_struct locking" * 'for-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork cpuset: make sure new tasks conform to the current config of the cpuset
-
Alan Cox authored
Ninja32 needs to set some flags to indicate it does 32bit IO. However, it currently assigns them, which loses the initializing flag and causes a warning spew. Fix it to use a bitwise OR as is intended. Signed-off-by: Alan Cox <alan@linux.intel.com> Tested-by: Ellmar Stelnberger <estellnb@elstel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
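The difference between assigning flags and OR-ing them in can be seen in a small standalone sketch. The flag names and bit values below are hypothetical stand-ins chosen for illustration; they are not the libata definitions, and this is not the driver's code.

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only (not the libata values). */
#define PFLAG_INITIALIZING (1u << 0)
#define PFLAG_PIO32        (1u << 1)
#define PFLAG_PIO32CHANGE  (1u << 2)

/* Buggy pattern: plain assignment discards every bit that was already set. */
unsigned int set_pio32_buggy(unsigned int pflags)
{
    pflags = PFLAG_PIO32 | PFLAG_PIO32CHANGE;
    return pflags;
}

/* Fixed pattern: OR-ing in the new bits preserves the existing ones. */
unsigned int set_pio32_fixed(unsigned int pflags)
{
    pflags |= PFLAG_PIO32 | PFLAG_PIO32CHANGE;
    return pflags;
}

/* Self-check: the initializing bit survives only in the fixed variant. */
void check_flag_handling(void)
{
    unsigned int before = PFLAG_INITIALIZING;

    assert((set_pio32_buggy(before) & PFLAG_INITIALIZING) == 0);
    assert((set_pio32_fixed(before) & PFLAG_INITIALIZING) != 0);
}
```

In the buggy variant the caller's initializing bit is silently cleared, which is the kind of state loss the commit message describes.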
-
Trond Myklebust authored
On error, the callers expect us to return without bumping nn->cb_users[]. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v3.7+
-
Benjamin Coddington authored
If a server returns itself as a location while migrating, the client may end up getting stuck attempting to migrate twice to the same server. Catch this by checking if the nfs_client found is the same as the existing client. For the other two callers to nfs4_set_client, the nfs_client will always be ERR_PTR(-EINVAL). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Linus Torvalds authored
Merge tag 'hwmon-for-linus-v4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fix from Guenter Roeck: "Add missing sysfs attribute group terminator to it87 driver" * tag 'hwmon-for-linus-v4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (it87) Add missing sysfs attribute group terminator
-
- 29 Aug, 2016 24 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Linus Torvalds authored
Pull ext4 fixes from Ted Ts'o: "Fix bugs that could cause kernel deadlocks or file system corruption while moving xattrs to expand the extended inode. Also add some sanity checks to the block group descriptors to make sure we don't end up overwriting the superblock" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: avoid deadlock when expanding inode size ext4: properly align shifted xattrs when expanding inodes ext4: fix xattr shifting when expanding inodes part 2 ext4: fix xattr shifting when expanding inodes ext4: validate that metadata blocks do not overlap superblock ext4: reserve xattr index for the Hurd
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds authored
Pull networking fixes from David Miller: 1) Segregate namespaces properly in conntrack dumps, from Liping Zhang. 2) tcp listener refcount fix in netfilter tproxy, from Eric Dumazet. 3) Fix timeouts in qed driver due to xmit_more, from Yuval Mintz. 4) Fix use-after-free in tcp_xmit_retransmit_queue(). 5) Userspace header fixups (use of __u32, missing includes, etc.) from Mikko Rapeli. 6) Further refinements to fragmentation wrt gso and tunnels, from Shmulik Ladkani. 7) Trigger poll correctly for zero length UDP packets, from Eric Dumazet. 8) TCP window scaling fix, also from Eric Dumazet. 9) SLAB_DESTROY_BY_RCU is not relevant any more for UDP sockets. 10) Module refcount leak in qdisc_create_dflt(), from Eric Dumazet. 11) Fix deadlock in cp_rx_poll() of 8139cp driver, from Gao Feng. 12) Memory leak in rhashtable's alloc_bucket_locks(), from Eric Dumazet. 13) Add new device ID to alx driver, from Owen Lin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits) Add Killer E2500 device ID in alx driver. 
net: smc91x: fix SMC accesses Documentation: networking: dsa: Remove platform device TODO net/mlx5: Increase number of ethtool steering priorities net/mlx5: Add error prints when validate ETS failed net/mlx5e: Fix memory leak if refreshing TIRs fails net/mlx5e: Add ethtool counter for TX xmit_more net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ net/mlx5e: Don't wait for SQ completions on close net/mlx5e: Don't post fragmented MPWQE when RQ is disabled net/mlx5e: Don't wait for RQ completions on close net/mlx5e: Limit UMR length to the device's limitation rhashtable: fix a memory leak in alloc_bucket_locks() sfc: fix potential stack corruption from running past stat bitmask team: loadbalance: push lacpdus to exact delivery net: hns: dereference ppe_cb->ppe_common_cb if it is non-null 8139cp: Fix one possible deadloop in cp_rx_poll i40e: Change some init flow for the client Revert "phy: IRQ cannot be shared" net: dsa: bcm_sf2: Fix race condition while unmasking interrupts ...
-
Trond Myklebust authored
If the attempt to connect to a DS fails inside ff_layout_pg_init_read or ff_layout_pg_init_write, then we currently end up clearing the layout segment carried by the struct nfs_pageio_descriptor, causing an Oops when we later call into ff_layout_read_pagelist/ff_layout_write_pagelist. The fix is to ensure we return the layout and then retry. Fixes: 446ca219 ("pNFS/flexfiles: When initing reads or writes, we...") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Linus Torvalds authored
Merge tag 'platform-drivers-x86-v4.8-4' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86 Pull x86 platform driver fixes from Darren Hart: "Remove module related code from two drivers that are only configurable as built-in: intel_pmic_gpio and platform/olpc" * tag 'platform-drivers-x86-v4.8-4' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: intel_pmic_gpio: Make explicitly non-modular platform/olpc: Make ec explicitly non-modular
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Linus Torvalds authored
Pull powerpc fixes from Ben Herrenschmidt: "This was meant to be sent early last week, but I had a change pending on one of the fixes and other things made me forget all about it. Ugh. We have some misc fixes for powerpc 4.8. Some trivial bits and some regressions, and a trivial cleanup or two that I saw no point in letting rot in patchwork" * tag 'powerpc-4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: signals: Discard transaction state from signal frames powerpc/powernv : Drop reference added by kset_find_obj() powerpc/tm: do not use r13 for tabort_syscall powerpc: move hmi.c to arch/powerpc/kvm/ powerpc: sysdev: cpm: fix gpio save_regs functions powerpc/pseries: PACA save area fix for MCE vs MCE powerpc/pseries: PACA save area fix for general exception vs MCE powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support powerpc, hotplug: Avoid to touch non-existent cpumasks. powerpc: migrate exception table users off module.h and onto extable.h powerpc/powernv/pci: fix iterator signedness powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb) cxl: use pcibios_free_controller_deferred() when removing vPHBs powerpc: mpc8349emitx: Delete unnecessary assignment for the field "owner" powerpc/512x: Delete unnecessary assignment for the field "owner" drivers/macintosh: Delete owner assignment powerpc: cputhreads: Add missing include file
-
Jean Delvare authored
Attribute array it87_attributes_in lacks its NULL terminator, causing random behavior when operating on the attribute group. Fixes: 52929715 ("hwmon: (it87) Use is_visible for voltage sensors") Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
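Why a missing terminator leads to random behavior can be sketched with a sentinel-terminated pointer array. The `struct attr` type, the attribute names, and `count_attrs()` below are hypothetical and only model the general pattern; they are not the hwmon or sysfs API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sysfs attribute; only the name matters here. */
struct attr {
    const char *name;
};

struct attr in0 = { "in0_input" };
struct attr in1 = { "in1_input" };

/* The trailing NULL is the sentinel that tells a walker where to stop. */
struct attr *attrs_terminated[] = {
    &in0,
    &in1,
    NULL,
};

/*
 * Walks the array until it meets the NULL sentinel. Without the sentinel
 * the loop would keep reading past the end of the array, which is
 * undefined behavior.
 */
size_t count_attrs(struct attr *const *attrs)
{
    size_t n = 0;

    assert(attrs != NULL);
    while (attrs[n] != NULL)
        n++;
    return n;
}
```

Because the walker has no separate length argument, the sentinel is the only thing bounding the loop.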
-
Paul Gortmaker authored
The Kconfig entry controlling compilation of this code is: drivers/platform/x86/Kconfig:config GPIO_INTEL_PMIC drivers/platform/x86/Kconfig: bool "Intel PMIC GPIO support" ...meaning that it currently is not being built as a module by anyone. Let's remove the couple of traces of modular infrastructure use, so that when reading the driver there is no doubt it is builtin-only. We delete the MODULE_LICENSE tag etc. since all that information was (or is now) contained at the top of the file in the comments. We don't replace module.h with init.h since the file already has that. Cc: Alek Du <alek.du@intel.com> Cc: platform-driver-x86@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
-
Paul Gortmaker authored
The Kconfig entry controlling compilation of this code is: arch/x86/Kconfig:config OLPC arch/x86/Kconfig: bool "One Laptop Per Child support" ...meaning that it currently is not being built as a module by anyone. Let's remove the couple of traces of modular infrastructure use, so that when reading the driver there is no doubt it is builtin-only. We delete the MODULE_LICENSE tag etc. since all that information was (or is now) contained at the top of the file in the comments. Cc: platform-driver-x86@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Andres Salomon <dilinger@queued.net> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
-
Owen Lin authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
Commit b70661c7 ("net: smc91x: use run-time configuration on all ARM machines") broke some ARM platforms through several mistakes. Firstly, the access size must correspond to the following rule: (a) at least one of 16-bit or 8-bit access size must be supported (b) 32-bit accesses are optional, and may be enabled in addition to the above. Secondly, it provides no emulation of 16-bit accesses, instead blindly making 16-bit accesses even when the platform specifies that only 8-bit is supported. Reorganise smc91x.h so we can make use of the existing 16-bit access emulation already provided - if 16-bit accesses are supported, use 16-bit accesses directly, otherwise if 8-bit accesses are supported, use the provided 16-bit access emulation. If neither, BUG(). This exactly reflects the driver behaviour prior to the commit being fixed. Since the conversion incorrectly cut down the available access sizes on several platforms, we also need to go through every platform and fix up the overly-restrictive access size: Arnd assumed that if a platform can perform 32-bit, 16-bit and 8-bit accesses, then only a 32-bit access size needed to be specified - not so, all available access sizes must be specified. This likely fixes some performance regressions in doing this: if a platform does not support 8-bit accesses, 8-bit accesses have been emulated by performing a 16-bit read-modify-write access. Tested on the Intel Assabet/Neponset platform, which supports only 8-bit accesses, which was broken by the original commit. Fixes: b70661c7 ("net: smc91x: use run-time configuration on all ARM machines") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
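The access-size rule and the fallback order described above can be sketched as follows. The capability bits, the `access16` enum, and the helper names are hypothetical and are not the smc91x.h definitions; the emulated read assumes the low byte sits at the lower offset, and `abort()` stands in for BUG().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical capability bits for the access widths a platform supports. */
#define IO_8BIT  (1u << 0)
#define IO_16BIT (1u << 1)
#define IO_32BIT (1u << 2)

enum access16 { ACCESS16_NATIVE, ACCESS16_EMULATED };

/* Rule (a)/(b): 8-bit or 16-bit must be present; 32-bit is an optional extra. */
int io_caps_valid(unsigned int caps)
{
    return (caps & (IO_8BIT | IO_16BIT)) != 0;
}

/* Prefer native 16-bit accesses, fall back to emulation over 8-bit. */
enum access16 pick_access16(unsigned int caps)
{
    if (caps & IO_16BIT)
        return ACCESS16_NATIVE;
    if (caps & IO_8BIT)
        return ACCESS16_EMULATED;
    abort(); /* stands in for BUG(): neither width is available */
}

/* Emulated 16-bit read built from two 8-bit reads (low byte first, assumed). */
uint16_t read16_emulated(const volatile uint8_t *regs, unsigned int off)
{
    assert(regs != NULL);
    uint16_t lo = regs[off];
    uint16_t hi = regs[off + 1];
    return (uint16_t)(lo | (hi << 8));
}
```

Under this rule, a platform that declares only 32-bit support is rejected by `io_caps_valid()`, which matches the point that every available access size has to be specified.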
-
Florian Fainelli authored
Since commit 83c0afae ("net: dsa: Add new binding implementation"), the shortcomings of the dsa platform device have been addressed, remove that TODO item. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Saeed Mahameed says: ==================== Mellanox 100G mlx5 fixes 2016-08-29 This series contains some bug fixes for the mlx5 core and mlx5 ethernet driver. From Saeed, Fix UMR to consider hardware translation table field size limitation when calculating the maximum number of MTTs required by the driver. Three patches to speed-up netdevice close time by serializing channel (SQs & RQs) destruction rather than issuing and waiting for hardware interrupts to free them. From Eran, Fix ethtool ring parameter reporting for striding RQ layout. Add error prints on ETS validation failure. From Kamal, Fix memory leak on error flow. From Maor, Fix ethtool steering priorities number. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maor Gottlieb authored
Ethtool has 11 flow tables, each flow table has its own priority. Increase the number of priorities to be aligned with the number of flow tables. Fixes: 1174fce8 ('net/mlx5e: Support l3/l4 flow type specs in ethtool flow steering') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eran Ben Elisha authored
Upon set ETS failure due to user invalid input, add error prints to specify the exact error to the user. Fixes: cdcf1121 ('net/mlx5e: Validate BW weight values of ETS') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kamal Heib authored
Free 'in' command object also when mlx5_core_modify_tir fails. Fixes: 724b2aa1 ("net/mlx5e: TIRs management refactoring") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Add a counter in ethtool for the number of times that TX xmit_more was used. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eran Ben Elisha authored
The driver RQ has two possible configurations: striding RQ and non-striding RQ. Until this patch, the driver always reported the number of hardware WQEs (ring descriptors). For the non-striding RQ configuration, this was OK since we have one WQE per pending packet. For striding RQ, multiple packets can fit into one WQE. For better user experience we normalize the rx_pending parameter (size of wqe/mtu) as the average ring size in case of striding RQ. Fixes: 461017cb ('net/mlx5e: Support RX multi-packet WQE ...') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
Instead of asking the firmware to flush the SQ (Send Queue) via asynchronous completions when moved to error, we handle SQ flush manually (mlx5e_free_tx_descs) same as we did when SQ flush got timed out or on tx_timeout. This will reduce SQs flush time and speedup interface down procedure. Moved mlx5e_free_tx_descs to the end of en_tx.c for tx critical code locality. Fixes: 29429f33 ('net/mlx5e: Timeout if SQ doesn't flush during close') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
ICO (Internal control operations) SQ (Send Queue) is closed/disabled after RQ (Receive Queue). After RQ is closed an ICO SQ completion might post a fragmented MPWQE (Multi Packet Work Queue Element) into that RQ. As on regular RQ post, check if we are allowed to post to that RQ (RQ is enabled). Cleanup in-progress UMR MPWQE on mlx5e_free_rx_descs if needed. Fixes: bc77b240 ('net/mlx5e: Add fragmented memory support for RX multi packet WQE') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
This will significantly reduce receive queue flush time on interface down. Instead of asking the firmware to flush the RQ (Receive Queue) via asynchronous completions when moved to error, we handle RQ flush manually (mlx5e_free_rx_descs) same as we did when RQ flush got timed out. This will reduce RQs flush time and speed up the interface down procedure (ifconfig down) from 6 sec to 0.3 sec on a 48-core system. Moved mlx5e_free_rx_descs to en_main.c where it is needed, to keep en_rx.c free from non-critical data path code for better code locality. Fixes: 6cd392a0 ('net/mlx5e: Handle RQ flush in error cases') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
ConnectX-4 UMR (User Memory Region) MTT translation table offset in WQE is limited to U16_MAX, before this patch we ignored that limitation and requested the maximum possible UMR translation length that the netdev might need (MAX channels * MAX pages per channel). In case of a system with #cores > 32 and when linear WQE allocation fails, falling back to using UMR WQEs will cause the RQ (Receive Queue) to get stuck. Here we limit UMR length to min(U16_MAX, max required pages) (while considering the required alignments) on driver load, by default U16_MAX is sufficient since the default RX rings value guarantees that we are in range, dynamically (on set_ringparam/set_channels) we will check if the new required UMR length (num mtts) is still in range, if not, fail the request. Fixes: bc77b240 ('net/mlx5e: Add fragmented memory support for RX multi packet WQE') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
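The two policies described above, clamping at driver load and rejecting on reconfiguration, can be sketched in isolation. The function names and the `-1` error value are hypothetical, and the sketch deliberately ignores the alignment requirements the commit message mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Field limit named in the text (U16_MAX). */
#define UMR_MTT_OFFSET_MAX UINT16_MAX

/* Driver load: clamp the request to what the 16-bit field can address. */
uint32_t umr_len_on_load(uint32_t required_mtts)
{
    uint32_t len = required_mtts < UMR_MTT_OFFSET_MAX
                       ? required_mtts
                       : UMR_MTT_OFFSET_MAX;

    assert(len <= UMR_MTT_OFFSET_MAX);
    return len;
}

/*
 * Runtime reconfiguration (ring or channel count change): do not clamp,
 * fail the request instead. Returns 0 when in range, -1 otherwise.
 */
int umr_check_new_len(uint32_t required_mtts)
{
    return required_mtts <= UMR_MTT_OFFSET_MAX ? 0 : -1;
}
```

Clamping is safe at load time because, per the message, the default ring sizes already fit; a user-requested change that exceeds the limit is refused rather than silently shrunk.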
-
Cyril Bur authored
Userspace can begin and suspend a transaction within the signal handler which means they might enter sys_rt_sigreturn() with the processor in suspended state. sys_rt_sigreturn() wants to restore process context (which may have been in a transaction before signal delivery). To do this it must restore TM SPRs. To achieve this, any transaction initiated within the signal frame must be discarded in order to be able to restore TM SPRs as TM SPRs can only be manipulated non-transactionally. From the PowerPC ISA: TM Bad Thing Exception [Category: Transactional Memory] An attempt is made to execute a mtspr targeting a TM register in other than Non-transactional state. Not doing so results in a TM Bad Thing: [12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable] [12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033) [12045.221540] Oops: Unrecoverable exception, sig: 6 [#1] [12045.221586] SMP NR_CPUS=2048 NUMA PowerNV [12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c [12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 #34 [12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000 [12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000 [12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700 Not tainted (4.7.0) [12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]> CR: 28444280 XER: 20000000 [12045.222625] CFAR: c0000000000163b8 SOFTE: 0 PACATMSCRATCH: 900000014280f033 GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000 GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000 GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0 [12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c [12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0 [12045.223630] Call Trace: [12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0 [12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108 [12045.223806] Instruction dump: [12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8 [12045.223955] 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020 [12045.224074] ---[ end trace cb8002ee240bae76 ]--- It isn't clear exactly if there is really a use case for userspace returning with a suspended transaction, however, doing so doesn't (on its own) constitute a bad frame. As such, this patch simply discards the transactional state of the context calling the sigreturn and continues. Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Acked-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Mukesh Ojha authored
When the Linux kernel is notified of a duplicate error log by OPAL, it has been observed that the kernel fails to remove the sysfs entries (/sys/firmware/opal/elog/0xXXXXXXXX) of such error logs. This is because we currently search for the error log/dump kobject in the kset list via the 'kset_find_obj()' routine, which increments the reference count by one once it finds the kobject. So, unless we decrement the reference count by one after the kobject is found, we are not able to release the kobject properly later.

This patch adds the 'kobject_put()' which was missing earlier.

Signed-off-by: Mukesh Ojha <mukesh02@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
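
The reference-count leak described above can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the OPAL elog code: `struct obj`, `obj_get()`, `obj_put()`, `set_find_obj()` and `is_duplicate()` are hypothetical stand-ins. The only property borrowed from the kernel is that the lookup, like kset_find_obj(), returns the found object with its reference count already incremented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kobject: a name plus a reference count. */
struct obj {
    const char *name;
    int refcount;
    int released;   /* set once the last reference has been dropped */
};

/* Take an extra reference on the object. */
static struct obj *obj_get(struct obj *o)
{
    if (o)
        o->refcount++;
    return o;
}

/* Drop a reference; the object is released when the count reaches zero. */
static void obj_put(struct obj *o)
{
    if (!o)
        return;
    assert(o->refcount > 0);
    if (--o->refcount == 0)
        o->released = 1;
}

/* Lookup by name; on success the caller owns one additional reference. */
static struct obj *set_find_obj(struct obj **set, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (set[i] && strcmp(set[i]->name, name) == 0)
            return obj_get(set[i]);
    return NULL;
}

/* Duplicate check that drops the lookup reference before returning. */
static int is_duplicate(struct obj **set, size_t n, const char *name)
{
    struct obj *found = set_find_obj(set, n, name);

    if (!found)
        return 0;
    obj_put(found);   /* without this, the count never returns to zero */
    return 1;
}
```

If `is_duplicate()` omitted the `obj_put()` call, each duplicate notification would leave one extra reference behind, and a later final `obj_put()` by the owner would not release the object.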
-
Nicholas Piggin authored
tabort_syscall runs with RI=1, so a nested recoverable machine check will load the paca into r13 and overwrite what we loaded it with, because exceptions returning to privileged mode do not restore r13.

Fixes: b4b56f9e (powerpc/tm: Abort syscalls in active transactions)
Cc: stable@vger.kernel.org
Signed-off-by: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 28 Aug, 2016 2 commits
-
-
Linus Torvalds authored
-
git://people.freedesktop.org/~airlied/linux
Linus Torvalds authored
Pull drm fixes from Dave Airlie:
"A bunch of fixes covering i915, amdgpu, one tegra and some core DRM ones. Nothing too strange at this point"

* tag 'drm-fixes-for-4.8-rc4' of git://people.freedesktop.org/~airlied/linux: (21 commits)
  drm/atomic: Don't potentially reset color_mgmt_changed on successive property updates.
  drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION
  drm/amdgpu: skip TV/CV in display parsing
  drm/amdgpu: avoid a possible array overflow
  drm/amdgpu: fix lru size grouping v2
  drm/tegra: dsi: Enhance runtime power management
  drm/i915: Fix botched merge that downgrades CSR versions.
  drm/i915/skl: Ensure pipes with changed wms get added to the state
  drm/i915/gen9: Only copy WM results for changed pipes to skl_hw
  drm/i915/skl: Add support for the SAGV, fix underrun hangs
  drm/i915/gen6+: Interpret mailbox error flags
  drm/i915: Reattach comment, complete type specification
  drm/i915: Unconditionally flush any chipset buffers before execbuf
  drm/i915/gen9: Drop invalid WARN() during data rate calculation
  drm/i915/gen9: Initialize intel_state->active_crtcs during WM sanitization (v2)
  drm: Reject page_flip for !DRIVER_MODESET
  drm/amdgpu: fix timeout value check in amd_sched_job_recovery
  drm/amdgpu: fix sdma_v2_4_ring_test_ib
  drm/amdgpu: fix amdgpu_move_blit on 32bit systems
  drm/radeon: fix radeon_move_blit on 32bit systems
  ...
-