- 27 Jan, 2021 2 commits
-
-
Valentin Schneider authored
The deduplicating sort in sched_init_numa() assumes that the first line in the distance table contains all unique values in the entire table. I've been trying to pin down what this exactly means for the topology, but it's not straightforward. For instance, topology.c uses this example:

  node   0   1   2   3
    0:  10  20  20  30
    1:  20  10  20  20
    2:  20  20  10  20
    3:  30  20  20  10

  0 ----- 1
  |     / |
  |    /  |
  |   /   |
  2 ----- 3

which works out just fine. However, if we swap nodes 0 and 1:

  1 ----- 0
  |     / |
  |    /  |
  |   /   |
  2 ----- 3

we get this distance table:

  node   0   1   2   3
    0:  10  20  20  20
    1:  20  10  20  30
    2:  20  20  10  20
    3:  20  30  20  10

which breaks the deduplicating sort (the first line is no longer representative). In this case it would just be a renumbering exercise, but it so happens that we can have a deduplicating sort that goes through the whole table in O(n²), at the extra cost of a temporary memory allocation (i.e. some form of set).

The ACPI spec (SLIT) says distances are encoded on 8 bits. Following this, implement the set as a 256-bit bitmap. Should this not be satisfactory (i.e. we want to support 32-bit values), we'll have to go for some other sparse set implementation.

This has the added benefit of letting us allocate just the right amount of memory for sched_domains_numa_distance[], rather than an arbitrary (nr_node_ids + 1).

Note: the DT binding equivalent (distance-map) decodes distances as 32-bit values.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210122123943.1217-2-valentin.schneider@arm.com
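A minimal sketch of the O(n²) whole-table deduplication with a 256-bit bitmap set; the helper name and structure are illustrative, not the actual sched_init_numa() code:

  #include <linux/bitmap.h>
  #include <linux/nodemask.h>
  #include <linux/topology.h>

  /* Count distinct 8-bit SLIT distances across the whole table. */
  static int numa_count_unique_distances(void)
  {
          DECLARE_BITMAP(seen, 256);      /* one bit per possible distance value */
          int i, j;

          bitmap_zero(seen, 256);

          /* O(n^2): visit every (i, j) pair, not just the first row */
          for_each_online_node(i)
                  for_each_online_node(j)
                          __set_bit(node_distance(i, j), seen);

          /* the result sizes sched_domains_numa_distance[] exactly */
          return bitmap_weight(seen, 256);
  }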
-
Qais Yousef authored
If the task is pinned to a cpu, setting the misfit status means that we'll unnecessarily and continuously attempt to migrate the task but fail. This continuous failure will cause the balance_interval to increase to a high value, and eventually cause unnecessary significant delays in balancing the system when a real imbalance happens.

Caught while testing uclamp, where the rt-app calibration loop was pinned to cpu 0; shortly after that we spawn another task with a high util_clamp value. The task was failing to migrate after over 40ms of runtime because the balance_interval had been unnecessarily expanded to a very high value by the calibration loop.

Not done here, but it could be useful to extend the check for pinning to verify that the affinity of the task has a cpu that fits. We could end up in a similar situation otherwise.

Fixes: 3b1baa64 ("sched/fair: Add 'group_misfit_task' load-balance type")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Acked-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210119120755.2425264-1-qais.yousef@arm.com
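A simplified illustration of the idea, assuming the usual fair.c helpers (task_fits_capacity(), capacity_of(), task_h_load()); the helper name is hypothetical and this is not the exact committed hunk:

  static void maybe_mark_misfit(struct rq *rq, struct task_struct *p)
  {
          /* A task pinned to a single CPU can never be migrated elsewhere,
           * so flagging it as misfit only inflates balance_interval. */
          if (!p || p->nr_cpus_allowed == 1) {
                  rq->misfit_task_load = 0;
                  return;
          }

          if (task_fits_capacity(p, capacity_of(cpu_of(rq))))
                  rq->misfit_task_load = 0;
          else
                  rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
  }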
-
- 14 Jan, 2021 10 commits
-
-
Hui Su authored
Use the task_current() function where appropriate. No functional change. Signed-off-by: Hui Su <sh_def@163.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20201030173223.GA52339@rlk
-
Vincent Guittot authored
Active balance is triggered for a number of voluntary cases, like the misfit or pinned-tasks cases, but also after a number of load balance attempts have failed to migrate a task. There is no need to use active load balance when the group is overloaded, because an overloaded state means that there is at least one waiting task. Nevertheless, the waiting task is not selected and detached until the threshold becomes higher than its load. This threshold increases with the number of failed load balances (see the condition

  if ((load >> env->sd->nr_balance_failed) > env->imbalance)

in detach_tasks()), and the waiting task will eventually be selected after a number of attempts.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20210107103325.30851-4-vincent.guittot@linaro.org
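As an illustration with made-up numbers (load = 800, imbalance = 300), the quoted check only lets the waiting task through once enough attempts have failed:

  /* load = 800, imbalance = 300 (arbitrary example values)         */
  /* nr_balance_failed = 0:  800 >> 0 = 800 >  300 -> task skipped  */
  /* nr_balance_failed = 1:  800 >> 1 = 400 >  300 -> task skipped  */
  /* nr_balance_failed = 2:  800 >> 2 = 200 <= 300 -> task detached */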
-
Vincent Guittot authored
Setting LBF_ALL_PINNED during active load balance is only valid when there is only 1 running task on the rq; otherwise this ends up increasing the balance interval whereas other tasks could migrate after the next interval once they become cache-cold, as an example.

The LBF_ALL_PINNED flag is now always set by default. It is then cleared when we find one task that can be pulled while calling detach_tasks(), or during active migration.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20210107103325.30851-3-vincent.guittot@linaro.org
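A sketch of the new flow (simplified, not the exact hunks):

  /* in load_balance(): assume everything is pinned until proven otherwise */
  env.flags |= LBF_ALL_PINNED;

  /* in can_migrate_task(), called from detach_tasks(): */
  if (cpumask_test_cpu(env->dst_cpu, p->cpus_ptr))
          env->flags &= ~LBF_ALL_PINNED;  /* found a pullable task */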
-
Vincent Guittot authored
Don't waste time checking whether an idle cfs_rq could be the busiest queue. Furthermore, in the migrate_load case this could end up selecting a cfs_rq that has a high load but is actually idle.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20210107103325.30851-2-vincent.guittot@linaro.org
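A sketch of the skip, assuming the usual find_busiest_queue() loop shape (not the verbatim patch):

  for_each_cpu_and(i, sched_group_span(group), env->cpus) {
          struct rq *rq = cpu_rq(i);

          /* An idle cfs_rq has nothing to pull; its load/util signals may
           * still be high while decaying, so skip it up front. */
          if (rq->cfs.h_nr_running == 0)
                  continue;

          /* ... existing busiest-queue evaluation ... */
  }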
-
Xuewen Yan authored
CPU (root cfs_rq) estimated utilization (util_est) is currently used in dequeue_task_fair() to drive frequency selection before it is updated.

With:

  CPU_util        : rq->cfs.avg.util_avg
  CPU_util_est    : rq->cfs.avg.util_est
  CPU_utilization : max(CPU_util, CPU_util_est)
  task_util       : p->se.avg.util_avg
  task_util_est   : p->se.avg.util_est

dequeue_task_fair():

  /* (1) CPU_util and task_util update + inform schedutil about
         CPU_utilization changes */
  for_each_sched_entity() /* 2 loops */
    (dequeue_entity() ->) update_load_avg() -> cfs_rq_util_change()
     -> cpufreq_update_util() -> ... -> sugov_update_[shared|single]
     -> sugov_get_util() -> cpu_util_cfs()

  /* (2) CPU_util_est and task_util_est update */
  util_est_dequeue()

cpu_util_cfs() uses CPU_utilization, which could lead to a false (too high) utilization value for schedutil in task ramp-down or ramp-up scenarios during task dequeue.

To mitigate the issue, split the util_est update (2) into:

  (A) CPU_util_est update in util_est_dequeue()
  (B) task_util_est update in util_est_update()

Place (A) before (1) and keep (B) where (2) is. The latter is necessary since (B) relies on the task_util update in (1).

Fixes: 7f65ea42 ("sched/fair: Add util_est on top of PELT")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/1608283672-18240-1-git-send-email-xuewen.yan94@gmail.com
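A simplified sketch of the reordered dequeue path; function names are those used in fair.c, but the entity-loop details and other bookkeeping are elided:

  static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
  {
          struct sched_entity *se = &p->se;
          bool task_sleep = flags & DEQUEUE_SLEEP;

          /* (A) remove the task's contribution from CPU_util_est first,
           *     so the cpufreq update below sees the post-dequeue value */
          util_est_dequeue(&rq->cfs, p);

          /* (1) updates CPU_util/task_util and ends up calling
           *     cpufreq_update_util() -> schedutil */
          for_each_sched_entity(se)
                  dequeue_entity(cfs_rq_of(se), se, flags);

          /* (B) task_util_est update stays last: it needs the fresh
           *     task_util from (1) */
          util_est_update(&rq->cfs, p, task_sleep);
  }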
-
Peter Zijlstra authored
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Morten Rasmussen <morten.rasmussen@arm.com> Link: https://lkml.kernel.org/r/20201218103258.GA3040@hirez.programming.kicks-ass.net
-
Anna-Maria Behnsen authored
SCHED_SOFTIRQ is raised to trigger periodic load balancing. When a CPU is not active, it should not participate in load balancing.

The scheduler uses nohz.idle_cpus_mask to keep track of the CPUs which can do idle load balancing. When bringing a CPU up, it is added to the mask once it reaches the active state, but on teardown the CPU stays in the mask until it goes offline and invokes sched_cpu_dying().

When SCHED_SOFTIRQ is raised on a !active CPU, there might be a pending softirq when stopping the tick, which triggers a warning in the NOHZ code. SCHED_SOFTIRQ can also be raised by the scheduler tick, which has the same issue.

Therefore remove the CPU from nohz.idle_cpus_mask when it is marked inactive and also prevent scheduler_tick() from raising SCHED_SOFTIRQ after this point.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20201215104400.9435-1-anna-maria@linutronix.de
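A sketch of the two-part change; the placement of nohz_balance_exit_idle() and the active check follows the description above rather than the literal patch:

  int sched_cpu_deactivate(unsigned int cpu)
  {
          struct rq *rq = cpu_rq(cpu);

          /* ... existing deactivation steps ... */

          /* drop the CPU from nohz.idle_cpus_mask as soon as it is !active */
          nohz_balance_exit_idle(rq);

          return 0;
  }

  void trigger_load_balance(struct rq *rq)
  {
          /* don't raise SCHED_SOFTIRQ on a CPU that may not process it */
          if (unlikely(!cpu_active(cpu_of(rq))))
                  return;

          /* ... raise SCHED_SOFTIRQ / kick nohz idle balancing ... */
  }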
-
Viresh Kumar authored
Several parts of the kernel are already using the effective CPU utilization (as seen by the scheduler) to get the current load on the CPU. Do the same here, instead of depending on the idle time of the CPU, which isn't that accurate comparatively.

This is also the right thing to do as it makes the cpufreq governor (schedutil) align better with the cpufreq_cooling driver: the power requested by the cpufreq_cooling governor will exactly match the next frequency requested by the schedutil governor, since they are both using the same metric to calculate load.

This was tested on the ARM Hikey6220 platform with hackbench, sysbench and schbench. None of them showed any regression or significant improvement. Schbench is the most important one of these, as it creates the scenario where the utilization numbers provide a better estimate of the future.

Scenario 1: The CPUs were mostly idle in the previous polling window of the IPA governor as the tasks were sleeping. Here are the details from traces (load is in %):

  Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
       total_load=203 load={{0x35,0x1,0x0,0x31,0x0,0x0,0x64,0x0}} dynamic_power=1339
  New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
       total_load=600 load={{0x60,0x46,0x45,0x45,0x48,0x3b,0x61,0x44}} dynamic_power=3960

Here, the "Old" line gives the load and requested power (dynamic_power here) calculated using the idle-time based implementation, while "New" is based on the CPU utilization from the scheduler. As can be clearly seen, the load and requested power numbers are simply incorrect in the idle-time based approach, and the numbers collected from the CPU's utilization are much closer to reality.

Scenario 2: The CPUs were busy in the previous polling window of the IPA governor:

  Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
       total_load=800 load={{0x64,0x64,0x64,0x64,0x64,0x64,0x64,0x64}} dynamic_power=5280
  New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000
       total_load=708 load={{0x4d,0x5c,0x5c,0x5b,0x5c,0x5c,0x51,0x5b}} dynamic_power=4672

As can be seen, the idle-time based load is 100% for all the CPUs as it took only the last window into account, but in reality the CPUs aren't that loaded, as shown by the utilization numbers.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lkml.kernel.org/r/9c255c83d78d58451abc06848001faef94c87a12.1607400596.git.viresh.kumar@linaro.org
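For illustration, a hypothetical helper (not the driver code) converting a scheduler utilization value into the 0-100 "load" figure that the IPA governor consumes; 0x64 in the traces above is exactly this 100% cap:

  static unsigned int util_to_load_pct(unsigned long util, unsigned long capacity)
  {
          unsigned long pct;

          if (!capacity)
                  return 0;

          pct = (100 * util) / capacity;
          return pct > 100 ? 100 : pct;   /* 0x64 == 100, as seen in the traces */
  }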
-
Viresh Kumar authored
There is nothing schedutil specific in schedutil_cpu_util(), rename it to effective_cpu_util(). Also create and expose another wrapper sched_cpu_util() which can be used by other parts of the kernel, like thermal core (that will be done in a later commit). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/db011961fb3bb8bef1c0eda5cd64564637d3ef31.1607400596.git.viresh.kumar@linaro.org
-
Viresh Kumar authored
There is nothing schedutil specific in schedutil_cpu_util(), move it to core.c and define it only for CONFIG_SMP. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/c921a362c78e1324f8ebc5aaa12f53e309c5a8a2.1607400596.git.viresh.kumar@linaro.org
-
- 03 Jan, 2021 1 commit
-
-
Linus Torvalds authored
-
- 02 Jan, 2021 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds authored
Pull s390 cleanups from Vasily Gorbik:
 "Update defconfigs and sort config select list"

* tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/Kconfig: sort config S390 select list once again
  s390: update defconfigs
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds authored
Pull power management fixes from Rafael Wysocki:
 "These fix a crash in intel_pstate during resume from suspend-to-RAM that may occur after recent changes and two resource leaks in error paths in the operating performance points (OPP) framework, add a new C-states table to intel_idle and update the cpuidle MAINTAINERS entry to cover the governors too.

  Specifics:

   - Fix recently introduced crash in the intel_pstate driver that occurs if scale-invariance is disabled during resume from suspend-to-RAM due to inconsistent changes of APERF or MPERF MSR values made by the platform firmware (Rafael Wysocki).

   - Fix a memory leak and add a missing clk_put() in error paths in the OPP framework (Quanyang Wang, Viresh Kumar).

   - Add new C-states table for SnowRidge processors to the intel_idle driver (Artem Bityutskiy).

   - Update the MAINTAINERS entry for cpuidle to make it clear that the governors are covered by it too (Lukas Bulwahn)"

* tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_idle: add SnowRidge C-state table
  cpufreq: intel_pstate: Fix fast-switch fallback path
  opp: Call the missing clk_put() on error
  opp: fix memory leak in _allocate_opp_table
  MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
-
Rafael J. Wysocki authored
* pm-cpufreq:
  cpufreq: intel_pstate: Fix fast-switch fallback path

* pm-cpuidle:
  intel_idle: add SnowRidge C-state table
  MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
-
- 01 Jan, 2021 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds authored
Pull SCSI fixes from James Bottomley:
 "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).

  The two big core fixes are for power management ("block: Do not accept any requests while suspended" and "block: Fix a race in the runtime power management code") which finally sort out the resume problems we've occasionally been having.

  To make the resume fix, there are seven necessary precursors which effectively rename REQ_PREEMPT to REQ_PM, so every "special" request in block is automatically a power management exempt one.

  All of the non-PM preempt cases are removed except for the one in the SCSI Parallel Interface (spi) domain validation, which is a genuine case where we have to run requests at high priority to validate the bus, so this becomes an autopm get/put protected request"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits)
  scsi: cxgb4i: Fix TLS dependency
  scsi: ufs: Un-inline ufshcd_vops_device_reset function
  scsi: ufs: Re-enable WriteBooster after device reset
  scsi: ufs-mediatek: Use correct path to fix compile error
  scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
  scsi: block: Do not accept any requests while suspended
  scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
  scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
  scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
  scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
  scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
  scsi: block: Introduce BLK_MQ_REQ_PM
  scsi: block: Fix a race in the runtime power management code
  scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
  scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
  scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
  scsi: ufs-pci: Fix restore from S4 for Intel controllers
  scsi: ufs-mediatek: Keep VCC always-on for specific devices
  scsi: ufs: Allow regulators being always-on
  scsi: ufs: Clear UAC for RPMB after ufshcd resets
  ...
-
git://git.kernel.dk/linux-block
Linus Torvalds authored
Pull block fixes from Jens Axboe:
 "Two minor block fixes from this last week that should go into 5.11:

   - Add missing NOWAIT debugfs definition (Andres)

   - Fix kerneldoc warning introduced this merge window (Randy)"

* tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
  block: add debugfs stanza for QUEUE_FLAG_NOWAIT
  fs: block_dev.c: fix kernel-doc warnings from struct block_device changes
-
git://git.kernel.dk/linux-block
Linus Torvalds authored
Pull io_uring fixes from Jens Axboe:
 "A few fixes that should go into 5.11, all marked for stable as well:

   - Fix issue around identity COW'ing and users that share a ring across processes

   - Fix a hang associated with unregistering fixed files (Pavel)

   - Move the 'process is exiting' cancelation a bit earlier, so task_works aren't affected by it (Pavel)"

* tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
  kernel/io_uring: cancel io_uring before task works
  io_uring: fix io_sqe_files_unregister() hangs
  io_uring: add a helper for setting a ref node
  io_uring: don't assume mm is constant across submits
-
Linus Torvalds authored
Commit 436e980e ("kbuild: don't hardcode depmod path") stopped hard-coding the path of depmod, but in the process caused trouble for distributions that had that /sbin location, but didn't have it in the PATH (generally because /sbin is limited to the super-user path). Work around it for now by just adding /sbin to the end of PATH in the depmod.sh script. Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 31 Dec, 2020 3 commits
-
-
Pavel Begunkov authored
Cancelling io_uring requests needs either to be able to run the currently enqueued task_works, or to have the task_work infrastructure shut down by that moment. Otherwise io_uring_cancel_files() may be waiting for requests that won't ever complete.

Go with the first way and do cancellations before setting PF_EXITING, and so before putting the task_work infrastructure into a transition state where task_work_run() would better not be called.

Cc: stable@vger.kernel.org # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_sqe_files_unregister() uninterruptibly waits for enqueued ref nodes, but the requests keeping them alive may never complete, e.g. because of some userspace dependency. Make sure the wait is interruptible, otherwise it would hang forever.

Cc: stable@vger.kernel.org # 5.6+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Setting a new reference node for the file data is not trivial; don't repeat it, add and use a helper.

Cc: stable@vger.kernel.org # 5.6+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 30 Dec, 2020 6 commits
-
-
git://github.com/ceph/ceph-client
Linus Torvalds authored
Pull ceph fixes from Ilya Dryomov:
 "A fix for an edge case in MClientRequest encoding and a couple of trivial fixups for the new msgr2 support"

* tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client:
  libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE
  libceph: align session_key and con_secret to 16 bytes
  libceph: fix auth_signature buffer allocation in secure mode
  ceph: reencode gid_list when reconnecting
-
Artem Bityutskiy authored
Add a C-state table for the SnowRidge SoC, which is found on Intel Jacobsville platforms.

The following has been changed:

 1. C1E latency changed from 10us to 15us. It was measured using the open source "wult" tool (the "nic" method, 15us is the 99.99th percentile).

 2. C1E power break-even changed from 20us to 25us, which may result in less C1E residency in some workloads.

 3. C6 latency changed from 50us to 130us. Measured the same way as C1E.

The C6 C-state is supported only by some SnowRidge revisions, so add a C-state table commentary about this.

On SnowRidge, C6 support is enumerated via the usual mechanism: the "mwait" leaf of the "cpuid" instruction. The 'intel_idle' driver does check this leaf, so even though C6 is present in the table, the driver will only use it if the CPU does support it.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Rafael J. Wysocki authored
When sugov_update_single_perf() falls back to the "frequency" path due to the missing scale-invariance, it will call cpufreq_driver_fast_switch() via sugov_fast_switch() and the driver's ->fast_switch() callback will be invoked, so it must not be NULL. However, after commit a365ab6b ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") intel_pstate sets ->fast_switch() to NULL when it is going to use intel_cpufreq_adjust_perf(), which is a mistake, because on x86 the scale-invariance may be turned off dynamically, so modify it to retain the original ->adjust_perf() callback pointer. Fixes: a365ab6b ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") Reported-by: Kenneth R. Crudup <kenny@panix.com> Tested-by: Kenneth R. Crudup <kenny@panix.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Rafael J. Wysocki authored
Pull operating performance points (OPP) framework fixes for 5.11-rc2 from Viresh Kumar:
 "This contains two patches to fix freeing of resources in error paths."

* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  opp: Call the missing clk_put() on error
  opp: fix memory leak in _allocate_opp_table
-
Heiko Carstens authored
...and add comments at the top and bottom. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Heiko Carstens authored
Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 29 Dec, 2020 11 commits
-
-
Andres Freund authored
This was missed in 021a2446. It leads to the numeric value of QUEUE_FLAG_NOWAIT (i.e. 29) showing up in /sys/kernel/debug/block/*/state.

Fixes: 021a2446
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Randy Dunlap authored
Fix new kernel-doc warnings in fs/block_dev.c:

  ../fs/block_dev.c:1066: warning: Excess function parameter 'whole' description in 'bd_abort_claiming'
  ../fs/block_dev.c:1837: warning: Function parameter or member 'dev' not described in 'lookup_bdev'

Fixes: 4e7b5671 ("block: remove i_bdev")
Fixes: 37c3fc9a ("block: simplify the block device claiming interface")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-fsdevel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton:
 "16 patches.

  Subsystems affected by this patch series: mm (selftests, hugetlb, pagecache, mremap, kasan, and slub), kbuild, checkpatch, misc, and lib"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: slub: call account_slab_page() after slab page initialization
  zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c
  lib/zlib: fix inflating zlib streams on s390
  lib/genalloc: fix the overflow when size is too big
  kdev_t: always inline major/minor helper functions
  sizes.h: add SZ_8G/SZ_16G/SZ_32G macros
  local64.h: make <asm/local64.h> mandatory
  kasan: fix null pointer dereference in kasan_record_aux_stack
  mm: generalise COW SMC TLB flushing race comment
  mm/mremap.c: fix extent calculation
  mm: memmap defer init doesn't work as expected
  mm: add prototype for __add_to_page_cache_locked()
  checkpatch: prefer strscpy to strlcpy
  Revert "kbuild: avoid static_assert for genksyms"
  mm/hugetlb: fix deadlock in hugetlb_cow error path
  selftests/vm: fix building protection keys test
-
Roman Gushchin authored
It's convenient to have page->objects initialized before calling into account_slab_page(). In particular, this information can be used to pre-alloc the obj_cgroup vector.

Let's call account_slab_page() a bit later, after the initialization of page->objects.

This commit doesn't bring any functional change, but is required for further optimizations.

[akpm@linux-foundation.org: undo changes needed by forthcoming mm-memcg-slab-pre-allocate-obj_cgroups-for-slab-caches-with-slab_account.patch]

Link: https://lkml.kernel.org/r/20201110195753.530157-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Randy Dunlap authored
In commit 11fb479f ("zlib: export S390 symbols for zlib modules"), I added EXPORT_SYMBOL()s to dfltcc_inflate.c but then Mikhail said that these should probably be in dfltcc_syms.c with the other EXPORT_SYMBOL()s. However, that is contrary to the current kernel style, which places EXPORT_SYMBOL() immediately after the function that it applies to, so move all EXPORT_SYMBOL()s to their respective function locations and drop the dfltcc_syms.c file. Also move MODULE_LICENSE() from the deleted file to dfltcc.c. [rdunlap@infradead.org: remove dfltcc_syms.o from Makefile] Link: https://lkml.kernel.org/r/20201227171837.15492-1-rdunlap@infradead.org Link: https://lkml.kernel.org/r/20201219052530.28461-1-rdunlap@infradead.org Fixes: 11fb479f ("zlib: export S390 symbols for zlib modules") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Zaslonko Mikhail <zaslonko@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ilya Leoshkevich authored
Decompressing zlib streams on s390 fails with "incorrect data check" error. Userspace zlib checks inflate_state.flags in order to byteswap checksums only for zlib streams, and s390 hardware inflate code, which was ported from there, tries to match this behavior. At the same time, kernel zlib does not use inflate_state.flags, so it contains essentially random values. For many use cases either zlib stream is zeroed out or checksum is not used, so this problem is masked, but at least SquashFS is still affected. Fix by always passing a checksum to and from the hardware as is, which matches zlib_inflate()'s expectations. Link: https://lkml.kernel.org/r/20201215155551.894884-1-iii@linux.ibm.com Fixes: 12619610 ("lib/zlib: add s390 hardware support for kernel zlib_inflate") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Huang Shijie authored
Some graphic cards have very big memory on chip, such as 32G bytes. In the following case, an overflow occurs:

  pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
  ret = gen_pool_add(pool, 0x1000000, SZ_32G, NUMA_NO_NODE);
  va = gen_pool_alloc(pool, SZ_4G);

The overflow happens in gen_pool_alloc_algo_owner():

  ....
  size = nbits << order;
  ....

nbits is of "int" type, so it overflows. Then gen_pool_avail() will return the wrong value.

This patch converts some "int" variables to "unsigned long", and changes the comparison code in the while loop.

Link: https://lkml.kernel.org/r/20201229060657.3389-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Reported-by: Shi Jiasheng <jiasheng.shi@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
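A small userspace demo (illustrative only, not kernel code) of why the shift overflows for a 4 GiB request with a 4 KiB granule:

  #include <stdio.h>

  int main(void)
  {
          int order = 12;                         /* PAGE_SHIFT-sized granule */
          int nbits = (1ULL << 32) >> order;      /* bits for 4 GiB: 1048576, fits in int */

          /* int arithmetic: 1048576 << 12 doesn't fit in 32 bits
           * (signed overflow, undefined in ISO C; typically wraps) */
          int bad = nbits << order;
          /* 64-bit unsigned long arithmetic gives the correct 4294967296 */
          unsigned long good = (unsigned long)nbits << order;

          printf("int: %d, unsigned long: %lu\n", bad, good);
          return 0;
  }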
-
Josh Poimboeuf authored
Silly GCC doesn't always inline these trivial functions. Fixes the following warning:

  arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled

Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> [build-tested]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
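The pattern of the change, illustrated on a made-up helper (the real hunks touch the MAJOR()/MINOR()/new_encode_dev() family in include/linux/kdev_t.h):

  /* __always_inline forces inlining even when GCC's heuristics decline,
   * which keeps objtool's UACCESS region checks happy. */
  static __always_inline unsigned int encode_dev_example(unsigned int major,
                                                         unsigned int minor)
  {
          return (minor & 0xff) | (major << 8) | ((minor & ~0xff) << 12);
  }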
-
Huang Shijie authored
Add these macros, since we can use them in drivers.

Link: https://lkml.kernel.org/r/20201229072819.11183-1-sjhuang@iluvatar.ai
Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
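The values these macros should expand to follow directly from SZ_1G = 0x40000000 (shown here with plain ULL suffixes for illustration; include/linux/sizes.h may wrap them with _AC()):

  #define SZ_8G   0x200000000ULL
  #define SZ_16G  0x400000000ULL
  #define SZ_32G  0x800000000ULL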
-
Randy Dunlap authored
Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and remove all arch/*/include/asm/local64.h arch-specific files, since they only #include <asm-generic/local64.h>.

This fixes build errors on arch/c6x/ and arch/nios2/ for block/blk-iocost.c.

Build-tested on 21 of 25 arches (tools problems on the others).

Yes, we could even rename <asm-generic/local64.h> to <linux/local64.h> and change all #includes to use <linux/local64.h> instead.

Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Walter Wu authored
Syzbot reported the following [1]:

  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 2d993067 P4D 2d993067 PUD 19a3c067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP KASAN
  CPU: 1 PID: 3852 Comm: kworker/1:2 Not tainted 5.10.0-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Workqueue: events free_ipc
  RIP: 0010:kasan_record_aux_stack+0x77/0xb0

Add NULL checking of the slab object returned by kasan_get_alloc_meta() in order to avoid the NULL pointer dereference.

[1] https://syzkaller.appspot.com/x/log.txt?x=10a82a50d00000

Link: https://lkml.kernel.org/r/20201228080018.23041-1-walter-zh.wu@mediatek.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
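A sketch of the guard; the surrounding slab lookup is simplified behind a hypothetical helper, and only the kasan_get_alloc_meta() NULL check is the point being illustrated:

  void kasan_record_aux_stack(void *addr)
  {
          struct kmem_cache *cache;
          struct kasan_alloc_meta *alloc_meta;
          void *object;

          /* hypothetical helper: resolve addr to its slab cache and object */
          if (!lookup_slab_object(addr, &cache, &object))
                  return;

          alloc_meta = kasan_get_alloc_meta(cache, object);
          if (!alloc_meta)        /* the added NULL check: no metadata, nothing to record */
                  return;

          alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0];
          alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT);
  }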
-