[PATCH] sched: add local load metrics
From: Nick Piggin <piggin@cyberone.com.au> This patch removes the per runqueue array of NR_CPU arrays. Each time we want to check a remote CPU's load we check nr_running as well anyway, so introduce a cpu_load which is the load of the local runqueue and is kept updated in the timer tick. Put them in the same cacheline. This has additional benefits of having the cpu_load consistent across all CPUs and more up to date. It is sampled better too, being updated once per timer tick. This shouldn't make much difference in scheduling behaviour, but all benchmarks are either as good or better on the 16-way NUMAQ: hackbench, reaim, volanomark are about the same, tbench and dbench are maybe a bit better. kernbench is about one percent better. John reckons it isn't a big deal, but it does save 4K per CPU or 2MB total on his big systems, so I figure it must be a bit kinder on the caches. I think it is just nicer in general anyway.
Showing
Please register or sign in to comment