- 05 Jan, 2017 6 commits
-
-
Chris Wilson authored
Missed when rebasing patches, I failed to set ret to zero before starting the unbind loop (which depends upon ret being zero). Reported-by: Matthew Auld <matthew.william.auld@gmail.com> Fixes: 9332f3b1 ("drm/i915: Combine loops within i915_gem_evict_something") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105155940.10033-1-chris@chris-wilson.co.ukReviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v4.9+
-
Chris Wilson authored
The GuC uses a special mapping for the upper end of the Global GTT, similar to the way it uses a special mapping for the lower end, so exclude it from our drm_mm to prevent us using it. v2: Rename to reflect that it is unmappable similar to the region at the bottom of the GGTT, and couple it into the assertion that we don't feed unmappable addresses to the GuC. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-5-chris@chris-wilson.co.uk
-
Chris Wilson authored
In order to defeat some circular dependencies between headers to allow use of e.g. range_overflows() in a header, move the simple independent macros into their own header. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-4-chris@chris-wilson.co.uk
-
Chris Wilson authored
Empirically we restart following a GPU reset more successfully if we call lrc_init_hws() (which contains a posting read) last. (The failure mode that was observed was that breadcrumb writes into the HWS from the recovered requests went astray leading to the context-switch maintaining forward progress, but the requests not being retired/completed.) For clarity, lrc_init_hws() is inlined (and the unused function then removed). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-3-chris@chris-wilson.co.uk
-
Chris Wilson authored
In order to convince static analyzers that the allocation function returns an error or sets ce->state, assert that it is set afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-2-chris@chris-wilson.co.uk
-
Chris Wilson authored
During i915_gem_timeline_fini(), assert that all the timeline's request are completed and removed from the timeline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-1-chris@chris-wilson.co.uk
-
- 04 Jan, 2017 8 commits
-
-
Chris Wilson authored
The fence registers are clobbered by a GPU reset. If there is concurrent user access to a fenced region via a GTT mmaping, the access will not be fenced during the reset (until we restore the fences afterwards). In order to prevent invalid access during the reset, before we clobber the fences first we must invalidate the GTT mmapings. Access to the mmap will then be forced to fault in the page, and in handling the fault, i915_gem_fault() will take the struct_mutex and wait upon the reset to complete. v2: Fix up commentary. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99274 Testcase: igt/gem_mmap_gtt/hang Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170104145110.1486-1-chris@chris-wilson.co.ukReviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
-
Paulo Zanoni authored
Gen9+ platforms have been seeing a lot of screen flickerings and underruns, so I never felt comfortable in enabling FBC on these platforms since I didn't want to throw yet another feature on top of the already complex problem. We now have code that automatically disables FBC if we ever get an underrun, and the screen flickerings seem to be mostly gone, so it may be a good time to try to finally enable FBC by default on the newer platforms. Besides, BDW FBC has been working fine over the year, which gives me a little more confidence now. For a little more information, please refer to commit a98ee793 ("drm/i915/fbc: enable FBC by default on HSW and BDW"). v2: Enable not only on SKL, but for everything new (Daniel). v3: Rebase after the intel_sanitize_fbc_option() change. v4: New rebase after 8 months, drop expired R-B tags. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1482495839-27041-1-git-send-email-paulo.r.zanoni@intel.com
-
Paulo Zanoni authored
Back in 2014, commit fb7023e0 ("drm/i915: BDW: Adding Reserved PCI IDs.") added the reserved PCI IDs in order to try to make sure we had working drivers in case we ever released products using these IDs (since we had instances of this type of problem in the past). The problem is that the patch only touched the macros used by early-quirks.c and by the user space components that rely on i915_pciids.h, it didn't touch the macros used by i915_pci.c. So we correctly handled the stolen memory for these theoretical IDs, but we didn't actually drive the devices from i915.ko. So this patch fixes the original commit by actually making i915.ko drive these IDs, which was the goal. There's no information on what would be the GT count on these IDs, so we just go with the safer intel_broadwell_info, at the risk of ignoring a possibly inexistent BSD2_RING. I did some checking, and it seems that these IDs are driven by intel-gpu-tools, xf86-video-intel and libdrm (since they contain old copies of i915_pciids.h), but they are not checked by mesa. The alternative to this patch would be to just assume we're actually never going to use these IDs, and then remove them from our ID lists and make sure our user space components sync the latest i915_pciids.h copy. I'm fine with either approaches, as long as we make sure that every component tries to drive the same list of PCI IDs. Fixes: fb7023e0 ("drm/i915: BDW: Adding Reserved PCI IDs.") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1483473860-17644-3-git-send-email-paulo.r.zanoni@intel.com
-
Paulo Zanoni authored
Commit 8d9c20e1 ("drm/i915: Remove .is_mobile field from platform struct") removed mobile vs desktop differences for HSW+, but forgot the Broadwell reserved IDs, so do it now. It's interesting to notice that these IDs are used by early-quirks.c but are *not* used by i915_pci.c. Cc: Carlos Santa <carlos.santa@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1483473860-17644-2-git-send-email-paulo.r.zanoni@intel.com
-
Paulo Zanoni authored
Remove duplicated IDs from the list. Currently, this definition is only used by early-quirks.c. From my understanding of the code, having duplicated IDs shouldn't be causing any bugs. Fixes: 8d9c20e1 ("drm/i915: Remove .is_mobile field from platform struct") Cc: Carlos Santa <carlos.santa@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1483473860-17644-1-git-send-email-paulo.r.zanoni@intel.com
-
Daniel Vetter authored
Merge tag 'drm-misc-next-2016-12-30' of git://anongit.freedesktop.org/git/drm-misc into drm-intel-next-queued Directly merge drm-misc into drm-intel since Dave is on vacation and we need the various drm-misc patches (fb format rework, drm mm fixes, selftest framework and others). Also pulled back -rc2 in first to resync with drm-intel-fixes and make sure I can reuse the exact rerere solutions from drm-tip for safety, and because I'm lazy. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
-
Daniel Vetter authored
Backmerge Linux 4.10-rc2 to resync with our -fixes cherry-picks. I've done the backmerge directly because Dave is on vacation. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
-
Daniel Vetter authored
The code was moved, but the comment not updated. It confused me. Fixes: 7f4c6284 ("drm/i915: Assign hwmode after encoder state readout") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161219082423.27798-6-daniel.vetter@ffwll.ch
-
- 03 Jan, 2017 4 commits
-
-
Rodrigo Vivi authored
No functional changes. Apparently spec has been changed the valid table showing 0x192A as Server GT4 while 0x193A is Server GT4e. Libdrm and Mesa already have this right. So let's fix the ref here. Cc: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1483471672-10450-1-git-send-email-rodrigo.vivi@intel.com
-
Ander Conselvan de Oliveira authored
After commit 1c74eeaf ("drm/i915: Move number of scalers initialization to runtime init"), scalers are not initialized properly for skl and glk since num_scalers is left as 0 for those platforms. Fixes: 1c74eeaf ("drm/i915: Move number of scalers initialization to runtime init") Cc: Nabendu Maiti <nabendu.bikash.maiti@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> (v2) Cc: Ander Conselvan de Oliveira <conselvan2@gmail.com> Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1483365281-10569-1-git-send-email-ander.conselvan.de.oliveira@intel.com
-
Chris Wilson authored
Ville explained that the wakelock was being acquired during set-idle in order to flush the voltage change from the punit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170102152845.32352-1-chris@chris-wilson.co.ukReviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
-
Michal Wajdeczko authored
This function is only used by intel_guc_send() and it doesn't need to be exposed outside of intel_uc.o file. Also when defined as static, compiler will generate smaller code. Additionally let it take guc param instead dev_priv to match function name. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161220115531.76120-1-michal.wajdeczko@intel.com
-
- 02 Jan, 2017 2 commits
-
-
Nabendu Maiti authored
In future patches, we require greater flexibility in describing the number of scalers available on each CRTC. To ease that transition we move the current assignment to intel_device_info. Scaler structure initialisation is done if scaler is available on the CRTC. Gen9 check is not required as on depending upon numbers of scalers we initialize scalers or return without doing anything in skl_init_scalers. v3: Changed skl_init_scaler to intel_crtc_init_scalers v2: Added Chris's comments. Signed-off-by: Nabendu Maiti <nabendu.bikash.maiti@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1480398794-22741-1-git-send-email-nabendu.bikash.maiti@intel.com
-
Ander Conselvan de Oliveira authored
The function intel_atomic_get_shared_dpll_state() is only called from intel_dpll_mgr.c and it concerns the same data structures as the other functions in that file, so move it there and make it static. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1483024933-3726-8-git-send-email-ander.conselvan.de.oliveira@intel.com
-
- 01 Jan, 2017 2 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds authored
Pull DAX updates from Dan Williams: "The completion of Jan's DAX work for 4.10. As I mentioned in the libnvdimm-for-4.10 pull request, these are some final fixes for the DAX dirty-cacheline-tracking invalidation work that was merged through the -mm, ext4, and xfs trees in -rc1. These patches were prepared prior to the merge window, but we waited for 4.10-rc1 to have a stable merge base after all the prerequisites were merged. Quoting Jan on the overall changes in these patches: "So I'd like all these 6 patches to go for rc2. The first three patches fix invalidation of exceptional DAX entries (a bug which is there for a long time) - without these patches data loss can occur on power failure even though user called fsync(2). The other three patches change locking of DAX faults so that ->iomap_begin() is called in a more relaxed locking context and we are safe to start a transaction there for ext4" These have received a build success notification from the kbuild robot, and pass the latest libnvdimm unit tests. There have not been any -next releases since -rc1, so they have not appeared there" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: ext4: Simplify DAX fault path dax: Call ->iomap_begin without entry lock during dax fault dax: Finish fault completely when loading holes dax: Avoid page invalidation races and unnecessary radix tree traversals mm: Invalidate DAX radix tree entries only if appropriate ext2: Return BH_New buffers for zeroed blocks
-
- 31 Dec, 2016 4 commits
-
-
Chris Wilson authored
DRRS is not yet kerneldoc despite the allusion prior to enum drrs_refresh_rate_type. Drop the '**' to avoid the warnings from make htmldocs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-4-chris@chris-wilson.co.uk
-
Chris Wilson authored
The existing kerneldoc was outdated, so time for a refresh. v2: Use single line kdoc, mention functions for manipulation Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-3-chris@chris-wilson.co.uk
-
Chris Wilson authored
Parameter - no. Parameter: yes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-2-chris@chris-wilson.co.uk
-
Chris Wilson authored
The read of the page pin count and the bind count are unordered, presenting races in the assert and it firing off incorrectly. Prevent this by restricting the assert to the vma bind/unbind routines where we have local cpu ordering between the two. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161231112012.29263-1-chris@chris-wilson.co.uk
-
- 30 Dec, 2016 8 commits
-
-
git://git.lwn.net/linuxLinus Torvalds authored
Pull documentation fixes from Jonathan Corbet: "Two small fixes: - A merge error on my part broke the DocBook build. I've requisitioned one of tglx's frozen sharks for appropriate disciplinary action and resolved to be more careful about testing the DocBook stuff as long as it's still around. - Fix an error in unaligned-memory-access.txt" * tag 'docs-4.10-rc1-fix' of git://git.lwn.net/linux: Documentation/unaligned-memory-access.txt: fix incorrect comparison operator docs: Fix build failure
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds authored
Pull crypto fix from Herbert Xu: "This fixes a boot failure on some platforms when crypto self test is enabled along with the new acomp interface" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: testmgr - Use heap buffer for acomp test input
-
Ander Conselvan de Oliveira authored
Remove the IS_PLATFORM() macros from intel_dump_pipe_config() and split that logic in platform specific implementations inside the dpll code, accessed through a platform independent interface. v2: Rebase. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1) Link: http://patchwork.freedesktop.org/patch/msgid/1483024933-3726-7-git-send-email-ander.conselvan.de.oliveira@intel.com
-
Ander Conselvan de Oliveira authored
The documentation for most of the non-static members and structs were missing. Fix that. v2: Fix typos (Durga) v3: Rebase. Fix make docs warnings. Document more. v4: capitilize CRTC; say that the prepare hook is a nop if the DPLL is already enabled; link to struct intel_dpll_hw_state from @hw_state field in struct intel_shared_dpll_state; reorganize DPLL flags; link intel_shared_dpll_state to other structs and functions. (Daniel) Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1483024933-3726-6-git-send-email-ander.conselvan.de.oliveira@intel.com
-
Ander Conselvan de Oliveira authored
The hook is called from intel_prepare_shared_dpll(). The name doesn't make sense after all the changes to modeset code. So just call it prepare. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Durgadoss R <durgadoss.r@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1483024933-3726-5-git-send-email-ander.conselvan.de.oliveira@intel.com
-
Ander Conselvan de Oliveira authored
Struct intel_shared_dpll_config is used to hold the state of the DPLL in the "atomic" sense, so call it state like everything else atomic. v2: Rebase Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1) Link: http://patchwork.freedesktop.org/patch/msgid/1483024933-3726-4-git-send-email-ander.conselvan.de.oliveira@intel.com
-
Ander Conselvan de Oliveira authored
The function intel_shared_dpll_commit() performs the equivalent of drm_atomic_helper_swap_state() for the shared dpll state, which is not handled by the helpers. So make it do a full swap of the state and rename it for consistency. v2: Fix typo in the commit message. (Durga) v3: Rebase. v4: Swap the states instead of just renaming the function. (Daniel) Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Durgadoss R <durgadoss.r@intel.com> (v2) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v3) Link: http://patchwork.freedesktop.org/patch/msgid/1483024933-3726-3-git-send-email-ander.conselvan.de.oliveira@intel.com
-
Ander Conselvan de Oliveira authored
While the details of getting a shared dpll are wrapped by intel_get_shared_dpll(), the release was still hand rolled into the modeset code. Fix that by creating an entry point for releasing the pll and move that code there. v2: Take old_dpll from crtc->state instead of crtc_state. (CI) Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1483024933-3726-2-git-send-email-ander.conselvan.de.oliveira@intel.com
-
- 29 Dec, 2016 2 commits
-
-
Olof Johansson authored
mm/filemap.c: In function 'clear_bit_unlock_is_negative_byte': mm/filemap.c:933:9: error: too few arguments to function 'test_bit' return test_bit(PG_waiters); ^~~~~~~~ Fixes: b91e1302 ('mm: optimize PageWaiters bit use for unlock_page()') Signed-off-by: Olof Johansson <olof@lixom.net> Brown-paper-bag-by: Linus Torvalds <dummy@duh.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
In commit 62906027 ("mm: add PageWaiters indicating tasks are waiting for a page bit") Nick Piggin made our page locking no longer unconditionally touch the hashed page waitqueue, which not only helps performance in general, but is particularly helpful on NUMA machines where the hashed wait queues can bounce around a lot. However, the "clear lock bit atomically and then test the waiters bit" sequence turns out to be much more expensive than it needs to be, because you get a nasty stall when trying to access the same word that just got updated atomically. On architectures where locking is done with LL/SC, this would be trivial to fix with a new primitive that clears one bit and tests another atomically, but that ends up not working on x86, where the only atomic operations that return the result end up being cmpxchg and xadd. The atomic bit operations return the old value of the same bit we changed, not the value of an unrelated bit. On x86, we could put the lock bit in the high bit of the byte, and use "xadd" with that bit (where the overflow ends up not touching other bits), and look at the other bits of the result. However, an even simpler model is to just use a regular atomic "and" to clear the lock bit, and then the sign bit in eflags will indicate the resulting state of the unrelated bit #7. So by moving the PageWaiters bit up to bit #7, we can atomically clear the lock bit and test the waiters bit on x86 too. And architectures with LL/SC (which is all the usual RISC suspects), the particular bit doesn't matter, so they are fine with this approach too. This avoids the extra access to the same atomic word, and thus avoids the costly stall at page unlock time. The only downside is that the interface ends up being a bit odd and specialized: clear a bit in a byte, and test the sign bit. Nick doesn't love the resulting name of the new primitive, but I'd rather make the name be descriptive and very clear about the limitation imposed by trying to work across all relevant architectures than make it be some generic thing that doesn't make the odd semantics explicit. So this introduces the new architecture primitive clear_bit_unlock_is_negative_byte(); and adds the trivial implementation for x86. We have a generic non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)" combination) which can be overridden by any architecture that can do better. According to Nick, Power has the same hickup x86 has, for example, but some other architectures may not even care. All these optimizations mean that my page locking stress-test (which is just executing a lot of small short-lived shell scripts: "make test" in the git source tree) no longer makes our page locking look horribly bad. Before all these optimizations, just the unlock_page() costs were just over 3% of all CPU overhead on "make test". After this, it's down to 0.66%, so just a quarter of the cost it used to be. (The difference on NUMA is bigger, but there this micro-optimization is likely less noticeable, since the big issue on NUMA was not the accesses to 'struct page', but the waitqueue accesses that were already removed by Nick's earlier commit). Acked-by: Nick Piggin <npiggin@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Lutomirski <luto@kernel.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 28 Dec, 2016 4 commits
-
-
Chris Wilson authored
A couple of parameters slipped through the kerneldoc net. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161228105120.14500-1-chris@chris-wilson.co.uk
-
Daniel Vetter authored
Drivers need to take care. Motivated by a discussion between Mark and Rob on dri-devel. Cc: Mark yao <mark.yao@rock-chips.com> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: s/alloc|freeing/modifications/ per Chris' suggestion.] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1482833457-29592-1-git-send-email-daniel.vetter@ffwll.ch
-
Chris Wilson authored
Remove a superfluous helper as drm_mm_insert_node is equivalent to insert_node_in_range with a range of [0, U64_MAX]. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161222083641.2691-37-chris@chris-wilson.co.uk
-
Chris Wilson authored
mm->color_adjust() compares the hole with its neighbouring nodes. They only abutt before we restrict the hole, so we have to apply color_adjust before we apply the range restriction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161222083641.2691-36-chris@chris-wilson.co.uk
-