- 28 Mar, 2012 7 commits
-
-
Daniel Kurtz authored
This memory is always allocated, and it is always a fixed size, so just allocate it along with the rest of the driver state. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Kurtz authored
There is no GMBUS "disabled" port 0, nor "reserved" port 7. For the other 6 ports there is a fixed 1:1 mapping between pin pairs and gmbus ports, which means every real gmbus port has a gpio pin. Given these realizations, clean up gmbus initialization. Tested on Sandybridge (gen 6, PCH == CougarPoint) hardware. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Kurtz authored
Instead of letting other modules directly access the ->gmbus array, introduce intel_gmbus_get_adapter() for looking up an i2c_adapter for a given gmbus port identifier. This will enable later refactoring of the gmbus port list. Note: Before requesting an adapter for a given gmbus port number, the driver must first check its validity using i2c_intel_gmbus_is_port_valid(). If this check fails, a call to intel_gmbus_get_adapter() will WARN_ON and return NULL. This is relevant for parts of the driver that read a port from VBIOS, which might be improperly initialized and contain an invalid port. In these cases, the driver must fall back to using a safer default port. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Kurtz authored
Instead of rolling our own custom quirk_xfer function, use the bit_algo pre_xfer and post_xfer functions to setup and teardown bit-banged i2c transactions. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Kurtz authored
According to i915 documentation [1], "Port D" (DP/HDMI Port D) is actually gmbus pin pair 6 (gmbus0.2:0 == 110b GPIOF), not 7 (111b). Pin pair 7 is a reserved pair. [1] Documentation for [DevSNB+] and [DevIBX], as found on http://intellinuxgraphics.org: [DevSNB+]: http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol3_Part3.pdf Section 2.2.2 lists the 6 gmbus ports (gpio pin pairs): [ 5: HDMI/DPD, 4: HDMIB, 3: HDMI/DPC, 2: LVDS, 1: SSC, 0: VGA ] 2.2.2.1 lists the GPIO registers to control these 6 ports. 2.2.3.1 lists the mapping between 5 of these gmbus ports and the 3 Pin_Pair_Select bits (of the GMBUS0 register). This table is missing HDMIB (port 101). [DevIBX]: http://intellinuxgraphics.org/IHD_OS_Vol3_Part3r2.pdf Section 2.2.2 lists the same 6 gmbus ports plus two 'reserved' gpio ports. 2.2.2.1 lists 8 GPIO registers... however, it says the size of the block is 6x32, which implies that those 2 reserved GPIO registers (GPIO_6 & GPIO_7) don't actually exist (or are irrelevant). 2.2.3.1 lists the mapping between the 6 named gmbus ports and the 3 Pin_Pair_Select bits (of the GMBUS0 register). This table has HDMIB. Note: the "reserved" and "disabled" pairs do not actually map to a physical pair of pins, nor GPIO regs and shouldn't be initialized or used. Fixing this is left for a later patch. This bug had not been noticed earlier for two reasons: 1) Until recently, "gmbus" mode was disabled - all transfers actually used "bit-bang" mode on GPIO port 5 (the "HDMI/DPD CTLDATA/CLK" pair), at register 0x5024 (defined as GPIOF i915_reg.h). Since this is the correct pair of pins for HDMI1, transfers succeed. 2) Even if gmbus mode is re-enabled, the first attempted transaction will fail because it tries to use the wrong ("Reserved") pin pair. However, the driver immediately falls back again to the bit-bang method, which correctly uses GPIOF, so again, transfers succeed. However, if gmbus mode is re-enabled and the GPIO fall-back mode is disabled, then reading an attached monitor's EDID fail. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Kurtz authored
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Kurtz authored
Split out gmbus_xfer_read/write() helper functions. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 27 Mar, 2012 21 commits
-
-
Daniel Vetter authored
Beside helping the compiler untangle this maze they double-up as documentation for which parts of the code aren't performance-critical but just around to keep old (but already dead-slow) userspace from breaking. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
The issue is that with inline clflushing the clflushing isn't properly swizzled. Fix this by - always clflushing entire 128 byte chunks and - unconditionally flush before writes when swizzling a given page. We could be clever and check whether we pwrite a partial 128 byte chunk instead of a partial cacheline, but I've figured that's not worth it. Now the usual approach is to fold this into the original patch series, but I've opted against this because - this fixes a corner case only very old userspace relies on and - I'd like to not invalidate all the testing the pwrite rewrite has gotten. This fixes the regression notice by tests/gem_tiled_partial_prite_pread from i-g-t. Unfortunately it doesn't fix the issues with partial pwrites to tiled buffers on bit17 swizzling machines. But that is also broken without the pwrite patches, so likely a different issue (or a problem with the testcase). v2: Simplify the patch by dropping the overly clever partial write logic for swizzled pages. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes. Add new functions in filemap.h to make that possible. Also kill a copy&pasted spurious space in both functions while at it. v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed. v3: Becaus I couldn't find a way around adding a uaddr += PAGE_SIZE to the filemap.c hotpaths (that the compiler couldn't remove again), let's go with separate new functions for the multipage use-case. v4: Adjust comment to CodingStlye and fix spelling. Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
While moving around things, this two functions slowly grew out of any sane bounds. So extract a few lines that do the copying and clflushing. Also add a few comments to explain what's going on. v2: Again do s/needs_clflush/needs_clflush_after/ in the write paths as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
It's around 20% faster. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
It's too expensive to move it around just for that pwrite, especially when we're trashing on the mappable gtt part like crazy. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
In micro-benchmarking of the usual pwrite use-pattern of alternating pwrites with gtt domain reads from the gpu, this yields around 30% improvement of pwrite throughput across all buffers size. The trick is that we can avoid clflush cachelines that we will overwrite completely anyway. Furthermore for partial pwrites it gives a proportional speedup on top of the 30% percent because we only clflush back the part of the buffer we're actually writing. v2: Simplify the clflush-before-write logic, as suggested by Chris Wilson. v3: Finishing touches suggested by Chris Wilson: - add comment to needs_clflush_before and only set this if the bo is uncached. - s/needs_clflush/needs_clflush_after/ in the write paths to clearly differentiate it from needs_clflush_before. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
The pagemap.h prefault helpers do the prefaulting by simply writing some data into every page. Hence we should not prefault when we're not yet commited to to actually writing data to userspace. The problem is now that - we can't prefault while holding dev->struct_mutex for we could deadlock with our own pagefault handler - we need to grab dev->struct_mutex before copying to sync up with any outsanding gpu writes. Therefore only prefault when we're dropping the lock the first time in the pread slowpath - at that point we're committed to the write, don't wait on the gpu anymore and hence won't return early (with e.g. -EINTR). Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
With the proper prefault, it's extremely unlikely that we fall back to the gtt slowpath. So just kill it and use the shmem_pwrite path as fallback. To further clean up the code, move the preparatory gem calls into the respective pwrite functions. This way the gtt_fast->shmem fallback is much more obvious. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
This speeds up pwrite and pread from ~120 µs ro ~100 µs for reading/writing 1mb on my snb (if the backing storage pages are already pinned, of course). v2: Chris Wilson pointed out a glaring page reference bug - I've unconditionally dropped the reference. With that fixed (and the associated reduction of dirt in dmesg) it's now even a notch faster. v3: Unconditionaly grab a page reference when dropping dev->struct_mutex to simplify the code-flow. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
~120 µs instead fo ~210 µs to write 1mb on my snb. I like this. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
No longer needed. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
This is obviously gonna slow down pread. But for a half-way realistic micro-benchmark, it doesn't matter: Non-broken userspace reads back data from the gpu once before the gpu again dirties it. So all this ranged clflush tracking is just a waste of time. No pread performance change (neglecting the dumb benchmark of constantly reading the same data) measured. As an added bonus, this avoids clflush on read on coherent objects. Which means that partial preads on snb are now roughly 4x as fast. This will be usefull for e.g. the libva encoder - when I finally get around to fix that up. v2: Properly sync with the gpu on LLC machines. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Useful when the page is already mapped to copy date in/out. For -stable because the next patch (fixing phys obj pwrite) needs this little helper function. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
With the previous rewrite, they've become essential identical. v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
With the previous rewrite, they've become essential identical. v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
We try to avoid writing the relocations through the uncached GTT, if the buffer is currently in the CPU write domain and so will be flushed out to main memory afterwards anyway. Also on SandyBridge we can safely write to the pages in cacheable memory, so long as the buffer is LLC mapped. In either of these cases, we therefore do not need to force the reallocation of the buffer into the mappable region of the GTT, reducing the aperture pressure. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
We've lost our guard page somewhere in the gtt rewrite, this patch here will restore it. Exercised by i-g-t/tests/gem_cs_prefetch. v2: Substract the guard page from the range we're supposed to manage with gem. Suggested by Chris Wilson to increase the odds of old ums + gem userspace not blowing up. To compensate for the loss of a page, don't substract the guard page in the modeset init code any longer. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44748Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
So don't call it like that. Also rip out a confusing comment and instead explain what's really going on. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
... because this is what it actually doesn now that we have the global gtt vs. ppgtt split. Also move it to the other global gtt functions in i915_gem_gtt.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
This reverts commmit d4b74bf0 which reverted the origin fix fb8b5a39. We have at least 3 different bug reports that this fixes things and no indication what is exactly wrong with this. So try again. To make matters slightly more fun, the commit itself was cc: stable whereas the revert has not been. According to Peter Clifton he discussed this with Zhao Yakui and this seems to be in contradiction of the GM45 PRM, but rumours have it that this is how the BIOS does it ... let's see. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Tested-by: Peter Clifton <Peter.Clifton@clifton-electronics.com> Cc: Zhao Yakui <yakui.zhao@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Eric Anholt <eric@anholt.net> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16236 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25913 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=14792Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 26 Mar, 2012 1 commit
-
-
Chris Wilson authored
Originally the code tried to allocate a large enough array to perform the copy using vmalloc, performance wasn't great and throughput was improved by processing each individual relocation entry separately. This too is not as efficient as one would desire. A compromise would be to allocate a single page, or to allocate a few entries on the stack, and process the copy in batches. The latter gives simpler code and more consistent performance due to a lack of heuristic. x11perf -copywinwin10: n450/pnv i3-330m i5-2520m (cpu) before: 249000 785000 1280000 (80%) page: 264000 896000 1280000 (65%) on-stack: 264000 902000 1280000 (67%) v2: Use 512-bytes of stack for batching rather than allocate a page. v3: Tidy the code slightly with more descriptive variable names Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 25 Mar, 2012 2 commits
-
-
Daniel Vetter authored
With the recent set of gmbus fixes, this seems to work on my i855gm. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Again, Valleyview modes these around, so make the mmio base more explicit to consolidate the base address computations to one HAS_PCH_SPLIT check. v2: Fix up the PCH_SPLIT braino ... it actually works that way round. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 24 Mar, 2012 1 commit
-
-
Daniel Vetter authored
With valleyview we'll have these at yet another address, so keeping track of this with an ever-growing list of registers will get ugly. This way intel_sdvo.c is fully independent of the base address of the output ports display register blocks. While at it, do 2 closely related cleanups: - use SDVO_NAME some more - change the sdvo_reg variables to uint32_t like other registers. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 23 Mar, 2012 3 commits
-
-
Jesse Barnes authored
They were all over the place, order them by position and add a few. v2: add gen indications to the new bits (Ben) Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Jesse Barnes authored
It's only used by the main read/write functions, so we can keep it with them. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
If we discard a buffer due to memory pressure, also release its alloted mmap address space. As it may be sometime before userspace wakes up and notices that it has buffers to purge from its cache, we may waste valuable address space on unusable objects for a period of time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47738Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 22 Mar, 2012 4 commits
-
-
Takashi Iwai authored
Add a new module optoin lvds_channel to specify the LVDS channel mode explicitly instead of probing the LVDS register value set by BIOS. This will be helpful when VBT is broken or incompatible with the current code. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42842Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Takashi Iwai authored
Currently i915 driver checks [PCH_]LVDS register bits to decide whether to set up the dual-link or the single-link mode. This relies implicitly on that BIOS initializes the register properly at boot. However, BIOS doesn't initialize it always. When the machine is booted with the closed lid, BIOS skips the LVDS reg initialization. This ends up in blank output on a machine with a dual-link LVDS when you open the lid after the boot. This patch adds a workaround for that problem by checking the initial LVDS register value in VBT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37742Tested-By: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ben Widawsky authored
Introduced in commits c1cd90ed and d27b1e0eSigned-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: s/fix/shut up in the commit msg and add a comment to the BUG_ON.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ben Widawsky authored
Introduced in commit 8461d226 and 8c59967cSigned-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: s/fix/shut up/ in the commit msg.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 Mar, 2012 1 commit
-
-
Daniel Vetter authored
On Sanybridge a few MI read/write commands only work when ppgtt is enabled. Userspace therefore needs to be able to check whether ppgtt is enabled. For added hilarity, you need to reset the "use global GTT" bit on snb when ppgtt is enabled, otherwise it won't work. Despite what bspec says about automatically using ppgtt ... Luckily PIPE_CONTROL (the only write cmd current userspace uses) is not affected by all this, as tested by tests/gem_pipe_control_store_loop. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-