- 26 Jul, 2012 4 commits
-
-
Daniel Vetter authored
We believe to have squashed all issues around the gen6+ rps interrupt generation and why the gpu sometimes got stuck. With that cleared up, there's no user left for the sanitize_pm infrastructure, so let's just rip it out. Note that 'intel_reg_write 0xa014 0x13070000' is the w/a if we find ourselves stuck again. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
The power docs say that when the gt leaves rc6, it is in the lowest frequency and only about 25 usec later will switch to the frequency selected in GEN6_RPNSWREQ. If the downclock limit expires in that window and the down limit is set to the lowest possible frequency, the hw will not send the down interrupt. Which leads to a too high gpu clock and wasted power. Chris Wilson already worked on this with commit 7b9e0ae6 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Apr 28 08:56:39 2012 +0100 drm/i915: Always update RPS interrupts thresholds along with frequency but got the logic inverted: The current code set the down limit as long as we haven't reached it. Instead of only once with reached the lowest frequency. Note that we can't always set the downclock limit to 0, because otherwise the hw will keep on bugging us with downclock request irqs once the lowest level is reached. For similar reasons also always set the upclock limit, otherwise the hw might poke us again with interrupts. v2: Chris Wilson noticed that the limit reg is also computed in sanitize_pm. To avoid duplication, extract the code into a common function. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
By selecting the cache level (essentially whether or not the CPU snoops any updates to the bo, and on more recent machines whether it resides inside the CPU's last-level-cache) a userspace driver is able to then manage all of its memory within buffer objects, if it so desires. This enables the userspace driver to accelerate uploads and more importantly downloads from the GPU and to able to mix CPU and GPU rendering/activity efficiently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added code comment about where we plan to stuff platform specific cacheing control bits in the ioctl struct.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Several functions of the GPU have the restriction that differing memory domains cannot be placed next to each other (as the GPU may prefetch beyond the end of one domain and hang as it crosses into the other domain). We use the facility of the drm_mm to mark ranges with a particular color that corresponds to the cache attributes of those pages in order to prevent allocating adjacent blocks of differing memory types. v2: Rebase ontop of drm_mm coloring v2. v3: Fix rebinding existing gtt_space and add a verification routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 25 Jul, 2012 35 commits
-
-
Ben Widawsky authored
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ben Widawsky authored
Originally I had a macro specifically for DPF support, and Daniel, with good reason asked me to change it to this. It's not the way I would have gone (and indeed I didn't), but for now there is no distinction as all platforms with L3 also have DPF. Note: The good reasons are that dpf is a l3$ feature (at least on currrent hw), hence I don't expect one to go without the other. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: added note] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ben Widawsky authored
Basic context support on HSW is no different than previous generations. The size of the context object changes, but that's about it. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
As suggested by Daniel, rip out the independent timers for device and crtc busyness and integrate the manual powermanagement of the display engine into the GEM core and its request tracking. The benefits are that the code is a lot smaller, fewer moving parts and should fit more neatly into the overall activity tracking of the driver. v2: Complete overhaul and removal of the racy timers and workers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Otherwise once we use the buffer with a BLT command on gen2/3, we will always regard future command submissions as continuing the fenced access. However, now that we flush/invalidate between every batch we can drop this pessimism. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Now that we unconditionally flush and invalidate between every batch buffer, we no longer need the complex logic to decide which domains require flushing. Remove it and rejoice. v2 (danvet): Keep around the flip waiting logic. It's gross and broken, I know, but we can't just kill that thing ... even if we just keep it around as a reminder that things are broken. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Rely instead on the insertion of the implicit flush before the seqno breadcrumb. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
As the flush is either performed explictly immediately after the execbuffer dispatch, or before the serialisation of last_fenced_seqno we can forgo the explict i915_gem_flush_ring(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
As we guarantee to emit a flush before emitting the breadcrumb or the next batchbuffer, there is no further need for the flushing list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
As we always flush the GPU cache prior to emitting the breadcrumb, we no longer have to worry about the deferred flush causing the pending_gpu_write to be delayed. So we can instead utilize the known last_write_seqno to hopefully minimise the wait times. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
As we move to lazily clearing the GPU write domain only when the buffer becomes inactive, this leaves a window of opportunity for i915_gem_object_pin_to_display_plane() to detect a seemingly inconsistent value. This function is special as it tries to pipeline the operation to avoid the stall and so may not retires the buffer and we may not get the opportunity to clear the write domain. However, we know all is good, so drop the assertion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Request preallocation was added to i915_add_request() in order to support the overlay. However, not all users care and can quite happily ignore the failure to allocate the request as they will simply repeat the request in the future. By pushing the allocation down into i915_add_request(), we can then remove some rather ugly error handling in the callers. v2: Nullify request->file_priv otherwise we chase a garbage pointer when retiring requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
With the base addresses shifting around, this is easier to handle. Also move to the real reg offset on vlv. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Will be used more in the next patch. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
The intention is to help select which engine to use for copies with interoperating clients - such as a GL client making a request to the X server to perform a SwapBuffers, which may require copying from the active GL back buffer to the X front buffer. We choose to report a mask of the active rings to future proof the interface against any changes which may allow for the object to reside upon multiple rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: bikeshed away the write ring mask and add the explanation Chris sent in a follow-up mail why we decided to use masks.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ben Widawsky authored
The interface's immediate purpose is to do synchronous timestamp queries as required by GL_TIMESTAMP. The GPU has a register for reading the timestamp but because that would normally require root access through libpciaccess, the IOCTL can provide this service instead. Currently the implementation whitelists only the render ring timestamp register, because that is the only thing we need to expose at this time. v2: make size implicit based on the register offset Add a generation check Reviewed-by: Eric Anholt <eric@anholt.net> Cc: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: fixup the ioctl numerb:] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
I'm planing to merge this next week for 3.7, but I'd like to avoid stupid conflicts with the exsting userspace when merging the new reg_read ioctl (which doesn't have userspace yet, but this caching interface has). Header extracted from Chris Wilson's patch, but fix up the copy&pasted comment in the interface struct. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Thomas Richter authored
This patch adds support for the ns2501 DVO, found in some older Fujitsu/Siemens Labtops. It is in the state of "works for me". Includes now proper DPMS support. Includes switching between resolutions - from 640x480 to 1024x768. Currently assumes that the native display resolution is 1024x768. The ns2501 seems to be rather critical - if the output PLL is not running, the chip doesn't seem to be clocked and then doesn't react on i2c messages. Thus, a quick'n-dirty trick ensures that the DVO is active before submitting any i2c messages to it. This is probably to be reviewed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=17902Signed-off-by: Thomas Richter <thor@math.tu-berlin.de> [danvet: fixup whitespace fail.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Paulo Zanoni authored
This will be needed for Haswell, but already has its uses here. This patch started as a small patch written patch by Shobhit Kumar, but it has changed so much that none of its original lines remain. Credits-to: Shobhit Kumar <shobhit.kumar@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Paulo Zanoni authored
We have some common code that we always run before calling intel_dp_set_link_train. This common code sets the correct training patterns to the DP variable. If we add more calls to intel_dp_set_link_train, we'll also have to duplicate this common code. So instead of repeating this code whenever we call intel_dp_set_link_train, we move the code to inside the function: now we check which training pattern we're going to set and then we set the DP register according to it. One of the side-effects of this change is that now we never forget to mask the training pattern bits before changing them. It looks like this was working before because we were first masking the bits, then writing 00, 01 and then 11. This patch also enables us to use the intel_dp_set_link_train function when disabling link training: in this case we need to avoid writing the DP_TRAINING_LANE*_SET AUX commands. As a bonus, the big intel_dp_{start,complete}_link_train functions will get smaller and a little bit easier to read. Version 2 changes: - Rewrite commit message. - Also clear the training pattern bits before changing them. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Instead of having a giant if cascade to figure this out according to the passed-in register. We could do quite a bit more cleaning up and all by using the port at more places, but I think this should be part of a bigger rework to introduce a struct intel_digital_port which would keep track of all these things. I guess this will be part of some haswell-DP-induced refactoring. For now this rips out the big cascade, which is what annoyed me so much. v2: Add port variable name back for the func decl (I've tried to trick myself below the 80 char limit). Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Intel hw only has one MUX for encoders, so outputs are either not cloneable or all in the same group of cloneable outputs. This neatly simplifies the code and allows us to ditch some ugly if cascades in the dp and hdmi init code (well, we need these if cascades for other stuff still, but that can be taken care of in follow-up patches). Note that this changes two things: - dvo can now be cloned with sdvo, but dvo is gen2 whereas sdvo is gen3+, so no problem. Note that the old code had a bug and didn't allow cloning crt with dvo (but only the other way round). - sdvo-lvds can now be cloned with sdvo-non-tv. Spec says this won't work, but the only reason I've found is that you can't use the panel-fitter (used for lvds upscaling) with anything else. But we don't use the panel fitter for sdvo-lvds. Imo this part of Bspec is a) rather confusing b) mostly as a guideline to implementors (i.e. explicitly stating what is already implicit from the spec, without always going into the details of why). So I think we can ignore this - worst case we'll get a bug report from a user with with sdvo-lvds and sdvo-tmds and have to add that special case back in. Because sdvo lvds is a bit special explain in comments why sdvo LVDS outputs can be cloned, but native LVDS and eDP can't be cloned - we use the panel fitter for the later, but not for sdvo. Note that this also uncoditionally initializes the panel_vdd work used by eDP. Trying to be clever doesn't buy us anything (but strange bugs) and this way we can kill the is_edp check. v2: Incorporate review from Paulo - Add in a missing space. - Pimp comment message to address his concerns. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Splitting them up between pch and gmch variants just makes it harder to find things. Especially since the hotplug bits are actually valid on earlier chips, too. v2: Fixed the comment as pointed out by Paulo Zanoni. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
When bug hunting, I found the interface to do_switch() overly complicated and I believe festered the earlier bug. This aims to make the code a little clearer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Shobhit Kumar authored
Move the DP structure to shared location so that it can be used from within the ddi module. Changes from Paulo: - Move less code to intel_drv.h - Remove #include statement - Replace a tab with a space in train_set Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
We now refuse to load on gen6+ if kms is not enabled: commit 26394d92 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Mar 26 21:33:18 2012 +0200 drm/i915: refuse to load on gen6+ without kms Which results in the drm core calling our lastclose function to clean up the mess, but that one is neatly broken for such failure cases since kms has been introduced in commit 79e53945 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Fri Nov 7 14:24:08 2008 -0800 DRM: i915: add mode setting support Reported-and-tested-by: Paulo Zanoni <przanoni@gmail.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Eric Anholt authored
Fixes failures in transform feedback on gen7 because our SOL_RESET flag was setting the transform feedback offsets in the old context (occasionally happened to be ours) instead of the new context. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Laurent Pinchart missed this when sending in is giant constify patch: commit e811f5ae Author: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Date: Tue Jul 17 17:56:50 2012 +0200 drm: Make the .mode_fixup() operations mode argument a const pointer Acked-by; Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
LVDS is the first output where dpms on/off and prepare/commit don't perfectly match. Now the idea behind this special case seems to be that for simple resolution changes on the LVDS we don't need to stop the pipe, because (at least on newer chips) we can adjust the panel fitter on the fly. There are a few problems with the current code though: - We still stop and restart the pipe unconditionally, because the crtc helper code isn't flexible enough. - We show some ugly flickering, especially when changing crtcs (this the crtc helper would actually take into account, but we don't implement the encoder->get_crtc callback required to make this work properly). So it doesn't even work as advertised. I agree that it would be nice to do resolution changes on LVDS (and also eDP) whithout blacking the screen where the panel fitter allows to do that. But imo we should implement this as a special case a few layers up in the mode set code, akin to how we already detect simple framebuffer changes (and only update the required registers with ->mode_set_base). Until this is all in place, make our lives easier and just rip it out. Also note that this seems to fix actual bugs with enabling the lvds output, see: http://lists.freedesktop.org/archives/intel-gfx/2012-July/018614.html Cc: Takashi Iwai <tiwai@suse.de> Cc: Giacomo Comes <comes@naic.edu> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Takashi Iwai <tiwai@suse.de> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Dan Carpenter authored
We need to check that "ctx" is a valid pointer before dereferencing it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Otherwise we end up trying to unpin a freed object and BUG. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
This prevents a WARN introduced with commit de2b9985 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jul 4 22:52:50 2012 +0200 drm/i915: don't return a spurious -EIO from intel_ring_begin Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
The issue is that we stale data in the CPU caches, when we come to swap-out the object, the CPU may short-circuit the reads from those cacheline and so corrupt the context object. Secondary, leaving the context object as being marked in the CPU write domain whilst on the GPU active list is a bad idea and will throw warnings later. Note: Thanks to calling set_to_gtt_domain with write = false and not setting any gpu write domain when putting a context object onto the active list (when we switch away from it) the set_to_gtt_domain call won't block. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Added a note to the commit message and a comment in the code to explain the clever non-blocking trick.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 Jul, 2012 1 commit
-
-
Chris Wilson authored
As we take the struct_mutex lock to access the command-stream, there is a possibility that we may need to wait for a GPU hang and so should make the lock both interruptible and error-checking. References: https://bugs.freedesktop.org/show_bug.cgi?id=50069Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-