1. 08 Jul, 2014 4 commits
    • fence: dma-buf cross-device synchronization (v18) · e941759c
      Maarten Lankhorst authored
      A fence can be attached to a buffer which is being filled or consumed
      by hw, to allow userspace to pass the buffer to another device without
      waiting.  For example, userspace can call page_flip ioctl to display the
      next frame of graphics after kicking the GPU but while the GPU is still
      rendering.  The display device sharing the buffer with the GPU would
      attach a callback to get notified when the GPU's rendering-complete IRQ
      fires, to update the scan-out address of the display, without having to
      wake up userspace.
      
      A driver must allocate a fence context for each execution ring that can
      run in parallel. The function for this takes an argument with how many
      contexts to allocate:
        + fence_context_alloc()
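
The allocation described above can be pictured as a global counter that hands out contiguous ranges of context ids. The following is a minimal userspace sketch of that idea using C11 atomics, not the kernel code itself; `model_context_alloc` and `model_context_counter` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace model only: one global counter for all context ids. */
static atomic_uint model_context_counter;

/*
 * Reserve `num` consecutive context ids and return the first one.
 * A driver with three rings would call this once with num == 3 and
 * use base, base + 1 and base + 2 for its rings.
 */
static unsigned model_context_alloc(unsigned num)
{
	assert(num > 0);
	/* atomic_fetch_add returns the value before the addition. */
	return atomic_fetch_add(&model_context_counter, num);
}
```

Because the reservation is a single atomic add, two drivers allocating at the same time can never receive overlapping ranges.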
      
      A fence is a transient, one-shot deal.  It is allocated and attached
      to one or more dma-bufs.  When the one that attached it is done with
      the pending operation, it can signal the fence:
        + fence_signal()
      
      To get a rough approximation of whether a fence has fired, call:
        + fence_is_signaled()
      
      The dma-buf-mgr handles tracking, and waiting on, the fences associated
      with a dma-buf.
      
      The one pending on the fence can add an async callback:
        + fence_add_callback()
      
      The callback can optionally be cancelled with:
        + fence_remove_callback()
      
      To wait synchronously, optionally with a timeout:
        + fence_wait()
        + fence_wait_timeout()
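
The one-shot life cycle described above (signal once, query, add and remove callbacks) can be sketched as a small single-threaded userspace model. This is an illustration under stated assumptions, not the kernel implementation: all `model_*` names are hypothetical, there is no locking, and the error codes returned here are choices made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_fence;
struct model_cb;
typedef void (*model_cb_fn)(struct model_fence *f, struct model_cb *cb);

struct model_cb {
	struct model_cb *next;
	model_cb_fn func;
};

struct model_fence {
	bool signaled;          /* set exactly once, never cleared */
	struct model_cb *cbs;   /* callbacks still waiting for the signal */
};

static int model_cb_runs;   /* counts invocations of model_count_cb */

static void model_count_cb(struct model_fence *f, struct model_cb *cb)
{
	(void)f;
	(void)cb;
	model_cb_runs++;
}

static bool model_is_signaled(const struct model_fence *f)
{
	return f->signaled;
}

/* Queue a callback; returns -ENOENT if the fence has already signaled. */
static int model_add_callback(struct model_fence *f, struct model_cb *cb,
			      model_cb_fn func)
{
	assert(func != NULL);
	if (f->signaled)
		return -ENOENT;
	cb->func = func;
	cb->next = f->cbs;
	f->cbs = cb;
	return 0;
}

/* Unlink a pending callback; returns false if it was not queued any more. */
static bool model_remove_callback(struct model_fence *f, struct model_cb *cb)
{
	for (struct model_cb **p = &f->cbs; *p; p = &(*p)->next) {
		if (*p == cb) {
			*p = cb->next;
			return true;
		}
	}
	return false;
}

/* Signal once and run every pending callback; -EINVAL on a second signal. */
static int model_signal(struct model_fence *f)
{
	if (f->signaled)
		return -EINVAL;
	f->signaled = true;
	while (f->cbs) {
		struct model_cb *cb = f->cbs;
		f->cbs = cb->next;
		cb->func(f, cb);
	}
	return 0;
}
```

A caller that gets the "already signaled" result from `model_add_callback` simply proceeds as if its callback had run, which is why the signaled check and the queueing happen in one step.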
      
      When emitting a fence, call:
        + trace_fence_emit()
      
      To annotate that a fence is blocking on another fence, call:
        + trace_fence_annotate_wait_on(fence, on_fence)
      
      A default software-only implementation is provided, which can be used
      by drivers attaching a fence to a buffer when they have no other means
      for hw sync.  But a memory backed fence is also envisioned, because it
      is common that GPUs can write to, or poll on some memory location for
      synchronization.  For example:
      
        fence = custom_get_fence(...);
        if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
          struct dma_buf *fence_buf = seqno_fence->sync_buf;
          get_dma_buf(fence_buf);
      
          /* ... tell the hw the memory location to wait on ... */
          custom_wait_on(fence_buf, seqno_fence->seqno_ofs, fence->seqno);
        } else {
          /* fall back to sw sync */
          fence_add_callback(fence, my_cb);
        }
      
      On SoC platforms, if some other hw mechanism is provided for synchronizing
      between IP blocks, it could be supported as an alternate implementation
      with its own fence ops in a similar way.
      
      enable_signaling callback is used to provide sw signaling in case a cpu
      waiter is requested or no compatible hardware signaling could be used.
      
      The intention is to provide a userspace interface (presumably via eventfd)
      later, to be used in conjunction with dma-buf's mmap support for sw access
      to buffers (or for userspace apps that would prefer to do their own
      synchronization).
      
      v1: Original
      v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided
          that dma-fence didn't need to care about the sw->hw signaling path
          (it can be handled same as sw->sw case), and therefore the fence->ops
          can be simplified and more handled in the core.  So remove the signal,
          add_callback, cancel_callback, and wait ops, and replace with a simple
          enable_signaling() op which can be used to inform a fence supporting
          hw->hw signaling that one or more devices which do not support hw
          signaling are waiting (and therefore it should enable an irq or do
          whatever is necessary in order that the CPU is notified when the
          fence is passed).
      v3: Fix locking fail in attach_fence() and get_fence()
      v4: Remove tie-in w/ dma-buf.  After discussion w/ danvet and mlankhorst
          we decided that we need to be able to attach one fence to N dma-bufs,
          so using the list_head in dma-fence struct would be problematic.
      v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
      v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments
          about checking if fence fired or not. This is broken by design.
          waitqueue_active during destruction is now fatal, since the signaller
          should be holding a reference in enable_signaling until it signalled
          the fence. Pass the original dma_fence_cb along, and call __remove_wait
          in the dma_fence_callback handler, so that no cleanup needs to be
          performed.
      v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if
          fence wasn't signaled yet, for example for hardware fences that may
          choose to signal blindly.
      v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to
          header and fixed include mess. dma-fence.h now includes dma-buf.h
          All members are now initialized, so kmalloc can be used for
          allocating a dma-fence. More documentation added.
      v9: Change compiler bitfields to flags, change return type of
          enable_signaling to bool. Rework dma_fence_wait. Added
          dma_fence_is_signaled and dma_fence_wait_timeout.
          s/dma// and change exports to non GPL. Added fence_is_signaled and
          fence_enable_sw_signaling calls, add ability to override default
          wait operation.
      v10: remove event_queue, use a custom list, export try_to_wake_up from
          scheduler. Remove fence lock and use a global spinlock instead,
          this should hopefully remove all the locking headaches I was having
          on trying to implement this. enable_signaling is called with this
          lock held.
      v11:
          Use atomic ops for flags, lifting the need for some spin_lock_irqsaves.
          However I kept the guarantee that after fence_signal returns, it is
          guaranteed that enable_signaling has either been called to completion,
          or will not be called any more.
      
          Add contexts and seqno to base fence implementation. This allows you
          to wait for fewer fences, by testing for seqno + signaled, and then only
          wait on the later fence.
      
          Add FENCE_TRACE, FENCE_WARN, and FENCE_ERR. This makes debugging easier.
          A CONFIG_DEBUG_FENCE will be added to turn off the FENCE_TRACE
          spam, and another runtime option can turn it off at runtime.
      v12:
          Add CONFIG_FENCE_TRACE. Add missing documentation for the fence->context
          and fence->seqno members.
      v13:
          Fixup CONFIG_FENCE_TRACE kconfig description.
          Move fence_context_alloc to fence.
          Simplify fence_later.
          Kill priv member to fence_cb.
      v14:
          Remove priv argument from fence_add_callback, oops!
      v15:
          Remove priv from documentation.
          Explicitly include linux/atomic.h.
      v16:
          Add trace events.
          Import changes required by android syncpoints.
      v17:
          Use wake_up_state instead of try_to_wake_up. (Colin Cross)
          Fix up commit description for seqno_fence. (Rob Clark)
      v18:
          Rename release_fence to fence_release.
          Move to drivers/dma-buf/.
          Rename __fence_is_signaled and __fence_signal to *_locked.
          Rename __fence_init to fence_init.
          Make fence_default_wait return a signed long, and fix wait ops too.
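
The v11 note about contexts and seqno says a waiter holding two fences from the same context only needs the later one. A minimal userspace sketch of that comparison follows, assuming seqnos are unsigned counters that may wrap around; `model_seq_fence` and `model_fence_later` are hypothetical names, and the code is an illustration rather than the kernel's helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

struct model_seq_fence {
	unsigned context;   /* which execution ring emitted the fence */
	unsigned seqno;     /* increases along that ring, may wrap */
	bool signaled;
};

/*
 * Return the later of two fences from the same context, or NULL when
 * that fence has already signaled (nothing is left to wait for).
 */
static struct model_seq_fence *model_fence_later(struct model_seq_fence *f1,
						 struct model_seq_fence *f2)
{
	/* Seqnos from different contexts are not comparable. */
	assert(f1->context == f2->context);

	/* Unsigned subtraction keeps the ordering valid across wraparound. */
	if (f2->seqno - f1->seqno <= (unsigned)INT_MAX)
		return f2->signaled ? NULL : f2;
	return f1->signaled ? NULL : f1;
}
```

Within one context fences complete in order, so waiting on the later fence also covers the earlier one.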
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
      Signed-off-by: Thierry Reding <thierry.reding@gmail.com> #use smp_mb__before_atomic()
      Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Reviewed-by: Rob Clark <robdclark@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Merge branch 'component-for-driver' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into work-next · 650f81d4
      Greg Kroah-Hartman authored
      Russell writes:
      
      Greg,
      
      Please incorporate a fix for the component helper.
      This fixes a bug reported by Sachin Kamat found with Exynos DRM.
    • Merge 3.16-rc4 into driver-core-next · 64c720ad
      Greg Kroah-Hartman authored
      We want the lz* fixes here to do more work with them.
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 06 Jul, 2014 4 commits
    • Linux 3.16-rc4 · cd3de83f
      Linus Torvalds authored
    • Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux · 100193f5
      Linus Torvalds authored
      Pull devicetree bugfix from Grant Likely:
       "Important bug fix for parsing 64-bit addresses on 32-bit platforms.
        Without this patch the kernel will try to use memory ranges that
        cannot be reached"
      
      * tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
        of: Check for phys_addr_t overflows in early_init_dt_add_memory_arch
    • Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 8addf0c7
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a set of 13 fixes, a MAINTAINERS update and a sparse update.
        The fixes are mostly correct value initialisations, avoiding NULL
        derefs and some uninitialised pointer avoidance.
      
        All the patches have been incubated in -next for a few days.  The
        final patch (use the scsi data buffer length to extract transfer size)
        has been rebased to add a cc to stable, but only the commit message
        has changed"
      
      * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        [SCSI] use the scsi data buffer length to extract transfer size
        virtio-scsi: fix various bad behavior on aborted requests
        virtio-scsi: avoid cancelling uninitialized work items
        ibmvscsi: Add memory barriers for send / receive
        ibmvscsi: Abort init sequence during error recovery
        qla2xxx: Fix sparse warning in qla_target.c.
        bnx2fc: Improve stats update mechanism
        bnx2fc: do not scan uninitialized lists in case of error.
        fc: ensure scan_work isn't active when freeing fc_rport
        pm8001: Fix potential null pointer dereference and memory leak.
        MAINTAINERS: Update LSILOGIC MPT FUSION DRIVERS (FC/SAS/SPI) maintainers Email IDs
        be2iscsi: remove potential junk pointer free
        be2iscsi: add an missing goto in error path
        scsi_error: set DID_TIME_OUT correctly
        scsi_error: fix invalid setting of host byte
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 110e4308
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "i915, tda998x and vmwgfx fixes,
      
        The main one is i915 fix for missing VGA connectors, along with some
        fixes for the tda998x from Russell fixing some modesetting problems.
      
        (still on holidays, but got a spare moment to find these)"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/vmwgfx: Fix incorrect write to read-only register v2:
        drm/i915: Drop early VLV WA to fix Voltage not getting dropped to Vmin
        drm/i915: only apply crt_present check on VLV
        drm/i915: Wait for vblank after enabling the primary plane on BDW
        drm/i2c: tda998x: add some basic mode validation
        drm/i2c: tda998x: faster polling for edid
        drm/i2c: tda998x: move drm_i2c_encoder_destroy call
  3. 05 Jul, 2014 12 commits
  4. 04 Jul, 2014 16 commits
  5. 03 Jul, 2014 4 commits
    • lz4: add overrun checks to lz4_uncompress_unknownoutputsize() · 4a3a9904
      Greg Kroah-Hartman authored
      Jan points out that I forgot to make the needed fixes to the
      lz4_uncompress_unknownoutputsize() function to mirror the changes done
      in lz4_decompress() with regards to potential pointer overflows.
      
      The only in-kernel user of this function is the zram code, which only
      takes data from a valid compressed buffer that it made itself, so it's
      not a big issue.  But due to external kernel modules using this
      function, it's better to be safe here.
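
To illustrate why pointer overflow matters in a decompressor: a check written as `ip + length > iend` can pass when a huge `length` makes the pointer arithmetic wrap. One general way to avoid this is to compare the length against the space that remains. The sketch below shows that technique in plain userspace C; it is not the code from this patch, and `model_checked_copy` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes from *ip to *op and advance both cursors, but only
 * if the input and the output each have at least `len` bytes left.
 * Returns false and leaves the cursors untouched otherwise.
 */
static bool model_checked_copy(const unsigned char **ip,
			       const unsigned char *iend,
			       unsigned char **op,
			       const unsigned char *oend,
			       size_t len)
{
	assert(*ip <= iend && *op <= oend);

	/* Compare lengths, never form ip + len or op + len before checking. */
	if (len > (size_t)(iend - *ip) || len > (size_t)(oend - *op))
		return false;

	memcpy(*op, *ip, len);
	*ip += len;
	*op += len;
	return true;
}
```

Because the subtraction `iend - *ip` is always within the buffer, the comparison stays valid even for lengths close to `SIZE_MAX`.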
      Reported-by: Jan Beulich <JBeulich@suse.com>
      Cc: "Don A. Bailey" <donb@securitymouse.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Merge branch 'component-for-driver' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into driver-core-next · 868b60e0
      Greg Kroah-Hartman authored
      
      Russell writes:
      
      These updates fix one bug in the component helper where the matched
      components are not properly cleaned up when the master fails to bind.
      I'll provide a version of this for stable trees if it's deemed that
      we need to backport it.
      
      The second patch causes the component helper to ignore duplicate
      matches when adding components - this is something that was originally
      needed for imx-drm, but since that has now been updated, we no longer
      need to skip over a component which has already been matched.
      
      The final patch starts the process of updating the component helper
      API to achieve two goals: to allow the API to be more efficient when
      deferred probing occurs, and to allow for future improvements to the
      component helper without having a major impact on the users.
      
      This represents groundwork for some other changes; once this has been
      merged, I will then send two further pull requests (one for the staging
      tree, and one for the DRM tree) to update the drivers to the new API.
      This will result in these three commits being shared with those trees.
    • [SCSI] use the scsi data buffer length to extract transfer size · 5616b0a4
      Martin K. Petersen authored
      Commit 8846bab1 introduced a helper that can be used to query the
      wire transfer size for a SCSI command taking protection information into
      account.
      
      However, some commands do not have a 1:1 mapping between the block range
      they work on and the payload size (discard, write same). After the
      scatterlist has been set up these requests use __data_len to store the
      number of bytes to report completion on. This means that callers of
      scsi_transfer_length() would get the wrong byte count for these types of
      requests.
      
      To overcome this we make scsi_transfer_length() use the scatterlist
      length in the scsi_data_buffer as basis for the wire transfer
      calculation instead of __data_len.
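
The calculation described above can be modelled in a few lines: the wire size is the scatterlist payload length plus, when protection information travels on the wire, 8 bytes for every sector of payload. This is a simplified userspace sketch, not the kernel helper; `model_transfer_length` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PI_BYTES 8u   /* protection information tuple per sector */

/*
 * sg_len:      payload bytes actually described by the scatterlist
 * sector_size: logical block size of the device in bytes
 * pi_on_wire:  true when protection information is sent with the data
 */
static unsigned model_transfer_length(unsigned sg_len, unsigned sector_size,
				      bool pi_on_wire)
{
	assert(sector_size > 0);
	if (!pi_on_wire)
		return sg_len;
	return sg_len + (sg_len / sector_size) * MODEL_PI_BYTES;
}
```

For a write same that covers eight 512-byte blocks but carries a single 512-byte payload, starting from the payload length gives 520 bytes on the wire, whereas starting from the 4096-byte block range would give 4160.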
      Reported-by: Christoph Hellwig <hch@infradead.org>
      Debugged-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
      Fixes: d77e6535
      Cc: stable@vger.kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>