1. 09 Aug, 2011 4 commits
    • Sarah Sharp's avatar
      xhci: Fix failed enqueue in the middle of isoch TD. · 522989a2
      Sarah Sharp authored
      When an isochronous transfer is enqueued, xhci_queue_isoc_tx_prepare()
      will ensure that there is enough room on the transfer rings for all of the
      isochronous TDs for that URB.  However, when xhci_queue_isoc_tx() is
      enqueueing individual isoc TDs, the prepare_transfer() function can fail
      if the endpoint state has changed to disabled, error, or some other
      unknown state.
      
      With the current code, if Nth TD (not the first TD) fails, the ring is
      left in a sorry state.  The partially enqueued TDs are left on the ring,
      and the first TRB of the TD is not given back to the hardware.  The
      enqueue pointer is left on the TRB after the last successfully enqueued
      TD.  This means the ring is basically useless.  Any new transfers will be
      enqueued after the failed TDs, which the hardware will never read because
      the cycle bit indicates it does not own them.  The ring will fill up with
      untransferred TDs, and the endpoint will be basically unusable.
      
      The untransferred TDs will also remain on the TD list.  Since the td_list
      is a FIFO, this basically means the ring handler will be waiting on TDs
      that will never be completed (or worse, dereference memory that doesn't
      exist any more).
      
      Change the code to clean up the isochronous ring after a failed transfer.
      If the first TD failed, simply return and allow the xhci_urb_enqueue
      function to free the urb_priv.  If the Nth TD failed, first remove the TDs
      from the td_list.  Then convert the TRBs that were enqueued into No-op
      TRBs.  Make sure to flip the cycle bit on all enqueued TRBs (including any
      link TRBs in the middle or between TDs), but leave the cycle bit of the
      first TRB (which will show software-owned) intact.  Then move the ring
      enqueue pointer back to the first TRB and make sure to change the
      xhci_ring's cycle state to what is appropriate for that ring segment.
      
      This ensures that the No-op TRBs will be overwritten by subsequent TDs,
      and the hardware will not start executing random TRBs because the cycle
      bit was left as hardware-owned.
      
      This bug is unlikely to be hit, but it was something I noticed while
      tracking down the watchdog timer issue.  I verified that the fix works by
      injecting some errors on the 250th isochronous URB queued, although I
      could not verify that the ring is in the correct state because uvcvideo
      refused to talk to the device after the first usb_submit_urb() failed.
      Ring debugging shows that the ring looks correct, however.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: default avatarSarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
      522989a2
    • Sarah Sharp's avatar
      xhci: Fix memory leak during failed enqueue. · d13565c1
      Sarah Sharp authored
      When the isochronous transfer support was introduced, and the xHCI driver
      switched to using urb->hcpriv to store an "urb_priv" pointer, a couple of
      memory leaks were introduced into the URB enqueue function in its error
      handling paths.
      
      xhci_urb_enqueue allocates urb_priv, but it doesn't free it if changing
      the control endpoint's max packet size fails or the bulk endpoint is in
      the middle of allocating or deallocating streams.
      
      xhci_urb_enqueue also doesn't free urb_priv if any of the four endpoint
      types' enqueue functions fail.  Instead, it expects those functions to
      free urb_priv if an error occurs.  However, the bulk, control, and
      interrupt enqueue functions do not free urb_priv if the endpoint ring is
      NULL.  It will, however, get freed if prepare_transfer() fails in those
      enqueue functions.
      
      Several of the error paths in the isochronous endpoint enqueue function
      also fail to free it.  xhci_queue_isoc_tx_prepare() doesn't free urb_priv
      if prepare_ring() indicates there is not enough room for all the
      isochronous TDs in this URB.  If individual isochronous TDs fail to be
      queued (perhaps due to an endpoint state change), urb_priv is also leaked.
      
      This argues that the freeing of urb_priv should be done in the function
      that allocated it, xhci_urb_enqueue.
      
      This patch looks rather ugly, but refactoring the code will have to wait
      because this patch needs to be backported to stable kernels.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: default avatarSarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
      d13565c1
    • Andiry Xu's avatar
      xHCI: report USB2 port in resuming as suspend · 8a8ff2f9
      Andiry Xu authored
      When a USB2 port initiate a remote wakeup, software shall ensure that
      resume is signaled for at least 20ms, and then write '0' to the PLS field.
      According to this, xhci driver do the following things:
      
      1. When receive a remote wakeup event in irq_handler, set the resume_done
         value as jiffies + 20ms, and modify rh_timer to poll root hub status at
         that time;
      2. When receive a GetPortStatus request, if the jiffies is after the
         resume_done value, clear the resume signal and resume_done.
      
      However, if usb_port_resume() is called before the rh_timer triggered, it
      will indicate the port as Suspend Cleared and skip the clear resume signal
      part. The device will fail the usb_get_status request in finish_port_resume(),
      and usbcore will try a reset-resume instead. Device will work OK after
      reset-resume, but resume_done value is not cleared in this case, and
      xhci_bus_suspend() will fail because when it finds a non-zero resume_done
      value, it will regard the port as resuming and return -EBUSY.
      
      This causes issue on some platforms that the system fail to suspend
      after remote wakeup from suspend by USB2 devices connected to xHCI port.
      
      To fix this issue, report the port status as suspend if the resume is
      signaling less that 20ms, and usb_port_resume() will wait 25ms and check
      port status again, so xHCI driver can clear the resume signaling and
      resume_done value.
      
      This should be backported to kernels as old as 2.6.37.
      Signed-off-by: default avatarAndiry Xu <andiry.xu@amd.com>
      Signed-off-by: default avatarSarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@kernel.org
      8a8ff2f9
    • Andiry Xu's avatar
      xHCI: fix port U3 status check condition · 5ac04bf1
      Andiry Xu authored
      Fix the port U3 status check when Clear PORT_SUSPEND Feature.
      The port status should be masked with PORT_PLS_MASK to check if it's in
      U3 state.
      
      This should be backported to kernels as old as 2.6.37.
      Signed-off-by: default avatarAndiry Xu <andiry.xu@amd.com>
      Signed-off-by: default avatarSarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@kernel.org
      5ac04bf1
  2. 08 Aug, 2011 19 commits
  3. 07 Aug, 2011 17 commits
    • Linus Torvalds's avatar
      9e233113
    • Rafael J. Wysocki's avatar
      sh: Fix boot crash related to SCI · fc97114b
      Rafael J. Wysocki authored
      Commit d006199e72a9 ("serial: sh-sci: Regtype probing doesn't need to be
      fatal.") made sci_init_single() return when sci_probe_regmap() succeeds,
      although it should return when sci_probe_regmap() fails.  This causes
      systems using the serial sh-sci driver to crash during boot.
      
      Fix the problem by using the right return condition.
      Signed-off-by: default avatarRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fc97114b
    • Linus Torvalds's avatar
      arm: remove stale export of 'sha_transform' · f23c126b
      Linus Torvalds authored
      The generic library code already exports the generic function, this was
      left-over from the ARM-specific version that just got removed.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f23c126b
    • Linus Torvalds's avatar
      arm: remove "optimized" SHA1 routines · 4d448714
      Linus Torvalds authored
      Since commit 1eb19a12 ("lib/sha1: use the git implementation of
      SHA-1"), the ARM SHA1 routines no longer work.  The reason? They
      depended on the larger 320-byte workspace, and now the sha1 workspace is
      just 16 words (64 bytes).  So the assembly version would overwrite the
      stack randomly.
      
      The optimized asm version is also probably slower than the new improved
      C version, so there's no reason to keep it around.  At least that was
      the case in git, where what appears to be the same assembly language
      version was removed two years ago because the optimized C BLK_SHA1 code
      was faster.
      Reported-and-tested-by: default avatarJoachim Eastwood <manabian@gmail.com>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Cc: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4d448714
    • Al Viro's avatar
      fix rcu annotations noise in cred.h · 32955148
      Al Viro authored
      task->cred is declared as __rcu, and access to other tasks' ->cred is,
      indeed, protected.  Access to current->cred does not need rcu_dereference()
      at all, since only the task itself can change its ->cred.  sparse, of
      course, has no way of knowing that...
      
      Add force-cast in current_cred(), make current_fsuid() et.al. use it.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      32955148
    • Linus Torvalds's avatar
      vfs: rename 'do_follow_link' to 'should_follow_link' · 7813b94a
      Linus Torvalds authored
      Al points out that the do_follow_link() helper function really is
      misnamed - it's about whether we should try to follow a symlink or not,
      not about actually doing the following.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7813b94a
    • Ari Savolainen's avatar
      Fix POSIX ACL permission check · 206b1d09
      Ari Savolainen authored
      After commit 3567866b: "RCUify freeing acls, let check_acl() go ahead in
      RCU mode if acl is cached" posix_acl_permission is being called with an
      unsupported flag and the permission check fails. This patch fixes the issue.
      Signed-off-by: default avatarAri Savolainen <ari.m.savolainen@gmail.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      206b1d09
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · c2f340a6
      Linus Torvalds authored
      * 'for-linus' of git://git.open-osd.org/linux-open-osd:
        ore: Make ore its own module
        exofs: Rename raid engine from exofs/ios.c => ore
        exofs: ios: Move to a per inode components & device-table
        exofs: Move exofs specific osd operations out of ios.c
        exofs: Add offset/length to exofs_get_io_state
        exofs: Fix truncate for the raid-groups case
        exofs: Small cleanup of exofs_fill_super
        exofs: BUG: Avoid sbi realloc
        exofs: Remove pnfs-osd private definitions
        nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
      c2f340a6
    • Linus Torvalds's avatar
      vfs: optimize inode cache access patterns · 3ddcd056
      Linus Torvalds authored
      The inode structure layout is largely random, and some of the vfs paths
      really do care.  The path lookup in particular is already quite D$
      intensive, and profiles show that accessing the 'inode->i_op->xyz'
      fields is quite costly.
      
      We already optimized the dcache to not unnecessarily load the d_op
      structure for members that are often NULL using the DCACHE_OP_xyz bits
      in dentry->d_flags, and this does something very similar for the inode
      ops that are used during pathname lookup.
      
      It also re-orders the fields so that the fields accessed by 'stat' are
      together at the beginning of the inode structure, and roughly in the
      order accessed.
      
      The effect of this seems to be in the 1-2% range for an empty kernel
      "make -j" run (which is fairly kernel-intensive, mostly in filename
      lookup), so it's visible.  The numbers are fairly noisy, though, and
      likely depend a lot on exact microarchitecture.  So there's more tuning
      to be done.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3ddcd056
    • Linus Torvalds's avatar
      vfs: renumber DCACHE_xyz flags, remove some stale ones · 830c0f0e
      Linus Torvalds authored
      Gcc tends to generate better code with small integers, including the
      DCACHE_xyz flag tests - so move the common ones to be first in the list.
      Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
      DCACHE_AUTOFS_PENDING values, their users no longer exists in the source
      tree.
      
      And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
      common case to be a nice straight-line fall-through.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      830c0f0e
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7cd4767e
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: Compute protocol sequence numbers and fragment IDs using MD5.
        crypto: Move md5_transform to lib/md5.c
      7cd4767e
    • Boaz Harrosh's avatar
      ore: Make ore its own module · cf283ade
      Boaz Harrosh authored
      Export everything from ore need exporting. Change Kbuild and Kconfig
      to build ore.ko as an independent module. Import ore from exofs
      Signed-off-by: default avatarBoaz Harrosh <bharrosh@panasas.com>
      cf283ade
    • Boaz Harrosh's avatar
      exofs: Rename raid engine from exofs/ios.c => ore · 8ff660ab
      Boaz Harrosh authored
      ORE stands for "Objects Raid Engine"
      
      This patch is a mechanical rename of everything that was in ios.c
      and its API declaration to an ore.c and an osd_ore.h header. The ore
      engine will later be used by the pnfs objects layout driver.
      
      * File ios.c => ore.c
      
      * Declaration of types and API are moved from exofs.h to a new
        osd_ore.h
      
      * All used types are prefixed by ore_ from their exofs_ name.
      
      * Shift includes from exofs.h to osd_ore.h so osd_ore.h is
        independent, include it from exofs.h.
      
      Other than a pure rename there are no other changes. Next patch
      will move the ore into it's own module and will export the API
      to be used by exofs and later the layout driver
      Signed-off-by: default avatarBoaz Harrosh <bharrosh@panasas.com>
      8ff660ab
    • Boaz Harrosh's avatar
      exofs: ios: Move to a per inode components & device-table · 9e9db456
      Boaz Harrosh authored
      Exofs raid engine was saving on memory space by having a single layout-info,
      single pid, and a single device-table, global to the filesystem. Then passing
      a credential and object_id info at the io_state level, private for each
      inode. It would also devise this contraption of rotating the device table
      view for each inode->ino to spread out the device usage.
      
      This is not compatible with the pnfs-objects standard, demanding that
      each inode can have it's own layout-info, device-table, and each object
      component it's own pid, oid and creds.
      
      So: Bring exofs raid engine to be usable for generic pnfs-objects use by:
      
      * Define an exofs_comp structure that holds obj_id and credential info.
      
      * Break up exofs_layout struct to an exofs_components structure that holds a
        possible array of exofs_comp and the array of devices + the size of the
        arrays.
      
      * Add a "comps" parameter to get_io_state() that specifies the ids creds
        and device array to use for each IO.
      
        This enables to keep the layout global, but the device-table view, creds
        and IDs at the inode level. It only adds two 64bit to each inode, since
        some of these members already existed in another form.
      
      * ios raid engine now access layout-info and comps-info through the passed
        pointers. Everything is pre-prepared by caller for generic access of
        these structures and arrays.
      
      At the exofs Level:
      
      * Super block holds an exofs_components struct that holds the device
        array, previously in layout. The devices there are in device-table
        order. The device-array is twice bigger and repeats the device-table
        twice so now each inode's device array can point to a random device
        and have a round-robin view of the table, making it compatible to
        previous exofs versions.
      
      * Each inode has an exofs_components struct that is initialized at
        load time, with it's own view of the device table IDs and creds.
        When doing IO this gets passed to the io_state together with the
        layout.
      
      While preforming this change. Bugs where found where credentials with the
      wrong IDs where used to access the different SB objects (super.c). As well
      as some dead code. It was never noticed because the target we use does not
      check the credentials.
      Signed-off-by: default avatarBoaz Harrosh <bharrosh@panasas.com>
      9e9db456
    • Boaz Harrosh's avatar
      exofs: Move exofs specific osd operations out of ios.c · 85e44df4
      Boaz Harrosh authored
      ios.c will be moving to an external library, for use by the
      objects-layout-driver. Remove from it some exofs specific functions.
      
      Also g_attr_logical_length is used both by inode.c and ios.c
      move definition to the later, to keep it independent
      Signed-off-by: default avatarBoaz Harrosh <bharrosh@panasas.com>
      85e44df4
    • Boaz Harrosh's avatar
      exofs: Add offset/length to exofs_get_io_state · e1042ba0
      Boaz Harrosh authored
      In future raid code we will need to know the IO offset/length
      and if it's a read or write to determine some of the array
      sizes we'll need.
      
      So add a new exofs_get_rw_state() API for use when
      writeing/reading. All other simple cases are left using the
      old way.
      
      The major change to this is that now we need to call
      exofs_get_io_state later at inode.c::read_exec and
      inode.c::write_exec when we actually know these things. So this
      patch is kept separate so I can test things apart from other
      changes.
      Signed-off-by: default avatarBoaz Harrosh <bharrosh@panasas.com>
      e1042ba0
    • David S. Miller's avatar
      net: Compute protocol sequence numbers and fragment IDs using MD5. · 6e5714ea
      David S. Miller authored
      Computers have become a lot faster since we compromised on the
      partial MD4 hash which we use currently for performance reasons.
      
      MD5 is a much safer choice, and is inline with both RFC1948 and
      other ISS generators (OpenBSD, Solaris, etc.)
      
      Furthermore, only having 24-bits of the sequence number be truly
      unpredictable is a very serious limitation.  So the periodic
      regeneration and 8-bit counter have been removed.  We compute and
      use a full 32-bit sequence number.
      
      For ipv6, DCCP was found to use a 32-bit truncated initial sequence
      number (it needs 43-bits) and that is fixed here as well.
      Reported-by: default avatarDan Kaminsky <dan@doxpara.com>
      Tested-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6e5714ea