1. 22 Aug, 2013 2 commits
    • Martin Peschke's avatar
      [SCSI] zfcp: fix schedule-inside-lock in scsi_device list loops · 924dd584
      Martin Peschke authored
      BUG: sleeping function called from invalid context at kernel/workqueue.c:2752
      in_atomic(): 1, irqs_disabled(): 1, pid: 360, name: zfcperp0.0.1700
      CPU: 1 Not tainted 3.9.3+ #69
      Process zfcperp0.0.1700 (pid: 360, task: 0000000075b7e080, ksp: 000000007476bc30)
      <snip>
      Call Trace:
      ([<00000000001165de>] show_trace+0x106/0x154)
       [<00000000001166a0>] show_stack+0x74/0xf4
       [<00000000006ff646>] dump_stack+0xc6/0xd4
       [<000000000017f3a0>] __might_sleep+0x128/0x148
       [<000000000015ece8>] flush_work+0x54/0x1f8
       [<00000000001630de>] __cancel_work_timer+0xc6/0x128
       [<00000000005067ac>] scsi_device_dev_release_usercontext+0x164/0x23c
       [<0000000000161816>] execute_in_process_context+0x96/0xa8
       [<00000000004d33d8>] device_release+0x60/0xc0
       [<000000000048af48>] kobject_release+0xa8/0x1c4
       [<00000000004f4bf2>] __scsi_iterate_devices+0xfa/0x130
       [<000003ff801b307a>] zfcp_erp_strategy+0x4da/0x1014 [zfcp]
       [<000003ff801b3caa>] zfcp_erp_thread+0xf6/0x2b0 [zfcp]
       [<000000000016b75a>] kthread+0xf2/0xfc
       [<000000000070c9de>] kernel_thread_starter+0x6/0xc
       [<000000000070c9d8>] kernel_thread_starter+0x0/0xc
      
      Apparently, the ref_count for some scsi_device drops down to zero,
      triggering device removal through execute_in_process_context(), while
      the lldd error recovery thread iterates through a scsi device list.
      Unfortunately, execute_in_process_context() decides to immediately
      execute that device removal function, instead of scheduling asynchronous
      execution, since it detects process context and thinks it is safe to do
      so. But almost all calls to shost_for_each_device() in our lldd are
      inside spin_lock_irq, even in thread context. Obviously, schedule()
      inside spin_lock_irq sections is a bad idea.
      
      Change the lldd to use the proper iterator function,
      __shost_for_each_device(), in combination with required locking.
      
      Occurences that need to be changed include all calls in zfcp_erp.c,
      since those might be executed in zfcp error recovery thread context
      with a lock held.
      
      Other occurences of shost_for_each_device() in zfcp_fsf.c do not
      need to be changed (no process context, no surrounding locking).
      
      The problem was introduced in Linux 2.6.37 by commit
      b62a8d9b
      "[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit".
      Reported-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarMartin Peschke <mpeschke@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org #2.6.37+
      Signed-off-by: default avatarSteffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      924dd584
    • Martin Peschke's avatar
      [SCSI] zfcp: fix lock imbalance by reworking request queue locking · d79ff142
      Martin Peschke authored
      This patch adds wait_event_interruptible_lock_irq_timeout(), which is a
      straight-forward descendant of wait_event_interruptible_timeout() and
      wait_event_interruptible_lock_irq().
      
      The zfcp driver used to call wait_event_interruptible_timeout()
      in combination with some intricate and error-prone locking. Using
      wait_event_interruptible_lock_irq_timeout() as a replacement
      nicely cleans up that locking.
      
      This rework removes a situation that resulted in a locking imbalance
      in zfcp_qdio_sbal_get():
      
      BUG: workqueue leaked lock or atomic: events/1/0xffffff00/10
          last function: zfcp_fc_wka_port_offline+0x0/0xa0 [zfcp]
      
      It was introduced by commit c2af7545
      "[SCSI] zfcp: Do not wait for SBALs on stopped queue", which had a new
      code path related to ZFCP_STATUS_ADAPTER_QDIOUP that took an early exit
      without a required lock being held. The problem occured when a
      special, non-SCSI I/O request was being submitted in process context,
      when the adapter's queues had been torn down. In this case the bug
      surfaced when the Fibre Channel port connection for a well-known address
      was closed during a concurrent adapter shut-down procedure, which is a
      rare constellation.
      
      This patch also fixes these warnings from the sparse tool (make C=1):
      
      drivers/s390/scsi/zfcp_qdio.c:224:12: warning: context imbalance in
       'zfcp_qdio_sbal_check' - wrong count at exit
      drivers/s390/scsi/zfcp_qdio.c:244:5: warning: context imbalance in
       'zfcp_qdio_sbal_get' - unexpected unlock
      
      Last but not least, we get rid of that crappy lock-unlock-lock
      sequence at the beginning of the critical section.
      
      It is okay to call zfcp_erp_adapter_reopen() with req_q_lock held.
      Reported-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Reported-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Peschke <mpeschke@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org #2.6.35+
      Signed-off-by: default avatarSteffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      d79ff142
  2. 21 Aug, 2013 2 commits
    • Roland Dreier's avatar
      [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal · 35dc2483
      Roland Dreier authored
      There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
      leads to one process writing data into the address space of some other
      random unrelated process if the ioctl is interrupted by a signal.
      What happens is the following:
      
       - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
         underlying SCSI command will transfer data from the SCSI device to
         the buffer provided in the ioctl)
      
       - Before the command finishes, a signal is sent to the process waiting
         in the ioctl.  This will end up waking up the sg_ioctl() code:
      
      		result = wait_event_interruptible(sfp->read_wait,
      			(srp_done(sfp, srp) || sdp->detached));
      
         but neither srp_done() nor sdp->detached is true, so we end up just
         setting srp->orphan and returning to userspace:
      
      		srp->orphan = 1;
      		write_unlock_irq(&sfp->rq_list_lock);
      		return result;	/* -ERESTARTSYS because signal hit process */
      
         At this point the original process is done with the ioctl and
         blithely goes ahead handling the signal, reissuing the ioctl, etc.
      
       - Eventually, the SCSI command issued by the first ioctl finishes and
         ends up in sg_rq_end_io().  At the end of that function, we run through:
      
      	write_lock_irqsave(&sfp->rq_list_lock, iflags);
      	if (unlikely(srp->orphan)) {
      		if (sfp->keep_orphan)
      			srp->sg_io_owned = 0;
      		else
      			done = 0;
      	}
      	srp->done = done;
      	write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
      
      	if (likely(done)) {
      		/* Now wake up any sg_read() that is waiting for this
      		 * packet.
      		 */
      		wake_up_interruptible(&sfp->read_wait);
      		kill_fasync(&sfp->async_qp, SIGPOLL, POLL_IN);
      		kref_put(&sfp->f_ref, sg_remove_sfp);
      	} else {
      		INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext);
      		schedule_work(&srp->ew.work);
      	}
      
         Since srp->orphan *is* set, we set done to 0 (assuming the
         userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN
         ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext()
         to run in a workqueue.
      
       - In workqueue context we go through sg_rq_end_io_usercontext() ->
         sg_finish_rem_req() -> blk_rq_unmap_user() -> ... ->
         bio_uncopy_user() -> __bio_copy_iov() -> copy_to_user().
      
         The key point here is that we are doing copy_to_user() on a
         workqueue -- that is, we're on a kernel thread with current->mm
         equal to whatever random previous user process was scheduled before
         this kernel thread.  So we end up copying whatever data the SCSI
         command returned to the virtual address of the buffer passed into
         the original ioctl, but it's quite likely we do this copying into a
         different address space!
      
      As suggested by James Bottomley <James.Bottomley@hansenpartnership.com>,
      add a check for current->mm (which is NULL if we're on a kernel thread
      without a real userspace address space) in bio_uncopy_user(), and skip
      the copy if we're on a kernel thread.
      
      There's no reason that I can think of for any caller of bio_uncopy_user()
      to want to do copying on a kernel thread with a random active userspace
      address space.
      
      Huge thanks to Costa Sapuntzakis <costa@purestorage.com> for the
      original pointer to this bug in the sg code.
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Tested-by: default avatarDavid Milburn <dmilburn@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      35dc2483
    • Anton Blanchard's avatar
      [SCSI] lpfc: Don't force CONFIG_GENERIC_CSUM on · f5944daa
      Anton Blanchard authored
      We want ppc64 to be able to select between optimised assembly
      checksum routines in big endian and the generic lib/checksum.c
      routines in little endian.
      
      The lpfc driver is forcing CONFIG_GENERIC_CSUM on which means
      we are unable to make the decision to enable it in the arch
      Kconfig. If the option exists it is always forced on.
      
      This got introduced in 3.10 via commit 6a7252fd ([SCSI] lpfc:
      fix up Kconfig dependencies). I spoke to Randy about it and
      the original issue was with CRC_T10DIF not being defined.
      
      As such, remove the select of CONFIG_GENERIC_CSUM.
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      Cc: <stable@vger.kernel.org> # 3.10
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      f5944daa
  3. 12 Aug, 2013 5 commits
  4. 11 Aug, 2013 4 commits
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · e5d081f4
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is three bug fixes: An fnic warning caused by sleeping under a
        lock, a major regression with our updated WRITE SAME/UNMAP logic which
        caused tons of USB devices (and one RAID card) to cease to function
        and a megaraid_sas firmware initialisation problem which causes kdump
        failures"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        [SCSI] Don't attempt to send extended INQUIRY command if skip_vpd_pages is set
        [SCSI] fnic: BUG: sleeping function called from invalid context during probe
        [SCSI] megaraid_sas: megaraid_sas driver init fails in kdump kernel
      e5d081f4
    • Linus Torvalds's avatar
      Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 77f63b4d
      Linus Torvalds authored
      Pull powerpc fixes from Ben Herrenschmidt:
       "This includes small series from Michael Neuling to fix a couple of
        nasty remaining problems with the new Power8 support, also targeted at
        stable 3.10, without which some new userspace accessible registers
        aren't properly context switched, and in some case, can be clobbered
        by the user of transactional memory.
      
        Along with that, a few slightly more minor things, such as a missing
        Kconfig option to enable handling of denorm exceptions when not
        running under a hypervisor (or userspace will randomly crash when
        hitting denorms with the vector unit), some nasty bugs in the new
        pstore oops code, and other simple bug fixes worth having in now.
      
        Note: I picked up the two powerpc KVM fixes as Alex Graf asked me to
        handle KVM bits while he is on vacation.  However I'll let him decide
        whether they should go to -stable or not when he is back"
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs
        powerpc: Save the TAR register earlier
        powerpc: Fix context switch DSCR on POWER8
        powerpc: Rework setting up H/FSCR bit definitions
        powerpc: Fix hypervisor facility unavaliable vector number
        powerpc/kvm/book3s_pr: Return appropriate error when allocation fails
        powerpc/kvm: Add signed type cast for comparation
        powerpc/eeh: Add missing procfs entry for PowerNV
        powerpc/pseries: Add backward compatibilty to read old kernel oops-log
        powerpc/pseries: Fix buffer overflow when reading from pstore
        powerpc: On POWERNV enable PPC_DENORMALISATION by default
      77f63b4d
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 30b229bd
      Linus Torvalds authored
      Pull s390 kvm fixes from Paolo Bonzini:
       "Two fixes for s390"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: fix pfmf non-quiescing control handling
        KVM: s390: move kvm_guest_enter,exit closer to sie
      30b229bd
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 9e6bdaaa
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Some driver bugfixes for the I2C subsystem"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mv64xxx: Document the newly introduced allwinner compatible
        i2c: Fix Kontron PLD prescaler calculation
        i2c: i2c-mxs: Use DMA mode even for small transfers
      9e6bdaaa
  5. 10 Aug, 2013 6 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · d92581fc
      Linus Torvalds authored
      Pull btrfs fixes from Chris Mason:
       "These are assorted fixes, mostly from Josef nailing down xfstests
        runs.  Zach also has a long standing fix for problems with readdir
        wrapping f_pos (or ctx->pos)
      
        These patches were spread out over different bases, so I rebased
        things on top of rc4 and retested overnight"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        btrfs: don't loop on large offsets in readdir
        Btrfs: check to see if root_list is empty before adding it to dead roots
        Btrfs: release both paths before logging dir/changed extents
        Btrfs: allow splitting of hole em's when dropping extent cache
        Btrfs: make sure the backref walker catches all refs to our extent
        Btrfs: fix backref walking when we hit a compressed extent
        Btrfs: do not offset physical if we're compressed
        Btrfs: fix extent buffer leak after backref walking
        Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents
        btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified
      d92581fc
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · b8ea0d06
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Stable patch for lockd to fix Oopses due to inappropriate calls to
         utsname()->nodename
      
       - Stable patches for sunrpc to fix Oopses on shutdown when using
         AF_LOCAL sockets with rpcbind
      
       - Fix memory leak and error checking issues in nfs4_proc_lookup_mountpoint
      
       - Fix a regression with the sync mount option failing to work for nfs4
         mounts
      
       - Fix a writeback performance issue when doing cache invalidation
      
       - Remove an incorrect call to nfs_setsecurity in nfs_fhget
      
      * tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFSv4: Fix up nfs4_proc_lookup_mountpoint
        NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget()
        NFSv4: Fix the sync mount option for nfs4 mounts
        NFS: Fix writeback performance issue on cache invalidation
        SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister
        SUNRPC: Don't auto-disconnect from the local rpcbind socket
        LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs
      b8ea0d06
    • Linus Torvalds's avatar
      Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux · 022e5d09
      Linus Torvalds authored
      Pull nfsd fixes from Bruce Fields:
       "Some fixes for a 4.1 feature that in retrospect probably should have
        waited for 3.12....  But it appears to be working now"
      
      * 'for-3.11' of git://linux-nfs.org/~bfields/linux:
        nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID
        nfsd4: Fix MACH_CRED NULL dereference
      022e5d09
    • Linus Torvalds's avatar
      Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 1e24f76e
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A couple of USB-audio fixes that should also go to stable kernels"
      
      * tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: usb-audio: do not trust too-big wMaxPacketSize values
        ALSA: 6fire: fix DMA issues with URB transfer_buffer usage
      1e24f76e
    • Linus Torvalds's avatar
      Merge tag 'staging-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 8ae3f1d0
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are 3 small fixes for staging/IIO drivers for 3.11-rc5.  Nothing
        huge, two IIO driver fixes, and a zcache fix.  All of these have been
        in linux-next for a while"
      
      * tag 'staging-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: zcache: fix "zcache=" kernel parameter
        iio: ti_am335x_adc: Fix wrong samples received on 1st read
        iio:trigger: Fix use_count race condition
      8ae3f1d0
    • Linus Torvalds's avatar
      Merge tag 'usb-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · e6e8ac44
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are 3 small USB fixes for 3.11-rc5.
      
        One is a fix that the ChromeOS developers ran into on some Intel
        hardware, one is a build fix, and the last is a MAINTAINERS update to
        help people figure out where to send USB network driver patches.
      
        All of these have been in linux-next for a while"
      
      * tag 'usb-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        MAINTAINERS: Add separate section for USB NETWORKING DRIVERS
        usb: xhci: add missing dma-mapping.h includes
        usb: core: don't try to reset_device() a port that got just disconnected
      e6e8ac44
  6. 09 Aug, 2013 21 commits
    • Zach Brown's avatar
      btrfs: don't loop on large offsets in readdir · db62efbb
      Zach Brown authored
      When btrfs readdir() hits the last entry it sets the readdir offset to a
      huge value to stop buggy apps from breaking when the same name is
      returned by readdir() with concurrent rename()s.
      
      But unconditionally setting the offset to INT_MAX causes readdir() to
      loop returning any entries with offsets past INT_MAX.  It only takes a
      few hours of constant file creation and removal to create entries past
      INT_MAX.
      
      So let's set the huge offset to LLONG_MAX if the last entry has already
      overflowed 32bit loff_t.   Without large offsets behaviour is identical.
      With large offsets 64bit apps will work and 32bit apps will be no more
      broken than they currently are if they see large offsets.
      Signed-off-by: default avatarZach Brown <zab@redhat.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      db62efbb
    • Josef Bacik's avatar
      Btrfs: check to see if root_list is empty before adding it to dead roots · cfad392b
      Josef Bacik authored
      A user reported a panic when running with autodefrag and deleting snapshots.
      This is because we could end up trying to add the root to the dead roots list
      twice.  To fix this check to see if we are empty before adding ourselves to the
      dead roots list.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      cfad392b
    • Josef Bacik's avatar
      Btrfs: release both paths before logging dir/changed extents · f3b15ccd
      Josef Bacik authored
      The ceph guys tripped over this bug where we were still holding onto the
      original path that we used to copy the inode with when logging.  This is based
      on Chris's fix which was reported to fix the problem.  We need to drop the paths
      in two cases anyway so just move the drop up so that we don't have duplicate
      code.  Thanks,
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      f3b15ccd
    • Josef Bacik's avatar
      Btrfs: allow splitting of hole em's when dropping extent cache · ee20a983
      Josef Bacik authored
      I noticed while running multi-threaded fsync tests that sometimes fsck would
      complain about an improper gap.  This happens because we fail to add a hole
      extent to the file, which was happening when we'd split a hole EM because
      btrfs_drop_extent_cache was just discarding the whole em instead of splitting
      it.  So this patch fixes this by allowing us to split a hole em properly, which
      means that added holes actually get logged properly and we no longer see this
      fsck error.  Thankfully we're tolerant of these sort of problems so a user would
      not see any adverse effects of this bug, other than fsck complaining.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      ee20a983
    • Josef Bacik's avatar
      Btrfs: make sure the backref walker catches all refs to our extent · ed8c4913
      Josef Bacik authored
      Because we don't mess with the offset into the extent for compressed we will
      properly find both extents for this case
      
      [extent a][extent b][rest of extent a]
      
      but because we already added a ref for the front half we won't add the inode
      information for the second half.  This causes us to leak that memory and not
      print out the other offset when we do logical-resolve.  So fix this by calling
      ulist_add_merge and then add our eie to the existing entry if there is one.
      With this patch we get both offsets out of logical-resolve.  With this and the
      other 2 patches I've sent we now pass btrfs/276 on my vm with compress-force=lzo
      set.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      ed8c4913
    • Josef Bacik's avatar
      Btrfs: fix backref walking when we hit a compressed extent · 8ca15e05
      Josef Bacik authored
      If you do btrfs inspect-internal logical-resolve on a compressed extent that has
      been partly overwritten it won't find anything.  This is because we try and
      match the extent offset we've searched for based on the extent offset in the
      data extent entry.  However this doesn't work for compressed extents because the
      offsets are for the uncompressed size, not the compressed size.  So instead only
      do this check if we are not compressed, that way we can get an actual entry for
      the physical offset rather than nothing for compressed.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      8ca15e05
    • Josef Bacik's avatar
      Btrfs: do not offset physical if we're compressed · b76bb701
      Josef Bacik authored
      xfstest btrfs/276 was freaking out on slower boxes partly because fiemap was
      offsetting the physical based on the extent offset.  This is perfectly fine with
      uncompressed extents, however the extent offset is into the uncompressed area,
      not the compressed.  So we can return a physical value that isn't at all within
      the area we have allocated on disk.  Fix this by returning the start of the
      extent if it is compressed no matter what the offset.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      b76bb701
    • Liu Bo's avatar
      Btrfs: fix extent buffer leak after backref walking · b5b9b5b3
      Liu Bo authored
      commit 47fb091f(Btrfs: fix unlock after free on rewinded tree blocks)
      takes an extra increment on the reference of allocated dummy extent buffer, so now we
      cannot free this dummy one, and end up with extent buffer leak.
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: default avatarJan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      b5b9b5b3
    • Liu Bo's avatar
      Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents · e68afa49
      Liu Bo authored
      For partial extents, snapshot-aware defrag does not work as expected,
      since
      a) we use the wrong logical offset to search for parents, which should be
         disk_bytenr + extent_offset, not just disk_bytenr,
      b) 'offset' returned by the backref walking just refers to key.offset, not
         the 'offset' stored in btrfs_extent_data_ref which is
         (key.offset - extent_offset).
      
      The reproducer:
      $ mkfs.btrfs sda
      $ mount sda /mnt
      $ btrfs sub create /mnt/sub
      $ for i in `seq 5 -1 1`; do dd if=/dev/zero of=/mnt/sub/foo bs=5k count=1 seek=$i conv=notrunc oflag=sync; done
      $ btrfs sub snap /mnt/sub /mnt/snap1
      $ btrfs sub snap /mnt/sub /mnt/snap2
      $ sync; btrfs filesystem defrag /mnt/sub/foo;
      $ umount /mnt
      $ btrfs-debug-tree sda (Here we can check whether the defrag operation is snapshot-awared.
      
      This addresses the above two problems.
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      e68afa49
    • Jie Liu's avatar
      btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified · 7cddc193
      Jie Liu authored
      Create a small file and fallocate it to a big size with
      FALLOC_FL_KEEP_SIZE option, then truncate it back to the
      small size again, the disk free space is not changed back
      in this case. i.e,
      
      total 4
      -rw-r--r-- 1 root root 512 Jun 28 11:35 test
      
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G   56K  7.2G   1% /mnt
      
      -rw-r--r-- 1 root root 512 Jun 28 11:35 /mnt/test
      
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G  5.1G  2.2G  70% /mnt
      
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G  5.1G  2.2G  70% /mnt
      
      With this fix, the truncated up space is back as:
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G   56K  7.2G   1% /mnt
      Signed-off-by: default avatarJie Liu <jeff.liu@oracle.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: default avatarChris Mason <chris.mason@fusionio.com>
      7cddc193
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 14e94194
      Linus Torvalds authored
      Pull ACPI and power management fixes from Rafael Wysocki:
      
       - ACPI-based memory hotplug stopped working after a recent change,
         because it's not possible to associate sufficiently many "physical"
         devices with one ACPI device object due to an artificial limit.  Fix
         from Rafael J Wysocki removes that limit and makes memory hotplug
         work again.
      
       - A change made in 3.9 uncovered a bug in the ACPI processor driver
         preventing NUMA nodes from being put offline due to an ordering
         issue.  Fix from Yasuaki Ishimatsu changes the ordering to make
         things work again.
      
       - One of the recent ACPI video commits (that hasn't been reverted so
         far) uncovered a bug in the code handling quirky BIOSes that caused
         some Asus machines to boot with backlight completely off which made
         it quite difficult to use them afterward.  Fix from Felipe Contreras
         improves the quirk to cover this particular case correctly.
      
       - A cpufreq user space interface change made in 3.10 inadvertently
         renamed the ignore_nice_load sysfs attribute to ignore_nice which
         resulted in some confusion.  Fix from Viresh Kumar changes the name
         back to ignore_nice_load.
      
       - An initialization ordering change made in 3.9 broke cpufreq on
         loongson2 boards.  Fix from Aaro Koskinen restores the correct
         initialization ordering there.
      
       - Fix breakage resulting from a mistake made in 3.9 and causing the
         detection of some graphics adapters (that were detected correctly
         before) to fail.  There are two objects representing the same PCIe
         port in the affected systems' ACPI tables and both appear as
         "enabled" and we are expected to guess which one to use.  We used to
         choose the right one before by pure luck, but when we tried to
         address another similar corner case, the luck went away.  This time
         we try to make our guessing a bit more educated which is reported to
         work on those systems.
      
       - The /proc/acpi/wakeup interface code is missing some locking which
         may lead to breakage if that file is written or read during hotplug
         of wakeup devices.  That should be rare but still possible, so it's
         better to start using the appropriate locking there.
      
      * tag 'pm+acpi-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: Try harder to resolve _ADR collisions for bridges
        cpufreq: rename ignore_nice as ignore_nice_load
        cpufreq: loongson2: fix regression related to clock management
        ACPI / processor: move try_offline_node() after acpi_unmap_lsapic()
        ACPI: Drop physical_node_id_bitmap from struct acpi_device
        ACPI / PM: Walk physical_node_list under physical_node_lock
        ACPI / video: improve quirk check in acpi_video_bqc_quirk()
      14e94194
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · fdafa7cf
      Linus Torvalds authored
      Pull hwmon fix from Guenter Roeck:
       "Fix bug in adt7470 driver which causes it to fail writing fan speed
        limits"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (adt7470) Fix incorrect return code check
      fdafa7cf
    • Linus Torvalds's avatar
      Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 79a6fb1a
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "Some driver fixes (em28xx, coda, usbtv, s5p, hdpvr and ml86v7667) and
        a fix for media DocBook"
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] em28xx: fix assignment of the eeprom data
        [media] hdpvr: fix iteration over uninitialized lists in hdpvr_probe()
        [media] usbtv: fix dependency
        [media] usbtv: Throw corrupted frames away
        [media] usbtv: Fix deinterlacing
        [media] v4l2: added missing mutex.h include to v4l2-ctrls.h
        [media] DocBook: upgrade media_api DocBook version to 4.2
        [media] ml86v7667: fix compile warning: 'ret' set but not used
        [media] s5p-g2d: Fix registration failure
        [media] media: coda: Fix DT driver data pointer for i.MX27
        [media] s5p-mfc: Fix input/output format reporting
      79a6fb1a
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 58c59bc9
      Linus Torvalds authored
      Pull HID fix from Jiri Kosina:
       "Revert of a patch which breaks enumeration workaround in
        hid-logitech-dj"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        Revert "HID: hid-logitech-dj: querying_devices was never set"
      58c59bc9
    • Linus Torvalds's avatar
      Merge tag 'fbdev-fixes-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux · 78ebf0e3
      Linus Torvalds authored
      Pull fbdev fixes from Tomi Valkeinen:
       - omapdss: compilation fix and DVI fix for PandaBoard
       - mxsfb: fix colors when using 18bit LCD bus
      
      * tag 'fbdev-fixes-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
        ARM: OMAP: dss-common: fix Panda's DVI DDC channel
        video: mxsfb: fix color settings for 18bit data bus and 32bpp
        OMAPDSS: analog-tv-connector: compile fix
      78ebf0e3
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 6a933166
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Mostly radeon, more fixes for dynamic power management which is is off
        by default for this release anyways, but there are a large number of
        testers, so I'd like to keep merging the fixes.
      
        Otherwise, radeon UVD fixes affecting suspend/resume regressions, i915
        regression fixes, one for your mac mini, ast, mgag200, cirrus ttm fix
        and one regression fix in the core"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (25 commits)
        drm: Don't pass negative delta to ktime_sub_ns()
        drm/radeon: make missing smc ucode non-fatal
        drm/radeon/dpm: require rlc for dpm
        drm/radeon/cik: use a mutex to properly lock srbm instanced registers
        drm/radeon: remove unnecessary unpin
        drm/radeon: add more UVD CS checking
        drm/radeon: stop sending invalid UVD destroy msg
        drm/radeon: only save UVD bo when we have open handles
        drm/radeon: always program the MC on startup
        drm/radeon: fix audio dto calculation on DCE3+ (v3)
        drm/radeon/dpm: disable sclk ss on rv6xx
        drm/radeon: fix halting UVD
        drm/radeon/dpm: adjust power state properly for UVD on SI
        drm/radeon/dpm: fix spread spectrum setup (v2)
        drm/radeon/dpm: adjust thermal protection requirements
        drm/radeon: select audio dto based on encoder id for DCE3
        drm/radeon: properly handle pm on gpu reset
        drm/i915: do not disable backlight on vgaswitcheroo switch off
        drm/i915: Don't call encoder's get_config unless encoder is active
        drm/i915: avoid brightness overflow when doing scale
        ...
      6a933166
    • Oleg Nesterov's avatar
      dlm: kill the unnecessary and wrong device_close()->recalc_sigpending() · 201d3dfa
      Oleg Nesterov authored
      device_close()->recalc_sigpending() is not needed, sigprocmask() takes
      care of TIF_SIGPENDING correctly.
      
      And without ->siglock it is racy and wrong, it can wrongly clear
      TIF_SIGPENDING and miss a signal.
      
      But even with this patch device_close() is still buggy:
      
        1. sigprocmask() should not be used, we have set_task_blocked(),
           but this is minor.
      
        2. We should never block SIGKILL or SIGSTOP, and this is what
           the code tries to do.
      
        3. This can't protect against SIGKILL or SIGSTOP anyway. Another
           thread can do signal_wake_up(), say, do_signal_stop() or
           complete_signal() or debugger.
      
        4. sigprocmask(SIG_BLOCK, allsigs) doesn't necessarily clears
           TIF_SIGPENDING, say, freezing() or ->jobctl.
      
        5. device_write() looks equally wrong by the same reason.
      
      Looks like, this tries to protect some wait_event_interruptible() logic
      from signals, it should be turned into uninterruptible wait.  Or we need
      to implement something like signals_stop/start for such a use-case.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      201d3dfa
    • Jiri Kosina's avatar
      Revert "HID: hid-logitech-dj: querying_devices was never set" · 8e5654ce
      Jiri Kosina authored
      This reverts commit 407a2c2a.
      
      Explanation provided by Benjamin Tissoires:
      
      Commit "HID: hid-logitech-dj, querying_devices was never set" activate
      a flag which guarantees that we do not ask the receiver for too many
      enumeration. When the flag is set, each following enumeration call is
      discarded (the usb request is not forwarded to the receiver). The flag
      is then released when the driver receive a pairing information event,
      which normally follows the enumeration request.
      However, the USB3 bug makes the driver think the enumeration request
      has been forwarded to the receiver. However, it is actually not the
      case because the USB stack returns -EPIPE. So, when a new unknown
      device appears, the workaround consisting in asking for a new
      enumeration is not working anymore: this new enumeration is discarded
      because of the flag, which is never reset.
      
      A solution could be to trigger a timeout before releasing it, but for
      now, let's just revert the patch.
      Reported-by: default avatarBenjamin Tissoires <benjamin.tissoires@gmail.com>
      Tested-by: default avatarSune Mølgaard <sune@molgaard.org>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      8e5654ce
    • Michael Neuling's avatar
      powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs · 28e61cc4
      Michael Neuling authored
      If a transaction is rolled back, the Target Address Register (TAR), Processor
      Priority Register (PPR) and Data Stream Control Register (DSCR) should be
      restored to the checkpointed values before the transaction began.  Any changes
      to these SPRs inside the transaction should not be visible in the abort
      handler.
      
      Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR.  If
      we preempt a processes inside a transaction which has modified any of these, on
      process restore, that same transaction may be aborted we but we won't see the
      checkpointed versions of these SPRs.
      
      This adds checkpointed versions of these SPRs to the thread_struct and adds the
      save/restore of these three SPRs to the treclaim/trechkpt code.
      
      Without this if any of these SPRs are modified during a transaction, users may
      incorrectly see a speculated SPR value even if the transaction is aborted.
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      28e61cc4
    • Michael Neuling's avatar
      powerpc: Save the TAR register earlier · c2d52644
      Michael Neuling authored
      This moves us to save the Target Address Register (TAR) a earlier in
      __switch_to.  It introduces a new function save_tar() to do this.
      
      We need to save the TAR earlier as we will overwrite it in the transactional
      memory reclaim/recheckpoint path.  We are going to do this in a subsequent
      patch which will fix saving the TAR register when it's modified inside a
      transaction.
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c2d52644
    • Michael Neuling's avatar
      powerpc: Fix context switch DSCR on POWER8 · 2517617e
      Michael Neuling authored
      POWER8 allows the DSCR to be accessed directly from userspace via a new SPR
      number 0x3 (Rather than 0x11.  DSCR SPR number 0x11 is still used on POWER8 but
      like POWER7, is only accessible in HV and OS modes).  Currently, we allow this
      by setting H/FSCR DSCR bit on boot.
      
      Unfortunately this doesn't work, as the kernel needs to see the DSCR change so
      that it knows to no longer restore the system wide version of DSCR on context
      switch (ie. to set thread.dscr_inherit).
      
      This clears the H/FSCR DSCR bit initially.  If a process then accesses the DSCR
      (via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in
      facility_unavailable_exception().
      
      We also change _switch() so that we set or clear the H/FSCR DSCR bit based on
      the thread.dscr_inherit.
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Cc: <stable@vger.kernel.org> [v3.10]
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      2517617e