1. 19 Aug, 2010 3 commits
  2. 18 Aug, 2010 37 commits
    • Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 · 763008c4
      Linus Torvalds authored
      * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
        NFS: Fix an Oops in the NFSv4 atomic open code
        NFS: Fix the selection of security flavours in Kconfig
        NFS: fix the return value of nfs_file_fsync()
        rpcrdma: Fix SQ size calculation when memreg is FRMR
        xprtrdma: Do not truncate iova_start values in frmr registrations.
        nfs: Remove redundant NULL check upon kfree()
        nfs: Add "lookupcache" to displayed mount options
        NFS: allow close-to-open cache semantics to apply to root of NFS filesystem
        SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · d1126ad9
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        USB HID: Add ID for eGalax Multitouch used in JooJoo tablet
        HID: hiddev: fix memory corruption due to invalid intfdata
        HID: hiddev: protect against disconnect/NULL-dereference race
        HID: picolcd: correct ordering of framebuffer freeing
        HID: picolcd: testing the wrong variable
    • Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 · 2a554736
      Linus Torvalds authored
      * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
        [IA64] Fix build error: conflicting types for ‘sys_execve’
    • Fix the declaration of sys_execve() in asm-generic/syscalls.h · d15ca320
      David Howells authored
      Fix the declaration of sys_execve() in asm-generic/syscalls.h to have
      various consts applied to its pointers.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [IA64] Fix build error: conflicting types for ‘sys_execve’ · 145e5aa2
      Tony Luck authored
      arch/ia64/kernel/process.c:636: error: conflicting types for ‘sys_execve’
      
      commit d7627467
      Make do_execve() take a const filename pointer
      
      Missed the declaration of sys_execve in the ia64 asm/unistd.h (perhaps
      because there is no reason for it to be there ... it might be a left over
      from the COMPAT code?). Just delete the conflicting version.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 · 145c3ae4
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
        fs: brlock vfsmount_lock
        fs: scale files_lock
        lglock: introduce special lglock and brlock spin locks
        tty: fix fu_list abuse
        fs: cleanup files_lock locking
        fs: remove extra lookup in __lookup_hash
        fs: fs_struct rwlock to spinlock
        apparmor: use task path helpers
        fs: dentry allocation consolidation
        fs: fix do_lookup false negative
        mbcache: Limit the maximum number of cache entries
        hostfs ->follow_link() braino
        hostfs: dumb (and usually harmless) tpyo - strncpy instead of strlcpy
        remove SWRITE* I/O types
        kill BH_Ordered flag
        vfs: update ctime when changing the file's permission by setfacl
        cramfs: only unlock new inodes
        fix reiserfs_evict_inode end_writeback second call
    • mmc: build fix: mmc_pm_notify is only available with CONFIG_PM=y · 81ca03a0
      Uwe Kleine-König authored
      This fixes a build breakage introduced by commit 4c2ef25f ("mmc: fix
      all hangs related to mmc/sd card insert/removal during suspend/resume")
      
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: linux-mmc@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Acked-by: Kukjin Kim <kgene.kim@samsung.com>
      Acked-by: Maxim Levitsky <maximlevitsky@gmail.com>
      Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • perf tools: Fix build error on read only source. · ecafda60
      Kusanagi Kouichi authored
      Parts of the build process were generating files outside the specified
      O= directory, causing the build to fail on systems where the sources are
      in a read only file system.
      
      Fix it by using $(OUTPUT) on these locations.
      
      Also check that $(OUTPUT) actually exists, just like the top level
      kernel Makefile does. Otherwise the failure message emitted is
      completely misleading.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20100817140841.0859362C03A@msa106.auone-net.jp>
      Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge branch 'perf-fixes-for-linus' of... · 1ca72feb
      Linus Torvalds authored
      Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        perf tools: Fix build on POSIX shells
        latencytop: Fix kconfig dependency warnings
        perf annotate tui: Fix exit and RIGHT keys handling
        tracing: Sanitize value returned from write(trace_marker, "...", len)
        tracing/events: Convert format output to seq_file
        tracing: Extend recordmcount to better support Blackfin mcount
        tracing: Fix ring_buffer_read_page reading out of page boundary
        tracing: Fix an unallocated memory access in function_graph
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 · 7dfb2d40
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
        ALSA: emu10k1 - delay the PCM interrupts (add pcm_irq_delay parameter)
        ALSA: hda - Fix ALC680 base model capture
        ASoC: Remove DSP mode support for WM8776
        ALSA: hda - Add quirk for Dell Vostro 1220
        ALSA: riptide - Fix detection / load of firmware files
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · 6c8bfb7f
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
        m68knommu: include sched.h in ColdFire/SPI driver
        m68knommu: formatting of pointers in printk()
        m68knommu: arch/m68k/include/asm/ide.h fix for nommu
    • Merge branch 'for-linus' of git://neil.brown.name/md · d9f5d415
      Linus Torvalds authored
      * 'for-linus' of git://neil.brown.name/md:
        md raid-1/10 Fix bio_rw bit manipulations again
        md: provide appropriate return value for spare_active functions.
        md: Notify sysfs when RAID1/5/10 disk is In_sync.
        Update recovery_offset even when external metadata is used.
    • Merge branch 'merge-devicetree' of git://git.secretlab.ca/git/linux-2.6 · 86ea51d4
      Linus Torvalds authored
      * 'merge-devicetree' of git://git.secretlab.ca/git/linux-2.6:
        spi.h: missing kernel-doc notation, please fix
        of: fix missing headers for of_address_to_resource() in MTD and SysACE drivers
        of: Fix missing includes
        ata: update for of_device to platform_device replacement
        microblaze: Fix of: eliminate of_device->node and dev_archdata->{of,prom}_node
        microblaze: Fix of/address: Merge all of the bus translation code
        booting-without-of: Remove nonexistent chapters from TOC, fix numbering
    • NFS: Fix an Oops in the NFSv4 atomic open code · 0a377cff
      Trond Myklebust authored
      Adam Lackorzynski reports:
      
      with 2.6.35.2 I'm getting this reproducible Oops:
      
      [  110.825396] BUG: unable to handle kernel NULL pointer dereference at
      (null)
      [  110.828638] IP: [<ffffffff811247b7>] encode_attrs+0x1a/0x2a4
      [  110.828638] PGD be89f067 PUD bf18f067 PMD 0
      [  110.828638] Oops: 0000 [#1] SMP
      [  110.828638] last sysfs file: /sys/class/net/lo/operstate
      [  110.828638] CPU 2
      [  110.828638] Modules linked in: rtc_cmos rtc_core rtc_lib amd64_edac_mod
      i2c_amd756 edac_core i2c_core dm_mirror dm_region_hash dm_log dm_snapshot
      sg sr_mod usb_storage ohci_hcd mptspi tg3 mptscsih mptbase usbcore nls_base
      [last unloaded: scsi_wait_scan]
      [  110.828638]
      [  110.828638] Pid: 11264, comm: setchecksum Not tainted 2.6.35.2 #1
      [  110.828638] RIP: 0010:[<ffffffff811247b7>]  [<ffffffff811247b7>]
      encode_attrs+0x1a/0x2a4
      [  110.828638] RSP: 0000:ffff88003bf5b878  EFLAGS: 00010296
      [  110.828638] RAX: ffff8800bddb48a8 RBX: ffff88003bf5bb18 RCX:
      0000000000000000
      [  110.828638] RDX: ffff8800be258800 RSI: 0000000000000000 RDI:
      ffff88003bf5b9f8
      [  110.828638] RBP: 0000000000000000 R08: ffff8800bddb48a8 R09:
      0000000000000004
      [  110.828638] R10: 0000000000000003 R11: ffff8800be779000 R12:
      ffff8800be258800
      [  110.828638] R13: ffff88003bf5b9f8 R14: ffff88003bf5bb20 R15:
      ffff8800be258800
      [  110.828638] FS:  0000000000000000(0000) GS:ffff880041e00000(0063)
      knlGS:00000000556bd6b0
      [  110.828638] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
      [  110.828638] CR2: 0000000000000000 CR3: 00000000be8ef000 CR4:
      00000000000006e0
      [  110.828638] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [  110.828638] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
      0000000000000400
      [  110.828638] Process setchecksum (pid: 11264, threadinfo
      ffff88003bf5a000, task ffff88003f232210)
      [  110.828638] Stack:
      [  110.828638]  0000000000000000 ffff8800bfbcf920 0000000000000000
      0000000000000ffe
      [  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
      0000000000000000
      [  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
      0000000000000000
      [  110.828638] Call Trace:
      [  110.828638]  [<ffffffff81124c1f>] ? nfs4_xdr_enc_setattr+0x90/0xb4
      [  110.828638]  [<ffffffff81371161>] ? call_transmit+0x1c3/0x24a
      [  110.828638]  [<ffffffff813774d9>] ? __rpc_execute+0x78/0x22a
      [  110.828638]  [<ffffffff81371a91>] ? rpc_run_task+0x21/0x2b
      [  110.828638]  [<ffffffff81371b7e>] ? rpc_call_sync+0x3d/0x5d
      [  110.828638]  [<ffffffff8111e284>] ? _nfs4_do_setattr+0x11b/0x147
      [  110.828638]  [<ffffffff81109466>] ? nfs_init_locked+0x0/0x32
      [  110.828638]  [<ffffffff810ac521>] ? ifind+0x4e/0x90
      [  110.828638]  [<ffffffff8111e2fb>] ? nfs4_do_setattr+0x4b/0x6e
      [  110.828638]  [<ffffffff8111e634>] ? nfs4_do_open+0x291/0x3a6
      [  110.828638]  [<ffffffff8111ed81>] ? nfs4_open_revalidate+0x63/0x14a
      [  110.828638]  [<ffffffff811056c4>] ? nfs_open_revalidate+0xd7/0x161
      [  110.828638]  [<ffffffff810a2de4>] ? do_lookup+0x1a4/0x201
      [  110.828638]  [<ffffffff810a4733>] ? link_path_walk+0x6a/0x9d5
      [  110.828638]  [<ffffffff810a42b6>] ? do_last+0x17b/0x58e
      [  110.828638]  [<ffffffff810a5fbe>] ? do_filp_open+0x1bd/0x56e
      [  110.828638]  [<ffffffff811cd5e0>] ? _atomic_dec_and_lock+0x30/0x48
      [  110.828638]  [<ffffffff810a9b1b>] ? dput+0x37/0x152
      [  110.828638]  [<ffffffff810ae063>] ? alloc_fd+0x69/0x10a
      [  110.828638]  [<ffffffff81099f39>] ? do_sys_open+0x56/0x100
      [  110.828638]  [<ffffffff81027a22>] ? ia32_sysret+0x0/0x5
      [  110.828638] Code: 83 f1 01 e8 f5 ca ff ff 48 83 c4 50 5b 5d 41 5c c3 41
      57 41 56 41 55 49 89 fd 41 54 49 89 d4 55 48 89 f5 53 48 81 ec 18 01 00 00
      <8b> 06 89 c2 83 e2 08 83 fa 01 19 db 83 e3 f8 83 c3 18 a8 01 8d
      [  110.828638] RIP  [<ffffffff811247b7>] encode_attrs+0x1a/0x2a4
      [  110.828638]  RSP <ffff88003bf5b878>
      [  110.828638] CR2: 0000000000000000
      [  112.840396] ---[ end trace 95282e83fd77358f ]---
      
      We need to ensure that the O_EXCL flag is turned off if the user doesn't
      set O_CREAT.
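
The rule stated above (drop O_EXCL when O_CREAT is absent) can be sketched in user space. This is only an illustration of the flag masking, not the kernel patch itself; the helper name `sanitize_open_flags` is hypothetical.

```python
import os


def sanitize_open_flags(flags: int) -> int:
    """Clear O_EXCL unless O_CREAT is also requested.

    Hypothetical helper: O_EXCL only has a defined meaning together
    with O_CREAT, so it is masked out when O_CREAT is not set.
    """
    if not flags & os.O_CREAT:
        flags &= ~os.O_EXCL
    return flags


if __name__ == "__main__":
    # O_EXCL without O_CREAT is stripped ...
    print(sanitize_open_flags(os.O_RDWR | os.O_EXCL) == os.O_RDWR)
    # ... while an exclusive create keeps both flags.
    excl_create = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    print(sanitize_open_flags(excl_create) == excl_create)
```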
      
      Cc: stable@kernel.org
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • Merge branch 'fix/asoc' into for-linus · 2ea1ef57
      Takashi Iwai authored
    • Merge branch 'fix/hda' into for-linus · 76165a30
      Takashi Iwai authored
    • ALSA: emu10k1 - delay the PCM interrupts (add pcm_irq_delay parameter) · 56385a12
      Jaroslav Kysela authored
      With some hardware combinations, the PCM interrupts are acknowledged
      before the period boundary from the emu10k1 chip. The midlevel PCM code
      gets confused and the playback stream is interrupted.
      
      It seems that shifting the interrupt processing by 2 samples is enough
      to fix this issue. This default value does not harm other,
      unaffected hardware.
      
      More information: Kernel bugzilla bug#16300
      
      [A compile warning fixed by tiwai]
      Signed-off-by: Jaroslav Kysela <perex@perex.cz>
      Cc: <stable@kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • fs: brlock vfsmount_lock · 99b7db7b
      Nick Piggin authored
      fs: brlock vfsmount_lock
      
      Use a brlock for the vfsmount lock. It must be taken for write whenever
      modifying the mount hash or associated fields, and may be taken for read when
      performing mount hash lookups.
      
      A new lock is added for the mnt-id allocator, so it doesn't need to take
      the heavy vfsmount write-lock.
      
      The number of atomics should remain the same for fastpath rlock cases, though
      code would be slightly slower due to per-cpu access. Scalability will not be
      much improved in common cases yet, due to other locks (i.e. dcache_lock) getting
      in the way. However, path lookups crossing mountpoints should be one case where
      scalability is improved (currently requiring the global lock).
      
      The slowpath is slower due to use of brlock. On a 64 core, 64 socket, 32 node
      Altix system (high latency to remote nodes), a simple umount microbenchmark
      (mount --bind mnt mnt2 ; umount mnt2, looped 1000 times) took 6.8s before this
      patch and 7.1s afterwards, about 5% slower.
      
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: scale files_lock · 6416ccb7
      Nick Piggin authored
      fs: scale files_lock
      
      Improve scalability of files_lock by adding per-cpu, per-sb files lists,
      protected with an lglock. The lglock provides fast access to the per-cpu lists
      to add and remove files. It also provides a snapshot of all the per-cpu lists
      (although this is very slow).
      
      One difficulty with this approach is that a file can be removed from the list
      by another CPU. We must track which per-cpu list the file is on with a new
      variable in the file struct (packed into a hole on 64-bit archs). Scalability
      could suffer if files are frequently removed from a different CPU's list.
      
      However loads with frequent removal of files imply short interval between
      adding and removing the files, and the scheduler attempts to avoid moving
      processes too far away. Also, even in the case of cross-CPU removal, the
      hardware has much more opportunity to parallelise cacheline transfers with N
      cachelines than with 1.
      
      A worst-case test of 1 CPU allocating files subsequently being freed by N CPUs
      degenerates to contending on a single lock, which is no worse than before. When
      more than one CPU are allocating files, even if they are always freed by
      different CPUs, there will be more parallelism than the single-lock case.
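
The scheme discussed above (per-CPU lists, plus a record of which list each file lives on so that another CPU can remove it) can be modelled roughly as follows. This is a user-space toy under stated assumptions, not the kernel code: the class `PerCpuFileLists`, its methods, and the use of `threading.Lock` in place of per-CPU spinlocks are all hypothetical.

```python
import threading


class PerCpuFileLists:
    """Toy model: one list and one lock per simulated CPU."""

    def __init__(self, ncpus: int):
        self.lists = [set() for _ in range(ncpus)]
        self.locks = [threading.Lock() for _ in range(ncpus)]
        # Stand-in for the new field in the file struct:
        # maps each file to the index of the list it was added to.
        self.home_cpu = {}

    def add(self, f, cpu: int) -> None:
        # Fast path: only the adding CPU's lock is taken.
        with self.locks[cpu]:
            self.lists[cpu].add(f)
            self.home_cpu[f] = cpu

    def remove(self, f, cpu: int) -> bool:
        """Remove f; return True if it was removed by the CPU that added it."""
        home = self.home_cpu[f]
        # A cross-CPU removal must take the *home* list's lock.
        with self.locks[home]:
            self.lists[home].discard(f)
            del self.home_cpu[f]
        return home == cpu

    def snapshot(self) -> set:
        # Slow path: lock every list to get a consistent view of all files.
        for lock in self.locks:
            lock.acquire()
        try:
            return set().union(*self.lists)
        finally:
            for lock in reversed(self.locks):
                lock.release()
```

The boolean returned by `remove` corresponds to the "cpu-hits" notion counted in the test results: a removal performed on the same CPU that added the file.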
      
      Testing results:
      
      On a 2 socket, 8 core opteron, I measure the number of times the lock is taken
      to remove the file, the number of times it is removed by the same CPU that
      added it, and the number of times it is removed by the same node that added it.
      
      Booting:    locks=  25049 cpu-hits=  23174 (92.5%) node-hits=  23945 (95.6%)
      kbuild -j16 locks=2281913 cpu-hits=2208126 (96.8%) node-hits=2252674 (98.7%)
      dbench 64   locks=4306582 cpu-hits=4287247 (99.6%) node-hits=4299527 (99.8%)
      
      So a file is removed from the same CPU it was added by over 90% of the time.
      It remains within the same node 95% of the time.
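
As a cross-check, the percentages in the table above can be recomputed from the raw counts. The snippet below only redoes that arithmetic; the function name `hit_rates` is an illustrative choice.

```python
# (workload, lock acquisitions, same-CPU removals, same-node removals),
# copied from the table above.
RESULTS = [
    ("Booting", 25049, 23174, 23945),
    ("kbuild -j16", 2281913, 2208126, 2252674),
    ("dbench 64", 4306582, 4287247, 4299527),
]


def hit_rates(locks: int, cpu_hits: int, node_hits: int) -> tuple:
    """Return (cpu-hit %, node-hit %) rounded to one decimal place."""
    return (round(100 * cpu_hits / locks, 1),
            round(100 * node_hits / locks, 1))


if __name__ == "__main__":
    for name, locks, cpu_hits, node_hits in RESULTS:
        cpu_pct, node_pct = hit_rates(locks, cpu_hits, node_hits)
        print(f"{name:12s} cpu-hits={cpu_pct}% node-hits={node_pct}%")
```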
      
      Tim Chen ran some numbers for a 64 thread Nehalem system performing a compile.
      
                      throughput
      2.6.34-rc2      24.5
      +patch          24.9
      
                      us      sys     idle    IO wait (in %)
      2.6.34-rc2      51.25   28.25   17.25   3.25
      +patch          53.75   18.5    19      8.75
      
      So significantly less CPU time spent in kernel code, higher idle time and
      slightly higher throughput.
      
      Single threaded performance difference was within the noise of microbenchmarks.
      That is not to say a penalty does not exist: the code is larger and requires
      more memory accesses, so it will be slightly slower.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • lglock: introduce special lglock and brlock spin locks · 2dc91abe
      Nick Piggin authored
      lglock: introduce special lglock and brlock spin locks
      
      This patch introduces "local-global" locks (lglocks). These can be used to:
      
      - Provide fast exclusive access to per-CPU data, with exclusive access to
        another CPU's data allowed but possibly subject to contention, and to provide
        very slow exclusive access to all per-CPU data.
      - Or to provide very fast and scalable read serialisation, and to provide
        very slow exclusive serialisation of data (not necessarily per-CPU data).
      
      Brlocks are also implemented as a short-hand notation for the latter use
      case.
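
The two use cases described above can be modelled in user space. The sketch below is a toy illustration under stated assumptions, not the kernel implementation: the classes `LGLock` and `BRLock` and their method names are hypothetical, and `threading.Lock` stands in for a per-CPU spinlock.

```python
import threading


class LGLock:
    """Toy local-global lock: one lock per simulated CPU."""

    def __init__(self, ncpus: int):
        self._locks = [threading.Lock() for _ in range(ncpus)]

    def local_lock(self, cpu: int) -> None:
        # Fast path: touches only this CPU's lock.
        self._locks[cpu].acquire()

    def local_unlock(self, cpu: int) -> None:
        self._locks[cpu].release()

    def global_lock(self) -> None:
        # Slow path: take every per-CPU lock, always in index order,
        # so that two global lockers cannot deadlock each other.
        for lock in self._locks:
            lock.acquire()

    def global_unlock(self) -> None:
        for lock in reversed(self._locks):
            lock.release()

    def held(self) -> list:
        """Which per-CPU locks are currently held (for inspection only)."""
        return [lock.locked() for lock in self._locks]


class BRLock(LGLock):
    """Short-hand for the second use case: readers lock locally,
    writers lock globally."""

    read_lock = LGLock.local_lock
    read_unlock = LGLock.local_unlock
    write_lock = LGLock.global_lock
    write_unlock = LGLock.global_unlock
```

In this model a reader only ever contends with a writer (or with another user of the same simulated CPU), while a writer pays a cost proportional to the number of CPUs, which matches the "very fast read, very slow exclusive" trade-off described in the commit message.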
      
      Thanks to Paul for local/global naming convention.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • tty: fix fu_list abuse · d996b62a
      Nick Piggin authored
      tty: fix fu_list abuse
      
      tty code abuses fu_list, which causes a bug in remount,ro handling.
      
      If a tty device node is opened on a filesystem and the last link to the inode
      is then removed, the filesystem will be allowed to be remounted readonly. This is
      because fs_may_remount_ro does not find the 0 link tty inode on the file sb
      list (because the tty code incorrectly removed it to use for its own purpose).
      This can result in a filesystem with errors after it is marked "clean".
      
      Taking the idea from Christoph's initial patch, allocate a tty private struct
      at file->private_data and put our required list fields in there, linking
      file and tty. This makes tty nodes behave the same way as other device nodes,
      avoids meddling with the vfs, and avoids this bug.
      
      The error handling is not trivial in the tty code, so for this bugfix, I take
      the simple approach of using __GFP_NOFAIL and don't worry about memory errors.
      This is not a problem because our allocator doesn't fail small allocs as a rule
      anyway. So proper error handling is left as an exercise for tty hackers.
      
      [ Arguably filesystem's device inode would ideally be divorced from the
      driver's pseudo inode when it is opened, but in practice it's not clear whether
      that will ever be worth implementing. ]
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: cleanup files_lock locking · ee2ffa0d
      Nick Piggin authored
      fs: cleanup files_lock locking
      
      Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
      manipulate the per-sb files list; unexport the files_lock spinlock.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: remove extra lookup in __lookup_hash · b04f784e
      Nick Piggin authored
      fs: remove extra lookup in __lookup_hash
      
      Optimize lookup for create operations, where finding no dentry should be the
      common case. In cases where it is not, such as unlink, the added overhead
      is much smaller than the overhead removed.
      
      Also, move comments about __d_lookup racyness to the __d_lookup call site.
      d_lookup is intuitive; __d_lookup is what needs commenting. So in that same
      vein, add kerneldoc comments to __d_lookup and clean up some of the comments:
      
      - We are interested in how the RCU lookup works here, particularly with
        renames. Make that explicit, and point to the document where it is explained
        in more detail.
      - RCU is pretty standard now, and macros make implementations pretty mindless.
        If we want to know about RCU barrier details, we look in RCU code.
      - Delete some boring legacy comments because we don't care much about how the
        code used to work, more about the interesting parts of how it works now. So
        comments about lazy LRU may be interesting, but would better be done in the
        LRU or refcount management code.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: fs_struct rwlock to spinlock · 2a4419b5
      Nick Piggin authored
      fs: fs_struct rwlock to spinlock
      
      struct fs_struct.lock is an rwlock with the read-side used to protect root and
      pwd members while taking references to them. Taking a reference to a path
      typically requires just 2 atomic ops, so the critical section is very small.
      Parallel read-side operations would have cacheline contention on the lock, the
      dentry, and the vfsmount cachelines, so the rwlock is unlikely to ever give a
      real parallelism increase.
      
      Replace it with a spinlock to avoid one or two atomic operations in typical
      path lookup fastpath.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • apparmor: use task path helpers · 44672e4f
      Nick Piggin authored
      apparmor: use task path helpers
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: dentry allocation consolidation · baa03890
      Nick Piggin authored
      fs: dentry allocation consolidation
      
      There are 2 duplicate copies of code in dentry allocation in path lookup.
      Consolidate them into a single function.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: fix do_lookup false negative · 2e2e88ea
      Nick Piggin authored
      fs: fix do_lookup false negative
      
      In do_lookup, if we initially find no dentry, we take the directory i_mutex and
      re-check the lookup. If we find a dentry there, then we revalidate it if
      needed. However if that revalidate asks for the dentry to be invalidated, we
      return -ENOENT from do_lookup. What should happen instead is an attempt to
      allocate and lookup a new dentry.
      
      This is probably not noticed because it is rare. It is only reached if a
      concurrent create races in first (in which case, the dentry probably won't be
      invalidated anyway), or if the racy __d_lookup has failed due to a
      false-negative (which is very rare).
      
      Fix this by removing code and have it use the normal reval path.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • mbcache: Limit the maximum number of cache entries · 3a48ee8a
      Andreas Gruenbacher authored
      Limit the maximum number of mb_cache entries depending on the number of
      hash buckets: if the only limit to the number of cache entries is the
      available memory the hash chains can grow very long, taking a long time
      to search.
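
The idea of tying the entry limit to the number of hash buckets can be sketched as below. This is a toy model, not the mbcache code: the class name `BoundedHashCache`, the `max_chain` factor, and the choice to simply refuse inserts at the limit are all assumptions made for illustration; the commit message does not say what the real code does once the limit is reached.

```python
class BoundedHashCache:
    """Chained hash table whose total size is capped by its bucket count."""

    def __init__(self, nbuckets: int, max_chain: int = 4):
        # max_chain is a hypothetical tuning factor: the average chain
        # length allowed before new entries are refused.
        self.buckets = [[] for _ in range(nbuckets)]
        self.max_entries = nbuckets * max_chain
        self.count = 0

    def insert(self, key, value) -> bool:
        """Add an entry; return False when the cache is at its limit."""
        if self.count >= self.max_entries:
            return False  # assumption: caller proceeds without caching
        self.buckets[hash(key) % len(self.buckets)].append((key, value))
        self.count += 1
        return True

    def lookup(self, key):
        # Search cost is bounded on average because chains cannot grow
        # without limit.
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        return None
```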
      
      At least partially solves https://bugzilla.lustre.org/show_bug.cgi?id=22771.
      Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • hostfs ->follow_link() braino · 3b6036d1
      Al Viro authored
      we want the assignment to err done inside the if () to be
      visible after it, so (re)declaring err inside if () body
      is wrong.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • hostfs: dumb (and usually harmless) tpyo - strncpy instead of strlcpy · 850a496f
      Al Viro authored
      ... not harmless in this case - we have a string in the end of buffer
      already.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • perf, x86: Fix Intel-nhm PMU programming errata workaround · 351af072
      Zhang, Yanmin authored
      Fix the Errata AAK100/AAP53/BD53 workaround: the officially documented
      workaround we implemented in:
      
       11164cd4: perf, x86: Add Nehelem PMU programming errata workaround
      
      doesn't actually work fully and causes a stuck PMU state
      under load and non-functioning perf profiling.
      
      A functional workaround was found by trial & error.
      
      Affects all Nehalem-class Intel PMUs.
      Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1281073148.2125.63.camel@ymzhang.sh.intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@kernel.org> # .35.x
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'perf/urgent' of... · 9d5f3714
      Ingo Molnar authored
      Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent
    • md raid-1/10 Fix bio_rw bit manipulations again · 2c7d46ec
      NeilBrown authored
      commit 7b6d91da changed the behaviour
      of a few variables in raid1 and raid10 from flags to bit-sets, but
      left them as type 'bool' so they did not work.
      
      Change them (back) to unsigned long.
      (historical note: see 1ef04fef)
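
Why a 'bool' cannot carry a bit-set can be shown with a small model. The sketch below is illustrative only: the bit positions `REQ_SYNC` and `REQ_BARRIER` are hypothetical values chosen for the example, and Python's `bool()` stands in for the collapse to 0/1 that storing a value in a C `bool` performs.

```python
# Hypothetical bit positions, chosen only for this illustration.
REQ_SYNC = 1 << 4
REQ_BARRIER = 1 << 5


def carry_via_bool(rw: int, mask: int) -> int:
    """Save the masked bits in a bool, then read them back as an integer."""
    saved = bool(rw & mask)   # collapses any non-zero bit-set to True
    return int(saved)         # reads back as 1, not as the original bit


def carry_via_ulong(rw: int, mask: int) -> int:
    """Save the masked bits in a plain integer (the 'unsigned long' case)."""
    return rw & mask


if __name__ == "__main__":
    rw = REQ_SYNC | REQ_BARRIER
    # Through a bool, the SYNC bit comes back as bit 0 instead of bit 4.
    print(hex(carry_via_bool(rw, REQ_SYNC)))
    # Through an integer, the bit survives unchanged.
    print(hex(carry_via_ulong(rw, REQ_SYNC)))
```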
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reported-by: Jiri Slaby <jslaby@suse.cz> and many others
    • remove SWRITE* I/O types · 9cb569d6
      Christoph Hellwig authored
      These flags aren't real I/O types, but tell ll_rw_block to always
      lock the buffer instead of giving up on a failed trylock.
      
      Instead add a new write_dirty_buffer helper that implements this semantic
      and use it from the existing SWRITE* callers.  Note that the ll_rw_block
      code had a bug where it didn't promote WRITE_SYNC_PLUG properly, which
      this patch fixes.
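
The difference between giving up on a failed trylock and always taking the buffer lock can be modelled in user space. This is a rough sketch under stated assumptions, not the kernel helper: the `Buffer` class and the functions `write_trylock` and `write_always_lock` are hypothetical names, `threading.Lock` stands in for the buffer lock, and appending to a list stands in for submitting the I/O.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Buffer:
    data: bytes
    dirty: bool = True
    lock: threading.Lock = field(default_factory=threading.Lock)


def _submit_if_dirty(buf: Buffer, submitted: list) -> bool:
    # Caller holds buf.lock. Clear the dirty bit and "submit" the write.
    if not buf.dirty:
        return False
    buf.dirty = False
    submitted.append(buf.data)
    return True


def write_trylock(buf: Buffer, submitted: list) -> bool:
    """Give up (leaving the buffer dirty) if the lock is already held."""
    if not buf.lock.acquire(blocking=False):
        return False
    try:
        return _submit_if_dirty(buf, submitted)
    finally:
        buf.lock.release()


def write_always_lock(buf: Buffer, submitted: list) -> bool:
    """Wait for the lock, so a dirty buffer is never silently skipped."""
    with buf.lock:
        return _submit_if_dirty(buf, submitted)
```

With the trylock variant a caller cannot tell a skipped buffer from a clean one without further checks, which is the gap the always-lock semantic closes.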
      
      In the ufs code clean up the helper that used to call ll_rw_block
      to mirror sync_dirty_buffer, which is the function it implements for
      compound buffers.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • kill BH_Ordered flag · 87e99511
      Christoph Hellwig authored
      Instead of abusing a buffer_head flag just add a variant of
      sync_dirty_buffer which allows passing the exact type of write
      flag required.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: update ctime when changing the file's permission by setfacl · dad5eb6d
      Jan Kara authored
      generic_acl_set didn't update the ctime of the file when its permission was
      changed.
      
      Steps to reproduce:
       # touch aaa
       # stat -c %Z aaa
       1275289822
       # setfacl -m  'u::x,g::x,o::x' aaa
       # stat -c %Z aaa
       1275289822                         <- unchanged
      
      But, according to the spec of the ctime, vfs must update it.
      
      Port of ext3 patch by Miao Xie <miaox@cn.fujitsu.com>.
      
      CC: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • cramfs: only unlock new inodes · b845ff8f
      Alexander Shishkin authored
      Commit 77b8a75f introduced a warning at fs/inode.c:692 unlock_new_inode(),
      caused by unlock_new_inode() being called on existing inodes as well.
      
      This patch changes setup_inode() to only call unlock_new_inode() for I_NEW
      inodes.
      Signed-off-by: Alexander Shishkin <virtuoso@slind.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>