1. 03 Apr, 2009 40 commits
    • Intel IOMMU Suspend/Resume Support - Queued Invalidation · eb4a52bc
      Fenghua Yu authored
      This patch supports queued invalidation suspend/resume.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • Intel IOMMU Suspend/Resume Support - DMAR · f59c7b69
      Fenghua Yu authored
      This patch implements the suspend and resume feature for Intel IOMMU
      DMAR. It hooks to kernel suspend and resume interface. When suspend happens, it
      saves necessary hardware registers. When resume happens, it restores the
      registers and restarts IOMMU by enabling translation, setting up root entry, and
      re-enabling queued invalidation.
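
      The save-on-suspend, restore-on-resume flow described above can be sketched as a toy model. This is not the driver code: the class, its fields, and the register names are illustrative assumptions, and the order of the three restart steps inside resume() is arbitrary here.

```python
from dataclasses import dataclass, field

# Illustrative register names only; the patch text does not list them.
SAVED_REGISTERS = ("FECTL", "FEDATA", "FEADDR", "FEUADDR")


@dataclass
class ToyIommu:
    """Toy stand-in for one IOMMU unit, not the kernel's data structure."""
    regs: dict = field(default_factory=dict)
    saved: dict = field(default_factory=dict)
    root_entry_set: bool = False
    queued_invalidation: bool = False
    translation_enabled: bool = False

    def suspend(self) -> None:
        # Save the registers that will not survive the power transition.
        self.saved = {name: self.regs.get(name, 0) for name in SAVED_REGISTERS}

    def lose_power(self) -> None:
        # Model the hardware forgetting its state while suspended.
        self.regs.clear()
        self.root_entry_set = False
        self.queued_invalidation = False
        self.translation_enabled = False

    def resume(self) -> None:
        # Restore the saved registers, then restart the unit.
        self.regs.update(self.saved)
        self.root_entry_set = True
        self.queued_invalidation = True
        self.translation_enabled = True
```
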
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    • David Woodhouse's avatar
    • Merge branch 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 20bec8ab
      Linus Torvalds authored
      * 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext3: Add replace-on-rename hueristics for data=writeback mode
        ext3: Add replace-on-truncate hueristics for data=writeback mode
        ext3: Use WRITE_SYNC for commits which are caused by fsync()
        block_write_full_page: Use synchronous writes for WBC_SYNC_ALL writebacks
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 · 18b34b95
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: (32 commits)
        regulator: twl4030 VAUX3 supports 3.0V
        regulator: Support disabling of unused regulators by machines
        regulator: Don't increment use_count for boot_on regulators
        twl4030-regulator: expose VPLL2
        regulator: refcount fixes
        regulator: Don't warn if we failed to get a regulator
        regulator: Allow boot_on regulators to be disabled by clients
        regulator: Implement list_voltage for WM835x LDOs and DCDCs
        twl4030-regulator: list more VAUX4 voltages
        regulator: Don't warn on omitted voltage constraints
        regulator: Implement list_voltage() for WM8400 DCDCs and LDOs
        MMC: regulator utilities
        regulator: twl4030 voltage enumeration (v2)
        regulator: twl4030 regulators
        regulator: get_status() grows kerneldoc
        regulator: enumerate voltages (v2)
        regulator: Fix get_mode() for WM835x DCDCs
        regulator: Allow regulators to set the initial operating mode
        regulator: Suggest use of datasheet supply or pin names for consumers
        regulator: email - update email address and regulator webpage.
        ...
    • Merge git://git.infradead.org/iommu-2.6 · ca1ee219
      Linus Torvalds authored
      * git://git.infradead.org/iommu-2.6:
        intel-iommu: Fix address wrap on 32-bit kernel.
        intel-iommu: Enable DMAR on 32-bit kernel.
        intel-iommu: fix PCI device detach from virtual machine
        intel-iommu: VT-d page table to support snooping control bit
        iommu: Add domain_has_cap iommu_ops
        intel-iommu: Snooping control support
      
      Fixed trivial conflicts in arch/x86/Kconfig and drivers/pci/intel-iommu.c
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache · 3cc50ac0
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (41 commits)
        NFS: Add mount options to enable local caching on NFS
        NFS: Display local caching state
        NFS: Store pages from an NFS inode into a local cache
        NFS: Read pages from FS-Cache into an NFS inode
        NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching
        NFS: Add read context retention for FS-Cache to call back with
        NFS: FS-Cache page management
        NFS: Add some new I/O counters for FS-Cache doing things for NFS
        NFS: Invalidate FsCache page flags when cache removed
        NFS: Use local disk inode cache
        NFS: Define and create inode-level cache objects
        NFS: Define and create superblock-level objects
        NFS: Define and create server-level objects
        NFS: Register NFS for caching and retrieve the top-level index
        NFS: Permit local filesystem caching to be enabled for NFS
        NFS: Add FS-Cache option bit and debug bit
        NFS: Add comment banners to some NFS functions
        FS-Cache: Make kAFS use FS-Cache
        CacheFiles: A cache that backs onto a mounted filesystem
        CacheFiles: Export things for CacheFiles
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm · d9b9be02
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (36 commits)
        dm: set queue ordered mode
        dm: move wait queue declaration
        dm: merge pushback and deferred bio lists
        dm: allow uninterruptible wait for pending io
        dm: merge __flush_deferred_io into caller
        dm: move bio_io_error into __split_and_process_bio
        dm: rename __split_bio
        dm: remove unnecessary struct dm_wq_req
        dm: remove unnecessary work queue context field
        dm: remove unnecessary work queue type field
        dm: bio list add bio_list_add_head
        dm snapshot: persistent fix dtr cleanup
        dm snapshot: move status to exception store
        dm snapshot: move ctr parsing to exception store
        dm snapshot: use DMEMIT macro for status
        dm snapshot: remove dm_snap header
        dm snapshot: remove dm_snap header use
        dm exception store: move cow pointer
        dm exception store: move chunk_fields
        dm exception store: move dm_target pointer
        ...
    • Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · 9b59f031
      Linus Torvalds authored
      * 'for-linus' of git://git.open-osd.org/linux-open-osd:
        fs: Add exofs to Kernel build
        exofs: Documentation
        exofs: export_operations
        exofs: super_operations and file_system_type
        exofs: dir_inode and directory operations
        exofs: address_space_operations
        exofs: symlink_inode and fast_symlink_inode operations
        exofs: file and file_inode operations
        exofs: Kbuild, Headers and osd utils
    • Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · ac7c1a77
      Linus Torvalds authored
      * 'for-linus' of git://oss.sgi.com/xfs/xfs: (61 commits)
        Revert "xfs: increase the maximum number of supported ACL entries"
        xfs: cleanup uuid handling
        xfs: remove m_attroffset
        xfs: fix various typos
        xfs: pagecache usage optimization
        xfs: remove m_litino
        xfs: kill ino64 mount option
        xfs: kill mutex_t typedef
        xfs: increase the maximum number of supported ACL entries
        xfs: factor out code to find the longest free extent in the AG
        xfs: kill VN_BAD
        xfs: kill vn_atime_* helpers.
        xfs: cleanup xlog_bread
        xfs: cleanup xlog_recover_do_trans
        xfs: remove another leftover of the old inode log item format
        xfs: cleanup log unmount handling
        Fix xfs debug build breakage by pushing xfs_error.h after
        xfs: include header files for prototypes
        xfs: make symbols static
        xfs: move declaration to header file
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 · 3ba113d1
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6: (23 commits)
        parisc: move dereference_function_descriptor to process.c
        parisc: Move kernel Elf_Fdesc define to <asm/elf.h>
        parisc: fix build when ARCH_HAS_KMAP
        parisc: fix "make tar-pkg"
        parisc: drivers: fix warnings
        parisc: select BUG always
        parisc: asm/pdc.h should include asm/page.h
        parisc: led: remove proc_dir_entry::owner
        parisc: fix macro expansion in atomic.h
        parisc: iosapic: fix build breakage
        parisc: oops_enter()/oops_exit() in die()
        parisc: document light weight syscall ABI
        parisc: blink all or loadavg LEDs on oops
        parisc: add ftrace (function and graph tracer) functionality
        parisc: simplify sys_clone()
        parisc: add LATENCYTOP_SUPPORT and CONFIG_STACKTRACE_SUPPORT
        parisc: allow to build with 16k default kernel page size
        parisc: expose 32/64-bit capabilities in cpuinfo
        parisc: use constants instead of numbers in assembly
        parisc: fix usage of 32bit PTE page table entries on 32bit kernels
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-parisc · bad6a5c0
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-parisc:
        powerpc/ps3: Add rtc-ps3
        powerpc: Hook up rtc-generic, and kill rtc-ppc
        m68k: Hook up rtc-generic
        parisc: rtc: Rename rtc-parisc to rtc-generic
        parisc: rtc: Add missing module alias
        parisc: rtc: platform_driver_probe() fixups
        parisc: rtc: get_rtc_time() returns unsigned int
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 · 03c3fa0a
      Linus Torvalds authored
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
        udf: Don't write integrity descriptor too often
        udf: Try anchor in block 256 first
        udf: Some type fixes and cleanups
        udf: use hardware sector size
        udf: fix novrs mount option
        udf: Fix oops when invalid character in filename occurs
        udf: return f_fsid for statfs(2)
        udf: Add checks to not underflow sector_t
        udf: fix default mode and dmode options handling
        udf: fix sparse warnings:
        udf: unsigned last[i] cannot be less than 0
        udf: implement mode and dmode mounting options
        udf: reduce stack usage of udf_get_filename
        udf: reduce stack usage of udf_load_pvoldesc
        Fix the udf code not to pass structs on stack where possible.
        Remove struct typedefs from fs/udf/ecma_167.h et al.
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/rcu-doc-2.6 · 3e850509
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/rcu-doc-2.6:
        Doc: Fix spelling in RCU/rculist_nulls.txt.
        Doc: Fix wrong API example usage of call_rcu().
        Doc: Fix missing whitespaces in RCU documentation.
    • mm: fix misuse of debug_kmap_atomic · a0e0404f
      Akinobu Mita authored
      Commit 7ca43e75 ("mm: use debug_kmap_atomic")
      introduced some debug_kmap_atomic() in wrong places.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Fix highmem PPC build failure · 3688e07f
      Kumar Gala authored
      Commit f4112de6 ("mm: introduce
      debug_kmap_atomic") broke PPC builds with CONFIG_HIGHMEM=y:
      
         CC      init/main.o
        In file included from include/linux/highmem.h:25,
                         from include/linux/pagemap.h:11,
                         from include/linux/mempolicy.h:63,
                         from init/main.c:53:
        arch/powerpc/include/asm/highmem.h: In function 'kmap_atomic_prot':
        arch/powerpc/include/asm/highmem.h:98: error: implicit declaration of function 'debug_kmap_atomic'
        In file included from include/linux/pagemap.h:11,
                         from include/linux/mempolicy.h:63,
                         from init/main.c:53:
        include/linux/highmem.h: At top level:
        include/linux/highmem.h:196: warning: conflicting types for 'debug_kmap_atomic'
        include/linux/highmem.h:196: error: static declaration of 'debug_kmap_atomic' follows non-static declaration
        include/asm/highmem.h:98: error: previous implicit declaration of 'debug_kmap_atomic' was here
        make[1]: *** [init/main.o] Error 1
        make: *** [init] Error 2
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
      Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · c54c4dec
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: ixp4xx - Fix handling of chained sg buffers
        crypto: shash - Fix unaligned calculation with short length
        hwrng: timeriomem - Use phys address rather than virt
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · 5de1ccbe
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: (41 commits)
        m68knommu: improve compile arch switch settings
        m68knommu: fix 5407 ColdFire UART vector setup
        m68knommu: fix 5307 ColdFire UART vector setup
        m68knommu: fix 5249 ColdFire UART vector setup
        m68knommu: fix 5249 ColdFire UART setup
        m68knommu: fix end of uart table marker
        m68knommu: switch to using generic_handle_irq()
        m68k: merge the mmu and non-mmu versions of tlbflush.h
        m68knommu: introduce basic clk infrastructure
        m68k: merge the mmu and non-mmu versions of module.h
        m68knommu: add missing interrupt line definition for UART 2
        m68k: merge the mmu and non-mmu versions of mmu_context.h
        m68k: merge the mmu and non-mmu versions of current.h
        m68k: merge the mmu and non-mmu versions of div64.h
        m68k: merge the mmu and non-mmu versions of bugs.h
        m68k: merge the mmu and non-mmu versions of bug.h
        m68k: use the mmu version of cache.h for m68knommu as well
        m68k: use the mmu version of bootinfo.h for m68knommu as well
        m68k: merge the mmu and non-mmu versions of fb.h
        m68k: merge the mmu and non-mmu versions of segment.h
        ...
    • Merge branch 'for-linus' of git://neil.brown.name/md · 223cdea4
      Linus Torvalds authored
      * 'for-linus' of git://neil.brown.name/md: (53 commits)
        md/raid5 revise rules for when to update metadata during reshape
        md/raid5: minor code cleanups in make_request.
        md: remove CONFIG_MD_RAID_RESHAPE config option.
        md/raid5: be more careful about write ordering when reshaping.
        md: don't display meaningless values in sysfs files resync_start and sync_speed
        md/raid5: allow layout and chunksize to be changed on active array.
        md/raid5: reshape using largest of old and new chunk size
        md/raid5: prepare for allowing reshape to change layout
        md/raid5: prepare for allowing reshape to change chunksize.
        md/raid5: clearly differentiate 'before' and 'after' stripes during reshape.
        Documentation/md.txt update
        md: allow number of drives in raid5 to be reduced
        md/raid5: change reshape-progress measurement to cope with reshaping backwards.
        md: add explicit method to signal the end of a reshape.
        md/raid5: enhance raid5_size to work correctly with negative delta_disks
        md/raid5: drop qd_idx from r6_state
        md/raid6: move raid6 data processing to raid6_pq.ko
        md: raid5 run(): Fix max_degraded for raid level 4.
        md: 'array_size' sysfs attribute
        md: centralize ->array_sectors modifications
        ...
    • Merge master.kernel.org:/home/rmk/linux-2.6-arm · 31e6e2da
      Linus Torvalds authored
      * master.kernel.org:/home/rmk/linux-2.6-arm:
        [ARM] fix build-breaking 7a192ec3 commit
        ARM: Add SMSC911X support to Overo platform (V2)
        arm: update omap_ldp defconfig to use smsc911x
        arm: update realview defconfigs to use smsc911x
        arm: update pcm037 defconfig to use smsc911x
        arm: convert omap ldp platform to use smsc911x
        arm: convert realview platform to use smsc911x
        arm: convert pcm037 platform to use smsc911x
        [ARM] 5444/1: ARM: Realview: Fix event-device multiplicators in localtimer.c
        [ARM] 5442/1: pxa/cm-x255: fix reverse RDY gpios in PCMCIA driver
        [ARM] 5441/1: Use pr_err on error paths in at91 pm
        [ARM] 5440/1: Fix VFP state corruption due to preemption during VFP exceptions
        [ARM] 5439/1: Do not clear bit 10 of DFSR during abort handling on ARMv6
        [ARM] 5437/1: Add documentation for "nohlt" kernel parameter
        [ARM] 5436/1: ARM: OMAP: Fix compile for rx51
        [ARM] arch_reset() now takes a second parameter
        [ARM] Kirkwood: small L2 code cleanup
        [ARM] Kirkwood: invalidate L2 cache before enabling it
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/linux-hdreg-h-cleanup · ea02259f
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/bart/linux-hdreg-h-cleanup:
        remove <linux/ata.h> include from <linux/hdreg.h>
        include/linux/hdreg.h: remove unused defines
        isd200: use ATA_* defines instead of *_STAT and *_ERR ones
        include/linux/hdreg.h: cover WIN_* and friends with #ifndef/#endif __KERNEL__
        aoe: WIN_* -> ATA_CMD_*
        isd200: WIN_* -> ATA_CMD_*
        include/linux/hdreg.h: cover struct hd_driveid with #ifndef/#endif __KERNEL__
        xsysace: make it 'struct hd_driveid'-free
        ubd_kern: make it 'struct hd_driveid'-free
        isd200: make it 'struct hd_driveid'-free
    • NFS: Add mount options to enable local caching on NFS · b797cac7
      David Howells authored
      Add NFS mount options to allow the local caching support to be enabled.
      
      The attached patch makes it possible for the NFS filesystem to be told to make
      use of the network filesystem local caching service (FS-Cache).
      
      To be able to use this, a recent nfsutils package is required.
      
      There are three variant NFS mount options that can be added to a mount command
      to control caching for a mount.  Only the last one specified takes effect:
      
       (*) Adding "fsc" will request caching.
      
       (*) Adding "fsc=<string>" will request caching and also specify a uniquifier.
      
       (*) Adding "nofsc" will disable caching.
      
      For example:
      
      	mount warthog:/ /a -o fsc
      
      The cache of a particular superblock (NFS FSID) will be shared between all
      mounts of that volume, provided they have the same connection parameters and
      are not marked 'nosharecache'.
      
      Where it is otherwise impossible to distinguish superblocks because all the
      parameters are identical, but the 'nosharecache' option is supplied, a
      uniquifying string must be supplied, else only the first mount will be
      permitted to use the cache.
      
      If there's a key collision, then the second mount will disable caching and give
      a warning into the kernel log.
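
      The "only the last one specified takes effect" rule for the three option variants can be sketched as follows. This is a minimal illustration, not the kernel's option parser; the function name is hypothetical, and treating caching as disabled when no variant is given is an assumption.

```python
def resolve_fsc(options):
    """Return (caching_enabled, uniquifier) for a comma-separated option string."""
    enabled, uniquifier = False, None  # assumed default: caching off
    for opt in options.split(","):
        opt = opt.strip()
        if opt == "fsc":
            # Plain "fsc": request caching, no uniquifier.
            enabled, uniquifier = True, None
        elif opt.startswith("fsc="):
            # "fsc=<string>": request caching and record the uniquifier.
            enabled, uniquifier = True, opt[len("fsc="):]
        elif opt == "nofsc":
            # "nofsc": disable caching.
            enabled, uniquifier = False, None
        # Any other mount option is ignored by this sketch.
    return enabled, uniquifier
```

      Each later variant overwrites the state left by an earlier one, so a command line such as "-o fsc=home,nofsc" ends with caching disabled.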
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Display local caching state · 5d1acff1
      David Howells authored
      Display the local caching state in /proc/fs/nfsfs/volumes.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Store pages from an NFS inode into a local cache · 7f8e05f6
      David Howells authored
      Store pages from an NFS inode into the cache data storage object associated
      with that inode.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Read pages from FS-Cache into an NFS inode · 9a9fc1c0
      David Howells authored
      Read pages from an FS-Cache data storage object representing an inode into an
      NFS inode.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching · f42b293d
      David Howells authored
      nfs_readpage_async() needs to be non-static so that it can be used as a
      fallback for the local on-disk caching should an EIO crop up when reading the
      cache.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Add read context retention for FS-Cache to call back with · 1fcdf534
      David Howells authored
      Add read context retention so that FS-Cache can call back into NFS when a read
      operation on the cache fails EIO rather than reading data.  This permits NFS to
      then fetch the data from the server instead using the appropriate security
      context.
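
      The fallback described above (try the cache, and on EIO fetch from the server using the retained context) can be sketched in user-space terms. The function and parameter names are hypothetical, and the two reader callables stand in for the cache and the NFS server; this is not the kernel call path.

```python
import errno


def read_page(index, context, read_from_cache, read_from_server):
    """Read one page, preferring the local cache.

    `context` stands for the retained read context (for example, the
    credentials to use); it is only needed when the cache read fails.
    """
    try:
        return read_from_cache(index)
    except OSError as exc:
        if exc.errno != errno.EIO:
            # Only an I/O error from the cache triggers the fallback.
            raise
        return read_from_server(index, context)
```
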
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: FS-Cache page management · 545db45f
      David Howells authored
      FS-Cache page management for NFS.  This includes hooking the releasing and
      invalidation of pages marked with PG_fscache (aka PG_private_2) and waiting for
      completion of the write-to-cache flag (PG_fscache_write aka PG_owner_priv_2).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Add some new I/O counters for FS-Cache doing things for NFS · 6a51091d
      David Howells authored
      Add some new NFS I/O counters for FS-Cache doing things for NFS.  A new line is
      emitted into /proc/pid/mountstats if caching is enabled that looks like:
      
      	fsc: <rok> <rfl> <wok> <wfl> <unc>
      
      Where <rok> is the number of pages read successfully from the cache, <rfl> is
      the number of failed page reads against the cache, <wok> is the number of
      successful page writes to the cache, <wfl> is the number of failed page writes
      to the cache, and <unc> is the number of NFS pages that have been disconnected
      from the cache.
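
      A small parser for the "fsc:" line described above might look like this. It assumes the five counters appear as whitespace-separated decimal integers in the order given; the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class FscCounters(NamedTuple):
    rok: int  # pages read successfully from the cache
    rfl: int  # failed page reads against the cache
    wok: int  # successful page writes to the cache
    wfl: int  # failed page writes to the cache
    unc: int  # NFS pages disconnected from the cache


def parse_fsc_line(line: str) -> Optional[FscCounters]:
    """Parse one mountstats line; return None if it is not an fsc line."""
    line = line.strip()
    if not line.startswith("fsc:"):
        return None
    fields = line[len("fsc:"):].split()
    if len(fields) != 5:
        raise ValueError("expected 5 fsc counters, got %d" % len(fields))
    return FscCounters(*(int(value) for value in fields))
```
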
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Invalidate FsCache page flags when cache removed · d599064a
      David Howells authored
      Invalidate the FsCache page flags on the pages belonging to an inode when the
      cache backing that NFS inode is removed.
      
      This allows a live cache to be withdrawn.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Use local disk inode cache · ef79c097
      David Howells authored
      Bind data storage objects in the local cache to NFS inodes.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Define and create inode-level cache objects · 10329a5d
      David Howells authored
      Define and create inode-level cache data storage objects (as managed by
      nfs_inode structs).
      
      Each inode-level object is created in a superblock-level index object and is
      itself a data storage object into which pages from the inode are stored.
      
      The inode object key is the NFS file handle for the inode.
      
      The inode object is given coherency data to carry in the auxiliary data
      permitted by the cache.  This is a sequence made up of:
      
       (1) i_mtime from the NFS inode.
      
       (2) i_ctime from the NFS inode.
      
       (3) i_size from the NFS inode.
      
       (4) change_attr from the NFSv4 attribute data.
      
      As the cache is a persistent cache, the auxiliary data is checked when a new
      NFS in-memory inode is set up that matches an already existing data storage
      object in the cache.  If the coherency data is the same, the on-disk object is
      retained and used; if not, it is scrapped and a new one created.
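
      The coherency check above amounts to comparing the stored auxiliary data with the values from the freshly set-up inode. A minimal sketch, assuming a plain dictionary keyed by file handle stands in for the persistent cache (the names here are hypothetical):

```python
from typing import NamedTuple


class Coherency(NamedTuple):
    mtime: int        # i_mtime from the NFS inode
    ctime: int        # i_ctime from the NFS inode
    size: int         # i_size from the NFS inode
    change_attr: int  # change_attr from the NFSv4 attribute data


def bind_inode(cache: dict, file_handle: bytes, current: Coherency) -> str:
    """Retain a matching on-disk object, or scrap it and create a new one."""
    stored = cache.get(file_handle)
    if stored == current:
        return "retained"
    # Missing or stale object: replace it with one carrying the current data.
    cache[file_handle] = current
    return "created"
```
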
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Define and create superblock-level objects · 08734048
      David Howells authored
      Define and create superblock-level cache index objects (as managed by
      nfs_server structs).
      
      Each superblock object is created in a server level index object and is itself
      an index into which inode-level objects are inserted.
      
      Ideally there would be one superblock-level object per server, and the former
      would be folded into the latter; however, since the "nosharecache" option
      exists this isn't possible.
      
      The superblock object key is a sequence consisting of:
      
       (1) Certain superblock s_flags.
      
       (2) Various connection parameters that serve to distinguish superblocks for
           sget().
      
       (3) The volume FSID.
      
       (4) The security flavour.
      
       (5) The uniquifier length.
      
       (6) The uniquifier text.  This is normally an empty string, unless the fsc=xyz
           mount option was used to explicitly specify a uniquifier.
      
      The key blob is of variable length, depending on the length of (6).
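
      To illustrate why the key length varies with the uniquifier, here is a sketch that packs the six parts in the listed order. The field widths, the byte order, the reduction of the connection parameters to a single integer, and the split of the FSID into two 64-bit halves are all assumptions made for the example, not the layout used by the patch.

```python
import struct

# Hypothetical fixed-size prefix: s_flags, connection parameters,
# FSID (two halves), security flavour, uniquifier length.
FIXED = struct.Struct("<IIQQIH")


def superblock_key(s_flags, conn_params, fsid_major, fsid_minor,
                   flavour, uniquifier=""):
    """Build a key blob whose length depends on the uniquifier text."""
    text = uniquifier.encode("utf-8")
    prefix = FIXED.pack(s_flags, conn_params, fsid_major, fsid_minor,
                        flavour, len(text))
    return prefix + text
```

      With an empty uniquifier the blob is just the fixed prefix; "fsc=xyz" would add three bytes, which is enough to keep two otherwise identical superblocks apart.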
      
      The superblock object is given no coherency data to carry in the auxiliary data
      permitted by the cache.  It is assumed that the superblock is always coherent.
      
      This patch also adds uniquification handling such that two otherwise identical
      superblocks, at least one of which is marked "nosharecache", won't end up
      trying to share the on-disk cache.  It will be possible to manually provide a
      uniquifier through a mount option with a later patch to avoid the error
      otherwise produced.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Define and create server-level objects · 14727281
      David Howells authored
      Define and create server-level cache index objects (as managed by nfs_client
      structs).
      
      Each server object is created in the NFS top-level index object and is itself
      an index into which superblock-level objects are inserted.
      
      Ideally there would be one superblock-level object per server, and the former
      would be folded into the latter; however, since the "nosharecache" option
      exists this isn't possible.
      
      The server object key is a sequence consisting of:
      
       (1) NFS version
      
       (2) Server address family (eg: AF_INET or AF_INET6)
      
       (3) Server port.
      
       (4) Server IP address.
      
      The key blob is of variable length, depending on the length of (4).
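
      The variable key length follows from the address size: 4 bytes for AF_INET, 16 for AF_INET6. A sketch of such a key, assuming 16-bit fields in network byte order for the first three parts (the widths and the function name are illustrative, not taken from the patch):

```python
import socket
import struct


def server_key(nfs_version: int, family: int, port: int, address: str) -> bytes:
    """Pack version, address family, port and raw IP address into one blob."""
    packed_address = socket.inet_pton(family, address)
    return struct.pack("!HHH", nfs_version, family, port) + packed_address
```
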
      
      The server object is given no coherency data to carry in the auxiliary data
      permitted by the cache.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
    • NFS: Register NFS for caching and retrieve the top-level index · 8ec442ae
      David Howells authored
      Register NFS for caching and retrieve the top-level cache index object cookie.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
      8ec442ae
    • David Howells's avatar
      NFS: Permit local filesystem caching to be enabled for NFS · 3b9ce977
      David Howells authored
      Permit local filesystem caching to be enabled for NFS in the kernel
      configuration.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
      3b9ce977
    • David Howells's avatar
      NFS: Add FS-Cache option bit and debug bit · c6a6f19e
      David Howells authored
      Add FS-Cache option bit to nfs_server struct.  This is set to indicate local
      on-disk caching is enabled for a particular superblock.
      
      Also add debug bit for local caching operations.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
      c6a6f19e
    • David Howells's avatar
      NFS: Add comment banners to some NFS functions · 6b9b3514
      David Howells authored
      Add comment banners to some NFS functions so that the NFS fscache patches
      can extend them with further information.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
      6b9b3514
    • David Howells's avatar
      FS-Cache: Make kAFS use FS-Cache · 9b3f26c9
      David Howells authored
      The attached patch makes the kAFS filesystem in fs/afs/ use FS-Cache, and
      through it any attached caches.  The kAFS filesystem will use caching
      automatically if it's available.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
      9b3f26c9
    • David Howells's avatar
      CacheFiles: A cache that backs onto a mounted filesystem · 9ae326a6
      David Howells authored
      Add an FS-Cache cache-backend that permits a mounted filesystem to be used as a
      backing store for the cache.
      
      CacheFiles uses a userspace daemon to do some of the cache management - such as
      reaping stale nodes and culling.  This is called cachefilesd and lives in
      /sbin.  The source for the daemon can be downloaded from:
      
      	http://people.redhat.com/~dhowells/cachefs/cachefilesd.c
      
      And an example configuration from:
      
      	http://people.redhat.com/~dhowells/cachefs/cachefilesd.conf
      
      The filesystem and data integrity of the cache are only as good as those of the
      filesystem providing the backing services.  Note that CacheFiles does not
      attempt to journal anything since the journalling interfaces of the various
      filesystems are very specific in nature.
      
      CacheFiles creates a misc character device - "/dev/cachefiles" - that is used
      to communicate with the daemon.  Only one thing may have this open at once,
      and whilst it is open, a cache is at least partially in existence.  The daemon
      opens this and sends commands down it to control the cache.
      
      CacheFiles is currently limited to a single cache.
      
      CacheFiles attempts to maintain at least a certain percentage of free space on
      the filesystem, shrinking the cache by culling the objects it contains to make
      space if necessary - see the "Cache Culling" section.  This means it can be
      placed on the same medium as a live set of data, and will expand to make use of
      spare space and automatically contract when the set of data requires more
      space.
      
      ============
      REQUIREMENTS
      ============
      
      The use of CacheFiles and its daemon requires the following features to be
      available in the system and in the cache filesystem:
      
      	- dnotify.
      
      	- extended attributes (xattrs).
      
      	- openat() and friends.
      
      	- bmap() support on files in the filesystem (FIBMAP ioctl).
      
      	- The use of bmap() to detect a partial page at the end of the file.
      
      It is strongly recommended that the "dir_index" option is enabled on Ext3
      filesystems being used as a cache.
      
      =============
      CONFIGURATION
      =============
      
      The cache is configured by a script in /etc/cachefilesd.conf.  These commands
      set up the cache ready for use.  The following script commands are available:
      
       (*) brun <N>%
       (*) bcull <N>%
       (*) bstop <N>%
       (*) frun <N>%
       (*) fcull <N>%
       (*) fstop <N>%
      
      	Configure the culling limits.  Optional.  See the section on culling.
      	The defaults are 7% (run), 5% (cull) and 1% (stop) respectively.
      
      	The commands beginning with a 'b' are file space (block) limits, those
      	beginning with an 'f' are file count limits.
      
       (*) dir <path>
      
      	Specify the directory containing the root of the cache.  Mandatory.
      
       (*) tag <name>
      
      	Specify a tag to FS-Cache to use in distinguishing multiple caches.
      	Optional.  The default is "CacheFiles".
      
       (*) debug <mask>
      
      	Specify a numeric bitmask to control debugging in the kernel module.
      	Optional.  The default is zero (all off).  The following values can be
      	OR'd into the mask to collect various information:
      
      		1	Turn on trace of function entry (_enter() macros)
      		2	Turn on trace of function exit (_leave() macros)
      		4	Turn on trace of internal debug points (_debug())
      
      	This mask can also be set through sysfs, eg:
      
      		echo 5 >/sys/module/cachefiles/parameters/debug
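
      To make the OR'ing concrete, the short sketch below composes a mask
      from the three documented values and decodes it again.  The constant
      names and describe_mask() are invented for this illustration; only the
      numeric values 1, 2 and 4 come from the text above.

```python
# Documented bit values; the Python names are hypothetical.
TRACE_ENTER = 1  # function entry (_enter() macros)
TRACE_LEAVE = 2  # function exit (_leave() macros)
TRACE_DEBUG = 4  # internal debug points (_debug())

NAMES = {
    TRACE_ENTER: "function entry",
    TRACE_LEAVE: "function exit",
    TRACE_DEBUG: "internal debug points",
}


def describe_mask(mask):
    """List the trace categories that a numeric mask switches on."""
    return [name for bit, name in NAMES.items() if mask & bit]


mask = TRACE_ENTER | TRACE_DEBUG
print(mask, describe_mask(mask))
```

      The value 5 used in the echo example is therefore entry tracing plus
      internal debug points, with exit tracing left off.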
      
      ==================
      STARTING THE CACHE
      ==================
      
      The cache is started by running the daemon.  The daemon opens the cache device,
      configures the cache and tells it to begin caching.  At that point the cache
      binds to fscache and the cache becomes live.
      
      The daemon is run as follows:
      
      	/sbin/cachefilesd [-d]* [-s] [-n] [-f <configfile>]
      
      The flags are:
      
       (*) -d
      
      	Increase the debugging level.  This can be specified multiple times and
      	is cumulative with itself.
      
       (*) -s
      
      	Send messages to stderr instead of syslog.
      
       (*) -n
      
      	Don't daemonise and go into background.
      
       (*) -f <configfile>
      
      	Use an alternative configuration file rather than the default one.
      
      ===============
      THINGS TO AVOID
      ===============
      
      Do not mount other things within the cache as this will cause problems.  The
      kernel module contains its own very cut-down path walking facility that ignores
      mountpoints, but the daemon can't avoid them.
      
      Do not create, rename or unlink files and directories in the cache whilst the
      cache is active, as this may cause the state to become uncertain.
      
      Renaming files in the cache might make objects appear to be other objects (the
      filename is part of the lookup key).
      
      Do not change or remove the extended attributes attached to cache files by the
      cache as this will cause the cache state management to get confused.
      
      Do not create files or directories in the cache, lest the cache get confused or
      serve incorrect data.
      
      Do not chmod files in the cache.  The module creates things with minimal
      permissions to prevent random users being able to access them directly.
      
      =============
      CACHE CULLING
      =============
      
      The cache may need culling occasionally to make space.  This involves
      discarding objects from the cache that have been used less recently than
      anything else.  Culling is based on the access time of data objects.  Empty
      directories are culled if not in use.
      
      Cache culling is done on the basis of the percentage of blocks and the
      percentage of files available in the underlying filesystem.  There are six
      "limits":
      
       (*) brun
       (*) frun
      
           If the amount of free space and the number of available files in the cache
           rises above both these limits, then culling is turned off.
      
       (*) bcull
       (*) fcull
      
           If the amount of available space or the number of available files in the
           cache falls below either of these limits, then culling is started.
      
       (*) bstop
       (*) fstop
      
           If the amount of available space or the number of available files in the
           cache falls below either of these limits, then no further allocation of
           disk space or files is permitted until culling has raised things above
           these limits again.
      
      These must be configured thusly:
      
      	0 <= bstop < bcull < brun < 100
      	0 <= fstop < fcull < frun < 100
      
      Note that these are percentages of available space and available files, and do
      _not_ appear as 100 minus the percentage displayed by the "df" program.
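
      The interaction of the six limits can be sketched as a small state
      function.  This is an illustration of the rules as described above,
      not the kernel's code; Limits and next_state() are hypothetical names,
      and the inputs are percentages of available blocks and files.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    run: int
    cull: int
    stop: int

    def validate(self):
        # Mirrors the documented ordering: 0 <= stop < cull < run < 100.
        if not (0 <= self.stop < self.cull < self.run < 100):
            raise ValueError("limits must satisfy 0 <= stop < cull < run < 100")


def next_state(culling, b_avail, f_avail, blocks, files):
    """Return (culling, allocation_allowed) for the given availability."""
    if b_avail < blocks.cull or f_avail < files.cull:
        # Either resource falling below its cull limit starts culling.
        culling = True
    elif b_avail > blocks.run and f_avail > files.run:
        # Culling is turned off only once both exceed their run limits.
        culling = False
    # Between cull and run the previous state is kept.
    allocation_allowed = not (b_avail < blocks.stop or f_avail < files.stop)
    return culling, allocation_allowed


blocks = Limits(run=7, cull=5, stop=1)  # the documented defaults
files = Limits(run=7, cull=5, stop=1)
blocks.validate()
files.validate()
print(next_state(False, 4, 10, blocks, files))
```

      Keeping the previous state between the cull and run limits is a reading
      of the text: culling that has started continues until both run limits
      are exceeded.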
      
      The userspace daemon scans the cache to build up a table of cullable objects.
      These are then culled in least recently used order.  A new scan of the cache is
      started as soon as space is made in the table.  Objects will be skipped if
      their atimes have changed or if the kernel module says it is still using them.
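
      A minimal sketch of that scan-then-cull loop follows.  It assumes the
      scan yields (name, atime) pairs and that the caller supplies callbacks
      for re-reading an atime and asking whether an object is in use; these
      names are hypothetical, and the real daemon is written in C.

```python
import heapq


def build_cull_table(objects, table_size):
    """Keep the table_size objects with the oldest atimes, oldest first."""
    return heapq.nsmallest(table_size, objects, key=lambda obj: obj[1])


def cull(table, current_atime, in_use):
    """Return the names that would be culled, in least recently used order."""
    culled = []
    for name, scanned_atime in table:
        if current_atime(name) != scanned_atime:
            continue  # touched since the scan, so skip it
        if in_use(name):
            continue  # the kernel module reports it as still in use
        culled.append(name)
    return culled


scanned = [("a", 30), ("b", 10), ("c", 20), ("d", 40)]
table = build_cull_table(scanned, 3)
now = {"a": 30, "b": 10, "c": 25, "d": 40}
print(cull(table, lambda n: now[n], lambda n: n == "a"))
```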
      
      ===============
      CACHE STRUCTURE
      ===============
      
      The CacheFiles module will create two directories in the directory it was
      given:
      
       (*) cache/
      
       (*) graveyard/
      
      The active cache objects all reside in the first directory.  The CacheFiles
      kernel module moves any retired or culled objects that it can't simply unlink
      to the graveyard from which the daemon will actually delete them.
      
      The daemon uses dnotify to monitor the graveyard directory, and will delete
      anything that appears therein.
      
      The module represents index objects as directories with the filename "I..." or
      "J...".  Note that the "cache/" directory is itself a special index.
      
      Data objects are represented as files if they have no children, or directories
      if they do.  Their filenames all begin "D..." or "E...".  If represented as a
      directory, data objects will have a file in the directory called "data" that
      actually holds the data.
      
      Special objects are similar to data objects, except their filenames begin
      "S..." or "T...".
      
      If an object has children, then it will be represented as a directory.
      Immediately in the representative directory are a collection of directories
      named for hash values of the child object keys with an '@' prepended.  Into
      this directory, if possible, will be placed the representations of the child
      objects:
      
      	INDEX     INDEX      INDEX                             DATA FILES
      	========= ========== ================================= ================
      	cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400
      	cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...DB1ry
      	cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...N22ry
      	cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...FP1ry
      
      If the key is so long that it exceeds NAME_MAX with the decorations added on to
      it, then it will be cut into pieces, the first few of which will be used to
      make a nest of directories, and the last one of which will be the objects
      inside the last directory.  The names of the intermediate directories will have
      '+' prepended:
      
      	J1223/@23/+xy...z/+kl...m/Epqr
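
      The splitting can be sketched as below.  The text does not give the
      exact piece size, so the sketch assumes each path component holds one
      decoration character plus as much of the key as fits; split_key() is a
      hypothetical helper and the '@' hash directory is left out.

```python
NAME_MAX = 255


def split_key(prefix, name, limit=NAME_MAX):
    """Split an over-long name into '+' directories and a final object."""
    # Assumption: one decoration character per component, so each piece
    # may carry at most limit - 1 characters of the key.
    chunk = limit - 1
    pieces = [name[i:i + chunk] for i in range(0, len(name), chunk)] or [""]
    # Intermediate pieces become '+' directories; the last piece carries
    # the object type prefix (for example "E" for an encoded data object).
    return ["+" + piece for piece in pieces[:-1]] + [prefix + pieces[-1]]


print("/".join(split_key("E", "abcdefgh", limit=4)))
```

      A name that already fits yields a single component with no '+'
      directories at all.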
      
      Note that keys are raw data, and not only may they exceed NAME_MAX in size,
      they may also contain things like '/' and NUL characters, and so they may not
      be suitable for turning directly into a filename.
      
      To handle this, CacheFiles will use a suitably printable filename directly and
      "base-64" encode ones that aren't directly suitable.  The two versions of
      object filenames indicate the encoding:
      
      	OBJECT TYPE	PRINTABLE	ENCODED
      	===============	===============	===============
      	Index		"I..."		"J..."
      	Data		"D..."		"E..."
      	Special		"S..."		"T..."
      
      Intermediate directories are always "@" or "+" as appropriate.
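
      The choice of prefix can be sketched as follows.  Two things here are
      assumptions: "suitably printable" is approximated as printable ASCII
      without '/', and Python's URL-safe base64 stands in for whatever
      "base-64" variant CacheFiles actually uses.  The sketch does not try
      to reproduce the real on-disk names shown earlier.

```python
import base64

# (printable prefix, encoded prefix) per object type, from the table above.
PREFIXES = {"index": ("I", "J"), "data": ("D", "E"), "special": ("S", "T")}


def object_filename(obj_type, key):
    """Pick the printable or encoded form of a raw key (bytes)."""
    printable_prefix, encoded_prefix = PREFIXES[obj_type]
    # Approximation of "suitably printable": non-empty, visible ASCII only,
    # and no '/' since that cannot appear inside a filename.
    if key and all(0x20 < b < 0x7F and b != 0x2F for b in key):
        return printable_prefix + key.decode("ascii")
    # URL-safe base64 never emits '/' or NUL, so the result is a valid name.
    return encoded_prefix + base64.urlsafe_b64encode(key).decode("ascii")


print(object_filename("index", b"nfs"))
print(object_filename("data", b"a/b\x00"))
```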
      
      Each object in the cache has an extended attribute label that holds the object
      type ID (required to distinguish special objects) and the auxiliary data from
      the netfs.  The latter is used to detect stale objects in the cache and update
      or retire them.
      
      Note that CacheFiles will erase from the cache any file it doesn't recognise or
      any file of an incorrect type (such as a FIFO file or a device file).
      
      ==========================
      SECURITY MODEL AND SELINUX
      ==========================
      
      CacheFiles is implemented to deal properly with the LSM security features of
      the Linux kernel and the SELinux facility.
      
      One of the problems that CacheFiles faces is that it is generally acting on
      behalf of a process, and running in that process's context, and that includes a
      security context that is not appropriate for accessing the cache - either
      because the files in the cache are inaccessible to that process, or because if
      the process creates a file in the cache, that file may be inaccessible to other
      processes.
      
      The way CacheFiles works is to temporarily change the security context (fsuid,
      fsgid and actor security label) that the process acts as - without changing the
      security context of the process when it is the target of an operation performed by
      some other process (so signalling and suchlike still work correctly).
      
      When the CacheFiles module is asked to bind to its cache, it:
      
       (1) Finds the security label attached to the root cache directory and uses
           that as the security label with which it will create files.  By default,
           this is:
      
      	cachefiles_var_t
      
       (2) Finds the security label of the process which issued the bind request
           (presumed to be the cachefilesd daemon), which by default will be:
      
      	cachefilesd_t
      
           and asks LSM to supply a security ID as which it should act given the
           daemon's label.  By default, this will be:
      
      	cachefiles_kernel_t
      
           SELinux transitions the daemon's security ID to the module's security ID
           based on a rule of this form in the policy.
      
      	type_transition <daemon's-ID> kernel_t : process <module's-ID>;
      
           For instance:
      
      	type_transition cachefilesd_t kernel_t : process cachefiles_kernel_t;
      
      The module's security ID gives it permission to create, move and remove files
      and directories in the cache, to find and access directories and files in the
      cache, to set and access extended attributes on cache objects, and to read and
      write files in the cache.
      
      The daemon's security ID gives it only a very restricted set of permissions: it
      may scan directories, stat files and erase files and directories.  It may
      not read or write files in the cache, and so it is precluded from accessing the
      data cached therein; nor is it permitted to create new files in the cache.
      
      There are policy source files available in:
      
      	http://people.redhat.com/~dhowells/fscache/cachefilesd-0.8.tar.bz2
      
      and later versions.  In that tarball, see the files:
      
      	cachefilesd.te
      	cachefilesd.fc
      	cachefilesd.if
      
      They are built and installed directly by the RPM.
      
      If a non-RPM based system is being used, then copy the above files to their own
      directory and run:
      
      	make -f /usr/share/selinux/devel/Makefile
      	semodule -i cachefilesd.pp
      
      You will need checkpolicy and selinux-policy-devel installed prior to the
      build.
      
      By default, the cache is located in /var/fscache, but if it is desirable that
      it should be elsewhere, then either the above policy files must be altered, or
      an auxiliary policy must be installed to label the alternate location of the
      cache.
      
      For instructions on how to add an auxiliary policy to enable the cache to be
      located elsewhere when SELinux is in enforcing mode, please see:
      
      	/usr/share/doc/cachefilesd-*/move-cache.txt
      
      when the cachefilesd RPM is installed; alternatively, the document can be found
      in the sources.
      
      ==================
      A NOTE ON SECURITY
      ==================
      
      CacheFiles makes use of the split security in the task_struct.  It allocates
      its own task_security structure, and redirects current->act_as to point to it
      when it acts on behalf of another process, in that process's context.
      
      The reason it does this is that it calls vfs_mkdir() and suchlike rather than
      bypassing security and calling inode ops directly.  Therefore the VFS and LSM
      may deny the CacheFiles access to the cache data because under some
      circumstances the caching code is running in the security context of whatever
      process issued the original syscall on the netfs.
      
      Furthermore, should CacheFiles create a file or directory, the security
      parameters with which that object is created (UID, GID, security label) would be
      derived from the process that issued the system call, thus potentially
      preventing other processes from accessing the cache - including CacheFiles's
      cache management daemon (cachefilesd).
      
      What is required is to temporarily override the security of the process that
      issued the system call.  We can't, however, just do an in-place change of the
      security data as that affects the process as an object, not just as a subject.
      This means it may lose signals or ptrace events for example, and affects what
      the process looks like in /proc.
      
      So CacheFiles makes use of a logical split in the security between the
      objective security (task->sec) and the subjective security (task->act_as).  The
      objective security holds the intrinsic security properties of a process and is
      never overridden.  This is what appears in /proc, and is what is used when a
      process is the target of an operation by some other process (SIGKILL for
      example).
      
      The subjective security holds the active security properties of a process, and
      may be overridden.  This is not seen externally, and is used when a process
      acts upon another object, for example SIGKILLing another process or opening a
      file.
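
      The split can be pictured with a toy model.  This is not kernel code:
      Task, Security and override_subjective() are hypothetical stand-ins
      for the task_struct fields named above, and user_t is a made-up label
      for the calling process.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class Security:
    fsuid: int
    fsgid: int
    label: str


class Task:
    def __init__(self, sec):
        self.sec = sec      # objective: what other processes see and target
        self.act_as = sec   # subjective: what the task acts with


@contextmanager
def override_subjective(task, cache_sec):
    """Temporarily act with the cache's security; objective is untouched."""
    saved = task.act_as
    task.act_as = cache_sec
    try:
        yield task
    finally:
        task.act_as = saved  # always restore the caller's own security


user = Security(1000, 1000, "user_t")
cache = Security(0, 0, "cachefiles_kernel_t")
task = Task(user)
with override_subjective(task, cache):
    print(task.sec.label, task.act_as.label)
print(task.sec.label, task.act_as.label)
```

      Inside the with block only act_as changes, which corresponds to the
      point that signals and /proc continue to see the original security.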
      
      LSM hooks exist that allow SELinux (or Smack or whatever) to reject a request
      for CacheFiles to run in a context of a specific security label, or to create
      files and directories with another security label.
      
      This documentation is added by the patch to:
      
      	Documentation/filesystems/caching/cachefiles.txt
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
      9ae326a6