1. 01 Jun, 2012 40 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 1193755a
      Linus Torvalds authored
      Pull vfs changes from Al Viro.
       "A lot of misc stuff.  The obvious groups:
         * Miklos' atomic_open series; kills the damn abuse of
           ->d_revalidate() by NFS, which was the major stumbling block for
           all work in that area.
         * ripping security_file_mmap() and dealing with deadlocks in the
           area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
           general.
         * ->encode_fh() switched to saner API; insane fake dentry in
           mm/cleancache.c gone.
         * assorted annotations in fs (endianness, __user)
         * parts of Artem's ->s_dirty work (jffs2 and reiserfs parts)
         * ->update_time() work from Josef.
         * other bits and pieces all over the place.
      
        Normally it would've been in two or three pull requests, but
        signal.git stuff had eaten a lot of time during this cycle ;-/"
      
      Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
      'truncate_range' inode method was removed by the VM changes, the VFS
      update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
      to sparse fix added twice, with other changes nearby).
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
        nfs: don't open in ->d_revalidate
        vfs: retry last component if opening stale dentry
        vfs: nameidata_to_filp(): don't throw away file on error
        vfs: nameidata_to_filp(): inline __dentry_open()
        vfs: do_dentry_open(): don't put filp
        vfs: split __dentry_open()
        vfs: do_last() common post lookup
        vfs: do_last(): add audit_inode before open
        vfs: do_last(): only return EISDIR for O_CREAT
        vfs: do_last(): check LOOKUP_DIRECTORY
        vfs: do_last(): make ENOENT exit RCU safe
        vfs: make follow_link check RCU safe
        vfs: do_last(): use inode variable
        vfs: do_last(): inline walk_component()
        vfs: do_last(): make exit RCU safe
        vfs: split do_lookup()
        Btrfs: move over to use ->update_time
        fs: introduce inode operation ->update_time
        reiserfs: get rid of resierfs_sync_super
        reiserfs: mark the superblock as dirty a bit later
        ...
      1193755a
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 4edebed8
      Linus Torvalds authored
      Pull Ext4 updates from Theodore Ts'o:
       "The major new feature added in this update is Darrick J Wong's
        metadata checksum feature, which adds crc32 checksums to ext4's
        metadata fields.
      
        There is also the usual set of cleanups and bug fixes."
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (44 commits)
        ext4: hole-punch use truncate_pagecache_range
        jbd2: use kmem_cache_zalloc wrapper instead of flag
        ext4: remove mb_groups before tearing down the buddy_cache
        ext4: add ext4_mb_unload_buddy in the error path
        ext4: don't trash state flags in EXT4_IOC_SETFLAGS
        ext4: let getattr report the right blocks in delalloc+bigalloc
        ext4: add missing save_error_info() to ext4_error()
        ext4: add debugging trigger for ext4_error()
        ext4: protect group inode free counting with group lock
        ext4: use consistent ssize_t type in ext4_file_write()
        ext4: fix format flag in ext4_ext_binsearch_idx()
        ext4: cleanup in ext4_discard_allocated_blocks()
        ext4: return ENOMEM when mounts fail due to lack of memory
        ext4: remove redundundant "(char *) bh->b_data" casts
        ext4: disallow hard-linked directory in ext4_lookup
        ext4: fix potential integer overflow in alloc_flex_gd()
        ext4: remove needs_recovery in ext4_mb_init()
        ext4: force ro mount if ext4_setup_super() fails
        ext4: fix potential NULL dereference in ext4_free_inodes_counts()
        ext4/jbd2: add metadata checksumming to the list of supported features
        ...
      4edebed8
    • Miklos Szeredi's avatar
      nfs: don't open in ->d_revalidate · 0ef97dcf
      Miklos Szeredi authored
      NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
      mount needs to be followed or not.  It does check d_mountpoint() on the dentry,
      which can result in a weird error if the VFS found that the mount does not in
      fact need to be followed, e.g.:
      
        # mount --bind /mnt/nfs /mnt/nfs-clone
        # echo something > /mnt/nfs/tmp/bar
        # echo x > /tmp/file
        # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
        # cat  /mnt/nfs/tmp/bar
        cat: /mnt/nfs/tmp/bar: Not a directory
      
      Which should, by any sane filesystem, result in "something" being printed.
      
      So instead do the open in f_op->open() and in the unlikely case that the cached
      dentry turned out to be invalid, drop the dentry and return EOPENSTALE to let
      the VFS retry.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      CC: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      0ef97dcf
    • Miklos Szeredi's avatar
      vfs: retry last component if opening stale dentry · 16b1c1cd
      Miklos Szeredi authored
      NFS optimizes away d_revalidates for last component of open.  This means that
      open itself can find the dentry stale.
      
      This patch allows the filesystem to return EOPENSTALE and the VFS will retry the
      lookup on just the last component if possible.
      
      If the lookup was done using RCU mode, including the last component, then this
      is not possible since the parent dentry is lost.  In this case fall back to
      non-RCU lookup.  Currently this is not used since NFS will always leave RCU
      mode.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      16b1c1cd
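      The retry rule described in this commit can be sketched in userspace C.
      This is only an illustration under stated assumptions: every name with a
      sketch_ or SKETCH_ prefix is hypothetical, the single-retry limit is an
      assumption of the sketch, and the real logic lives in the VFS path walk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel-internal EOPENSTALE code. */
#define SKETCH_EOPENSTALE 518

/* Hypothetical model of one open attempt and its bookkeeping. */
struct sketch_lookup {
    bool used_rcu;     /* whole walk, last component included, ran in RCU mode */
    int stale_opens;   /* how many times the filesystem will report "stale" */
    int last_lookups;  /* retries that redid only the last component */
    int full_lookups;  /* complete path walks (initial walk included) */
};

/* Pretend filesystem ->open(): reports a stale dentry stale_opens times. */
static int sketch_fs_open(struct sketch_lookup *l)
{
    if (l->stale_opens > 0) {
        l->stale_opens--;
        return -SKETCH_EOPENSTALE;
    }
    return 0;
}

/* Open with one retry: last component only, or a full non-RCU walk. */
int sketch_open(struct sketch_lookup *l)
{
    bool retried = false;

    assert(l != NULL);
    l->full_lookups++;                 /* the initial path walk */
    for (;;) {
        int err = sketch_fs_open(l);

        if (err != -SKETCH_EOPENSTALE)
            return err;
        if (retried)
            return -ESTALE;            /* assumption: give up after one retry */
        retried = true;
        if (l->used_rcu) {
            /* Parent dentry is lost in RCU mode: redo the walk without RCU. */
            l->used_rcu = false;
            l->full_lookups++;
        } else {
            /* Parent is still pinned: look up just the last component again. */
            l->last_lookups++;
        }
    }
}
```

      The sketch keeps the two cases from the message separate: a cheap
      last-component retry when the parent is still available, and a full
      non-RCU walk otherwise.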
    • Miklos Szeredi's avatar
      vfs: nameidata_to_filp(): don't throw away file on error · 50ee93af
      Miklos Szeredi authored
      If open fails, don't put the file.  This allows it to be reused if open needs to
      be retried.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      50ee93af
    • Miklos Szeredi's avatar
      vfs: nameidata_to_filp(): inline __dentry_open() · 91daee98
      Miklos Szeredi authored
      Copy __dentry_open() into nameidata_to_filp().
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      91daee98
    • Miklos Szeredi's avatar
      vfs: do_dentry_open(): don't put filp · 78f71eff
      Miklos Szeredi authored
      Move put_filp() out to __dentry_open(), the only caller now.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      78f71eff
    • Miklos Szeredi's avatar
      vfs: split __dentry_open() · 90ad1a8e
      Miklos Szeredi authored
      Split __dentry_open() into two functions:
      
        do_dentry_open() - does most of the actual work, doesn't put file on failure
        open_check_o_direct() - after a successful open, checks direct_IO method
      
      This will allow i_op->atomic_open to do just the file initialization and leave
      the direct_IO checking to the VFS.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      90ad1a8e
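      The two-step split can be illustrated with a small userspace model. The
      types and the sketch_ names below are hypothetical stand-ins, not the
      kernel's structures; the point is only that the first helper leaves the
      file alone on failure while the wrapper decides when to drop it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_O_DIRECT 0x1u   /* illustrative flag bit, not the real O_DIRECT */

/* Hypothetical stand-in for an open file. */
struct sketch_file {
    unsigned int flags;    /* open flags requested by the caller */
    bool has_direct_io;    /* does the filesystem provide a direct I/O method? */
    bool opened;           /* set once initialisation succeeded */
    bool released;         /* set when the wrapper drops the file */
};

/* Step 1: do the actual open work; never drops the file on failure. */
static int sketch_do_dentry_open(struct sketch_file *f, int fs_open_result)
{
    assert(f != NULL);
    if (fs_open_result < 0)
        return fs_open_result;   /* caller still owns f and may reuse it */
    f->opened = true;
    return 0;
}

/* Step 2: after a successful open, reject O_DIRECT without direct I/O support. */
static int sketch_open_check_o_direct(const struct sketch_file *f)
{
    if ((f->flags & SKETCH_O_DIRECT) && !f->has_direct_io)
        return -EINVAL;
    return 0;
}

/* Wrapper in the role of the old combined function: run both steps. */
int sketch_dentry_open(struct sketch_file *f, int fs_open_result)
{
    int err = sketch_do_dentry_open(f, fs_open_result);

    if (!err)
        err = sketch_open_check_o_direct(f);
    if (err)
        f->released = true;      /* only the wrapper gives the file up */
    return err;
}
```

      With this shape, a caller that wants only the initialisation step (as
      atomic_open is said to) can call the first helper and leave the direct
      I/O check to common code.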
    • Miklos Szeredi's avatar
      vfs: do_last() common post lookup · 5f5daac1
      Miklos Szeredi authored
      Now the post lookup code can be shared between O_CREAT and plain opens since
      they are essentially the same.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      5f5daac1
    • Miklos Szeredi's avatar
      vfs: do_last(): add audit_inode before open · d7fdd7f6
      Miklos Szeredi authored
      This allows this code to be shared between O_CREAT and plain opens.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d7fdd7f6
    • Miklos Szeredi's avatar
      vfs: do_last(): only return EISDIR for O_CREAT · 050ac841
      Miklos Szeredi authored
      This allows this code to be shared between O_CREAT and plain opens.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      050ac841
    • Miklos Szeredi's avatar
      vfs: do_last(): check LOOKUP_DIRECTORY · af2f5542
      Miklos Szeredi authored
      Check for ENOTDIR before finishing open.  This allows this code to be shared
      between O_CREAT and plain opens.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      af2f5542
    • Miklos Szeredi's avatar
      vfs: do_last(): make ENOENT exit RCU safe · 54c33e7f
      Miklos Szeredi authored
      This will allow this code to be used in RCU mode.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      54c33e7f
    • Miklos Szeredi's avatar
      vfs: make follow_link check RCU safe · d45ea867
      Miklos Szeredi authored
      This will allow this code to be used in RCU mode.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d45ea867
    • Miklos Szeredi's avatar
      vfs: do_last(): use inode variable · decf3400
      Miklos Szeredi authored
      Use helper variable instead of path->dentry->d_inode before complete_walk().
      This will allow this code to be used in RCU mode.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      decf3400
    • Miklos Szeredi's avatar
      vfs: do_last(): inline walk_component() · a1eb3315
      Miklos Szeredi authored
      Copy walk_component() into do_last().
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a1eb3315
    • Miklos Szeredi's avatar
      vfs: do_last(): make exit RCU safe · e276ae67
      Miklos Szeredi authored
      Allow returning from do_last() with LOOKUP_RCU still set on the "out:" and
      "exit:" labels.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      e276ae67
    • Miklos Szeredi's avatar
      vfs: split do_lookup() · 697f514d
      Miklos Szeredi authored
      Split do_lookup() into two functions:
      
        lookup_fast() - does cached lookup without i_mutex
        lookup_slow() - does lookup with i_mutex
      
      Both follow managed dentries.
      
      The new functions are needed by atomic_open.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      697f514d
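      A minimal userspace sketch of the fast/slow split follows. It assumes a
      toy directory with a fixed-size name cache; the sketch_ names, the
      counter standing in for i_mutex, and the "0 means hit, 1 means take the
      slow path" return convention are assumptions of this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX 4

/* Hypothetical toy directory: a small name cache plus backing storage. */
struct sketch_dir {
    const char *cached[SKETCH_MAX];   /* names already in the cache */
    int ncached;
    const char *on_disk[SKETCH_MAX];  /* names the filesystem actually has */
    int ndisk;
    int mutex_taken;                  /* counts stand-in i_mutex acquisitions */
};

/* Cached lookup without the mutex: 0 on hit, 1 if the slow path is needed. */
static int sketch_lookup_fast(const struct sketch_dir *d, const char *name)
{
    for (int i = 0; i < d->ncached; i++)
        if (strcmp(d->cached[i], name) == 0)
            return 0;
    return 1;
}

/* Lookup under the mutex: asks the filesystem and populates the cache. */
static int sketch_lookup_slow(struct sketch_dir *d, const char *name)
{
    d->mutex_taken++;                 /* "lock" held for the real lookup */
    for (int i = 0; i < d->ndisk; i++) {
        if (strcmp(d->on_disk[i], name) == 0) {
            if (d->ncached < SKETCH_MAX)
                d->cached[d->ncached++] = d->on_disk[i];
            return 0;
        }
    }
    return -ENOENT;
}

/* Caller in the role of the path walk: try fast first, then fall back. */
int sketch_lookup(struct sketch_dir *d, const char *name)
{
    assert(d != NULL && name != NULL);
    if (sketch_lookup_fast(d, name) == 0)
        return 0;
    return sketch_lookup_slow(d, name);
}
```

      Keeping the two halves as separate functions lets a caller invoke either
      one on its own, which is the property the commit message says
      atomic_open needs.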
    • Josef Bacik's avatar
      Btrfs: move over to use ->update_time · e41f941a
      Josef Bacik authored
      Btrfs had been doing its own file_update_time so we could catch ENOSPC
      properly, so just update our btrfs_update_time to work with the new stuff and
      then we'll be fancy later.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
      e41f941a
    • Josef Bacik's avatar
      fs: introduce inode operation ->update_time · c3b2da31
      Josef Bacik authored
      Btrfs has to make sure we have space to allocate new blocks in order to modify
      the inode, so updating time can fail.  We've gotten around this by having our
      own file_update_time but this is kind of a pain, and Christoph has indicated he
      would like to make xfs do something different with atime updates.  So introduce
      ->update_time, where we will deal with i_version and a/m/c time updates and
      indicate which changes need to be made.  The normal version just does what it
      has always done, updates the time and marks the inode dirty, and then
      filesystems can choose to do something different.
      
      I've gone through all of the users of file_update_time and made them check for
      errors with the exception of the fault code since it's complicated and I wasn't
      quite sure what to do there, also Jan is going to be pushing the file time
      updates into page_mkwrite for those who have it so that should satisfy btrfs and
      make it not a big deal to check the file_update_time() return code in the
      generic fault path. Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
      c3b2da31
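      The idea of an optional per-filesystem time-update hook with a generic
      fallback can be sketched as below. This is a userspace model, not the
      kernel interface: the struct layouts, the SKETCH_ flag values and the
      "no free blocks means ENOSPC" rule are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative flags naming which fields the caller wants updated. */
enum {
    SKETCH_ATIME   = 1,
    SKETCH_MTIME   = 2,
    SKETCH_CTIME   = 4,
    SKETCH_VERSION = 8
};

struct sketch_inode;

/* Hypothetical operations table with an optional update_time hook. */
struct sketch_inode_ops {
    int (*update_time)(struct sketch_inode *inode, long now, int flags);
};

struct sketch_inode {
    long atime, mtime, ctime;
    unsigned long version;
    int dirty;
    long free_blocks;                    /* used only by the COW example */
    const struct sketch_inode_ops *ops;  /* NULL means "use the default" */
};

/* Default behaviour: update the requested fields and mark the inode dirty. */
static int sketch_generic_update_time(struct sketch_inode *inode, long now, int flags)
{
    if (flags & SKETCH_ATIME)   inode->atime = now;
    if (flags & SKETCH_MTIME)   inode->mtime = now;
    if (flags & SKETCH_CTIME)   inode->ctime = now;
    if (flags & SKETCH_VERSION) inode->version++;
    inode->dirty = 1;
    return 0;
}

/* A copy-on-write style filesystem may fail when it cannot reserve space. */
static int sketch_cow_update_time(struct sketch_inode *inode, long now, int flags)
{
    if (inode->free_blocks == 0)
        return -ENOSPC;
    return sketch_generic_update_time(inode, now, flags);
}

const struct sketch_inode_ops sketch_cow_ops = {
    .update_time = sketch_cow_update_time,
};

/* Caller side: dispatch to the hook if present and propagate its error. */
int sketch_file_update_time(struct sketch_inode *inode, long now, int flags)
{
    assert(inode != NULL);
    if (inode->ops && inode->ops->update_time)
        return inode->ops->update_time(inode, now, flags);
    return sketch_generic_update_time(inode, now, flags);
}
```

      Because the dispatcher returns the hook's result, callers have to check
      it, which mirrors the commit's note about making users of
      file_update_time check for errors.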
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 51eab603
      Linus Torvalds authored
      Pull btrfs updates from Chris Mason:
       "This includes a fairly large change from Josef around data writeback
        completion.  Before, the writeback wasn't completed until the metadata
        insertions for the extent were done, and this made for fairly large
        latency spikes on the last page of each ordered extent.
      
        We already had a separate mechanism for tracking pending metadata
        insertions, so Josef just needed to tweak things a little to end
        writeback earlier on the page.  Overall it makes us much friendlier to
        memory reclaim and lowers latencies quite a lot for synchronous IO.
      
        Jan Schmidt has finished some background work required to track btree
        blocks as they go through changes in ownership.  It's the missing
        piece he needed for both btrfs send/receive and subvolume quotas.
        Neither of those are ready yet, but the new tracking code is included
        here.  Most of the time, the new code is off.  It is only used by
        scrub and other backref walkers.
      
        Stefan Behrens has added io failure tracking.  This includes counters
        for which drives are causing the most trouble so the admin (or an
        automated tool) can choose to kick them out.  We're tracking IO
        errors, crc errors, and generation checks we do on each metadata
        block.
      
        RAID5/6 did miss the cut this time because I'm having trouble with
        corruptions.  I'll nail it down next week and post as a beta testing
        before 3.6"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (58 commits)
        Btrfs: fix tree mod log rewinded level and rewinding of moved keys
        Btrfs: fix tree mod log del_ptr
        Btrfs: add tree_mod_dont_log helper
        Btrfs: add missing spin_lock for insertion into tree mod log
        Btrfs: add inodes before dropping the extent lock in find_all_leafs
        Btrfs: use delayed ref sequence numbers for all fs-tree updates
        Btrfs: fix false positive in check-integrity on unmount
        Btrfs: fix runtime warning in check-integrity check data mode
        Btrfs: set ioprio of scrub readahead to idle
        Btrfs: fix return code in drop_objectid_items
        Btrfs: check to see if the inode is in the log before fsyncing
        Btrfs: return value of btrfs_read_buffer is checked correctly
        Btrfs: read device stats on mount, write modified ones during commit
        Btrfs: add ioctl to get and reset the device stats
        Btrfs: add device counters for detected IO and checksum errors
        btrfs: Drop unused function btrfs_abort_devices()
        Btrfs: fix the same inode id problem when doing auto defragment
        Btrfs: fall back to non-inline if we don't have enough space
        Btrfs: fix how we deal with the orphan block rsv
        Btrfs: convert the inode bit field to use the actual bit operations
        ...
      51eab603
    • Linus Torvalds's avatar
      Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linux · 419f4319
      Linus Torvalds authored
      Pull the rest of the nfsd commits from Bruce Fields:
       "... and then I cherry-picked the remainder of the patches from the
        head of my previous branch"
      
      This is the rest of the original nfsd branch, rebased without the
      delegation stuff that I thought really needed to be redone.
      
      I don't like rebasing things like this in general, but in this situation
      this was the lesser of two evils.
      
      * 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits)
        nfsd4: fix, consolidate client_has_state
        nfsd4: don't remove rebooted client record until confirmation
        nfsd4: remove some dprintk's and a comment
        nfsd4: return "real" sequence id in confirmed case
        nfsd4: fix exchange_id to return confirm flag
        nfsd4: clarify that renewing expired client is a bug
        nfsd4: simpler ordering of setclientid_confirm checks
        nfsd4: setclientid: remove pointless assignment
        nfsd4: fix error return in non-matching-creds case
        nfsd4: fix setclientid_confirm same_cred check
        nfsd4: merge 3 setclientid cases to 2
        nfsd4: pull out common code from setclientid cases
        nfsd4: merge last two setclientid cases
        nfsd4: setclientid/confirm comment cleanup
        nfsd4: setclientid remove unnecessary terms from a logical expression
        nfsd4: move rq_flavor into svc_cred
        nfsd4: stricter cred comparison for setclientid/exchange_id
        nfsd4: move principal name into svc_cred
        nfsd4: allow removing clients not holding state
        nfsd4: rearrange exchange_id logic to simplify
        ...
      419f4319
    • Artem Bityutskiy's avatar
      reiserfs: get rid of resierfs_sync_super · 033369d1
      Artem Bityutskiy authored
      This patch stops reiserfs using the VFS 'write_super()' method along with the
      s_dirt flag, because they are on their way out.
      
      The whole "superblock write-out" VFS infrastructure is served by the
      'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
      writes out all dirty superblocks using the '->write_super()' call-back.  But the
      problem with this thread is that it wastes power by waking up the system every
      5 seconds, even if there are no dirty superblocks, or there are no client
      file-systems which would need this (e.g., btrfs does not use
      '->write_super()'). So we want to kill it completely and thus, we need to make
      file-systems stop using the '->write_super()' VFS service, and then remove
      it together with the kernel thread.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      033369d1
    • Artem Bityutskiy's avatar
      reiserfs: mark the superblock as dirty a bit later · 5c5fd819
      Artem Bityutskiy authored
      The 'journal_mark_dirty()' function currently first marks the superblock as
      dirty by setting 's_dirt' to 1, then does various sanity checks and returns,
      then actually does all the magic with the journal.
      
      This is not an ideal order, though. It makes more sense to first do all the
      checks, then do all the internal stuff, and at the end notify the VFS that the
      superblock is now dirty.
      
      This patch moves the 's_dirt = 1' assignment from the very beginning of this
      function to the very end.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      5c5fd819
    • Artem Bityutskiy's avatar
      reiserfs: remove useless superblock dirtying · 717f03c4
      Artem Bityutskiy authored
      The 'reiserfs_resize()' function marks the superblock as dirty by assigning 1
      to 's_dirt' and then calls 'journal_mark_dirty()' which does the same. Thus,
      we can remove the assignment from 'reiserfs_resize()'.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      717f03c4
    • Artem Bityutskiy's avatar
      reiserfs: clean-up function return type · 25729b0e
      Artem Bityutskiy authored
      Turn 'reiserfs_flush_old_commits()' into a void function because the callers
      do not care about what it returns anyway.
      
      We are going to remove the 'sb->s_dirt' field completely and this patch is a
      small step towards this direction.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      25729b0e
    • Artem Bityutskiy's avatar
      reiserfs: cleanup reiserfs_fill_super a bit · efaa33eb
      Artem Bityutskiy authored
      We have the reiserfs superblock pointer in the 'sbi' variable in this
      function, no need to use the 'REISERFS_SB(s)' macro which is the same.
      This is just a small clean-up.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      efaa33eb
    • Al Viro's avatar
      sch_atm.c: get rid of poinless extern · d5836751
      Al Viro authored
      sockfd_lookup() is declared in linux/net.h, which is pulled by
      linux/skbuff.h (and needed for a lot of other stuff in sch_atm.c
      anyway).
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d5836751
    • Al Viro's avatar
      unexport do_munmap() · 17d1587f
      Al Viro authored
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      17d1587f
    • Al Viro's avatar
      new helper: vm_mmap_pgoff() · eb36c587
      Al Viro authored
      take it to mm/util.c, convert vm_mmap() to use of that one and
      take it to mm/util.c as well, convert both sys_mmap_pgoff() to
      use of vm_mmap_pgoff()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      eb36c587
    • Al Viro's avatar
      kill do_mmap() completely · dc982501
      Al Viro authored
      just pull into vm_mmap()
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      dc982501
    • Al Viro's avatar
      switch aio and shm to do_mmap_pgoff(), make do_mmap() static · e3fc629d
      Al Viro authored
      after all, 0 bytes and 0 pages is the same thing...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      e3fc629d
    • Al Viro's avatar
      move security_mmap_addr() to saner place · 9ac4ed4b
      Al Viro authored
      it really should be done by get_unmapped_area(); that cuts down on
      the amount of callers considerably and it's the right place for
      that stuff anyway.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      9ac4ed4b
    • Al Viro's avatar
      take security_mmap_file() outside of ->mmap_sem · 8b3ec681
      Al Viro authored
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      8b3ec681
    • Hugh Dickins's avatar
      ext4: hole-punch use truncate_pagecache_range · 5e44f8c3
      Hugh Dickins authored
      When truncating a file, we unmap pages from userspace first, as that's
      usually more efficient than relying, page by page, on the fallback in
      truncate_inode_page() - particularly if the file is mapped many times.
      
      Do the same when punching a hole: 3.4 added truncate_pagecache_range()
      to do the unmap and trunc, so use it in ext4_ext_punch_hole(), instead
      of calling truncate_inode_pages_range() directly.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      5e44f8c3
    • Wanlong Gao's avatar
      jbd2: use kmem_cache_zalloc wrapper instead of flag · b2f4edb3
      Wanlong Gao authored
      Use kmem_cache_zalloc wrapper instead of flag __GFP_ZERO.
      Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      b2f4edb3
    • Salman Qazi's avatar
      ext4: remove mb_groups before tearing down the buddy_cache · 95599968
      Salman Qazi authored
      We can't have references held on pages in the s_buddy_cache while we are
      trying to truncate its pages and put the inode.  All the pages must be
      gone before we reach clear_inode.  This can only be guaranteed if we
      can prevent new users from grabbing references to s_buddy_cache's pages.
      
      The original bug can be reproduced and the bug fix can be verified by:
      
      while true; do mount -t ext4 /dev/ram0 /export/hda3/ram0; \
      	umount /export/hda3/ram0; done &
      
      while true; do cat /proc/fs/ext4/ram0/mb_groups; done
      Signed-off-by: Salman Qazi <sqazi@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      95599968
    • Salman Qazi's avatar
      ext4: add ext4_mb_unload_buddy in the error path · 02b78310
      Salman Qazi authored
      ext4_free_blocks fails to pair an ext4_mb_load_buddy with a matching
      ext4_mb_unload_buddy when it fails a memory allocation.
      Signed-off-by: Salman Qazi <sqazi@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      02b78310
    • Theodore Ts'o's avatar
      ext4: don't trash state flags in EXT4_IOC_SETFLAGS · 79906964
      Theodore Ts'o authored
      In commit 353eb83c we removed i_state_flags with 64-bit longs, but
      when handling the EXT4_IOC_SETFLAGS ioctl, we replace i_flags
      directly, which trashes the state flags which are stored in the high
      32-bits of i_flags on 64-bit platforms.  So use the
      ext4_{set,clear}_inode_flags() functions which use atomic bit
      manipulation functions instead.
      Reported-by: Tao Ma <boyu.mt@taobao.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      79906964
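      The failure mode and the per-bit fix can be shown with a userspace
      sketch using C11 atomics. The 64-bit word layout (user-settable flags in
      the low 32 bits, state flags above them) follows the description in the
      commit message; the sketch_ function names and the choice of
      <stdatomic.h> in place of the kernel's bit operations are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Buggy variant: overwrite the whole word with the new user flags.
 * Anything stored in the high 32 bits (the state flags) is lost.
 */
void sketch_setflags_buggy(_Atomic uint64_t *word, uint32_t new_flags)
{
    assert(word != NULL);
    atomic_store(word, (uint64_t)new_flags);
}

/*
 * Fixed variant: set or clear each of the low 32 bits individually with
 * atomic read-modify-write operations, so the high bits are never touched.
 */
void sketch_setflags_fixed(_Atomic uint64_t *word, uint32_t new_flags)
{
    assert(word != NULL);
    for (unsigned int bit = 0; bit < 32; bit++) {
        uint64_t mask = (uint64_t)1 << bit;

        if (new_flags & mask)
            atomic_fetch_or(word, mask);    /* set this user flag */
        else
            atomic_fetch_and(word, ~mask);  /* clear this user flag */
    }
}
```

      For example, starting from a word with one state bit set at position 40
      and user flags 0x5, the fixed variant called with 0x2 leaves bit 40 in
      place, whereas the buggy variant clears it.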