1. 10 Sep, 2020 11 commits
    • Chao Yu's avatar
      f2fs: compress: use more readable atomic_t type for {cic,dic}.ref · e6c3948d
      Chao Yu authored
      refcount_t type variable should never be less than one, so it's a
      little bit hard to understand when we use it to indicate pending
      compressed page count, let's change to use atomic_t for better
      readability.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      e6c3948d
    • Chao Yu's avatar
      f2fs: fix compile warning · 17d7648d
      Chao Yu authored
      This patch fixes below compile warning reported by LKP
      (kernel test robot)
      
      cppcheck warnings: (new ones prefixed by >>)
      
      >> fs/f2fs/file.c:761:9: warning: Identical condition 'err', second condition is always false [identicalConditionAfterEarlyExit]
          return err;
                 ^
         fs/f2fs/file.c:753:6: note: first condition
          if (err)
              ^
         fs/f2fs/file.c:761:9: note: second condition
          return err;
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      17d7648d
    • Chao Yu's avatar
      f2fs: support 64-bits key in f2fs rb-tree node entry · 2e9b2bb2
      Chao Yu authored
      then, we can add specified entry into rb-tree with 64-bits segment time
      as key.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      2e9b2bb2
    • Chao Yu's avatar
      f2fs: inherit mtime of original block during GC · c5d02785
      Chao Yu authored
      Don't let f2fs inner GC ruins original aging degree of segment.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      c5d02785
    • Chao Yu's avatar
      f2fs: record average update time of segment · 6f3a01ae
      Chao Yu authored
      Previously, once we update one block in segment, we will update mtime of
      segment to last time, making aged segment becoming freshest, result in
      that GC with cost benefit algorithm missing such segment, So this patch
      changes to record mtime as average block updating time instead of last
      updating time.
      
      It's not needed to reset mtime for prefree segment, as se->valid_blocks
      is zero, then old se->mtime won't take any weight with below calculation:
      
      	se->mtime = div_u64(se->mtime * se->valid_blocks + mtime,
      					se->valid_blocks + 1);
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      6f3a01ae
    • Chao Yu's avatar
      f2fs: introduce inmem curseg · d0b9e42a
      Chao Yu authored
      Previous implementation of aligned pinfile allocation will:
      - allocate new segment on cold data log no matter whether last used
      segment is partially used or not, it makes IOs more random;
      - force concurrent cold data/GCed IO going into warm data area, it
      can make a bad effect on hot/cold data separation;
      
      In this patch, we introduce a new type of log named 'inmem curseg',
      the differents from normal curseg is:
      - it reuses existed segment type (CURSEG_XXX_NODE/DATA);
      - it only exists in memory, its segno, blkofs, summary will not b
       persisted into checkpoint area;
      
      With this new feature, we can enhance scalability of log, special
      allocators can be created for purposes:
      - pure lfs allocator for aligned pinfile allocation or file
      defragmentation
      - pure ssr allocator for later feature
      
      So that, let's update aligned pinfile allocation to use this new
      inmem curseg fwk.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      d0b9e42a
    • Chao Yu's avatar
      f2fs: compress: remove unneeded code · 376207af
      Chao Yu authored
      - f2fs_write_multi_pages
       - f2fs_compress_pages
        - init_compress_ctx
        - compress_pages
        - destroy_compress_ctx  --- 1
       - f2fs_write_compressed_pages
       - destroy_compress_ctx  --- 2
      
      destroy_compress_ctx() in f2fs_write_multi_pages() is redundant, remove
      it.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      376207af
    • Xiaojun Wang's avatar
      f2fs: remove duplicated type casting · e90027d2
      Xiaojun Wang authored
      Since DUMMY_WRITTEN_PAGE and ATOMIC_WRITTEN_PAGE have already been
      converted as unsigned long type, we don't need do type casting again.
      Signed-off-by: default avatarXiaojun Wang <wangxiaojun11@huawei.com>
      Reported-by: default avatarJack Qiu <jack.qiu@huawei.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      e90027d2
    • Aravind Ramesh's avatar
      f2fs: support zone capacity less than zone size · de881df9
      Aravind Ramesh authored
      NVMe Zoned Namespace devices can have zone-capacity less than zone-size.
      Zone-capacity indicates the maximum number of sectors that are usable in
      a zone beginning from the first sector of the zone. This makes the sectors
      sectors after the zone-capacity till zone-size to be unusable.
      This patch set tracks zone-size and zone-capacity in zoned devices and
      calculate the usable blocks per segment and usable segments per section.
      
      If zone-capacity is less than zone-size mark only those segments which
      start before zone-capacity as free segments. All segments at and beyond
      zone-capacity are treated as permanently used segments. In cases where
      zone-capacity does not align with segment size the last segment will start
      before zone-capacity and end beyond the zone-capacity of the zone. For
      such spanning segments only sectors within the zone-capacity are used.
      
      During writes and GC manage the usable segments in a section and usable
      blocks per segment. Segments which are beyond zone-capacity are never
      allocated, and do not need to be garbage collected, only the segments
      which are before zone-capacity needs to garbage collected.
      For spanning segments based on the number of usable blocks in that
      segment, write to blocks only up to zone-capacity.
      
      Zone-capacity is device specific and cannot be configured by the user.
      Since NVMe ZNS device zones are sequentially write only, a block device
      with conventional zones or any normal block device is needed along with
      the ZNS device for the metadata operations of F2fs.
      
      A typical nvme-cli output of a zoned device shows zone start and capacity
      and write pointer as below:
      
      SLBA: 0x0     WP: 0x0     Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
      
      Here zone size is 64MB, capacity is 49MB, WP is at zone start as the zones
      are in EMPTY state. For each zone, only zone start + 49MB is usable area,
      any lba/sector after 49MB cannot be read or written to, the drive will fail
      any attempts to read/write. So, the second zone starts at 64MB and is
      usable till 113MB (64 + 49) and the range between 113 and 128MB is
      again unusable. The next zone starts at 128MB, and so on.
      Signed-off-by: default avatarAravind Ramesh <aravind.ramesh@wdc.com>
      Signed-off-by: default avatarDamien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: default avatarNiklas Cassel <niklas.cassel@wdc.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      de881df9
    • Linus Torvalds's avatar
      Merge tag 'f2fs-for-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 581cb3a2
      Linus Torvalds authored
      Pull f2fs fixes from Jaegeuk Kim:
       "Small bug fixes for:
      
         - SMR drive fix
      
         - infinite loop when building free node ids
      
         - EOF at DIO read"
      
      * tag 'f2fs-for-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
        f2fs: Return EOF on unaligned end of file DIO read
        f2fs: fix indefinite loop scanning for free nid
        f2fs: Fix type of section block count variables
      581cb3a2
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 7fe10096
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "This fixes a regression in padata"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        padata: fix possible padata_works_lock deadlock
      7fe10096
  2. 09 Sep, 2020 4 commits
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · ab29a807
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Fix an NFS/RDMA resource leak
      
       - Fix the error handling during delegation recall
      
       - NFSv4.0 needs to return the delegation on a zero-stateid SETATTR
      
       - Stop printk reading past end of string
      
      * tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: stop printk reading past end of string
        NFS: Zero-stateid SETATTR should first return delegation
        NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall
        xprtrdma: Release in-flight MRs on disconnect
      ab29a807
    • Gabriel Krisman Bertazi's avatar
      f2fs: Return EOF on unaligned end of file DIO read · 20d0a107
      Gabriel Krisman Bertazi authored
      Reading past end of file returns EOF for aligned reads but -EINVAL for
      unaligned reads on f2fs.  While documentation is not strict about this
      corner case, most filesystem returns EOF on this case, like iomap
      filesystems.  This patch consolidates the behavior for f2fs, by making
      it return EOF(0).
      
      it can be verified by a read loop on a file that does a partial read
      before EOF (A file that doesn't end at an aligned address).  The
      following code fails on an unaligned file on f2fs, but not on
      btrfs, ext4, and xfs.
      
        while (done < total) {
          ssize_t delta = pread(fd, buf + done, total - done, off + done);
          if (!delta)
            break;
          ...
        }
      
      It is arguable whether filesystems should actually return EOF or
      -EINVAL, but since iomap filesystems support it, and so does the
      original DIO code, it seems reasonable to consolidate on that.
      Signed-off-by: default avatarGabriel Krisman Bertazi <krisman@collabora.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      20d0a107
    • Sahitya Tummala's avatar
      f2fs: fix indefinite loop scanning for free nid · e2cab031
      Sahitya Tummala authored
      If the sbi->ckpt->next_free_nid is not NAT block aligned and if there
      are free nids in that NAT block between the start of the block and
      next_free_nid, then those free nids will not be scanned in scan_nat_page().
      This results into mismatch between nm_i->available_nids and the sum of
      nm_i->free_nid_count of all NAT blocks scanned. And nm_i->available_nids
      will always be greater than the sum of free nids in all the blocks.
      Under this condition, if we use all the currently scanned free nids,
      then it will loop forever in f2fs_alloc_nid() as nm_i->available_nids
      is still not zero but nm_i->free_nid_count of that partially scanned
      NAT block is zero.
      
      Fix this to align the nm_i->next_scan_nid to the first nid of the
      corresponding NAT block.
      Signed-off-by: default avatarSahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      e2cab031
    • Shin'ichiro Kawasaki's avatar
      f2fs: Fix type of section block count variables · 123aaf77
      Shin'ichiro Kawasaki authored
      Commit da52f8ad ("f2fs: get the right gc victim section when section
      has several segments") added code to count blocks of each section using
      variables with type 'unsigned short', which has 2 bytes size in many
      systems. However, the counts can be larger than the 2 bytes range and
      type conversion results in wrong values. Especially when the f2fs
      sections have blocks as many as USHRT_MAX + 1, the count is handled as 0.
      This triggers eternal loop in init_dirty_segmap() at mount system call.
      Fix this by changing the type of the variables to block_t.
      
      Fixes: da52f8ad ("f2fs: get the right gc victim section when section has several segments")
      Signed-off-by: default avatarShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      123aaf77
  3. 08 Sep, 2020 8 commits
  4. 07 Sep, 2020 1 commit
  5. 06 Sep, 2020 4 commits
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block · a8205e31
      Linus Torvalds authored
      Pull more io_uring fixes from Jens Axboe:
       "Two followup fixes. One is fixing a regression from this merge window,
        the other is two commits fixing cancelation of deferred requests.
      
        Both have gone through full testing, and both spawned a few new
        regression test additions to liburing.
      
         - Don't play games with const, properly store the output iovec and
           assign it as needed.
      
         - Deferred request cancelation fix (Pavel)"
      
      * tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
        io_uring: fix linked deferred ->files cancellation
        io_uring: fix cancel of deferred reqs with ->files
        io_uring: fix explicit async read/write mapping for large segments
      a8205e31
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 2ccdd9f8
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - three Intel VT-d fixes to fix address handling on 32bit, fix a NULL
         pointer dereference bug and serialize a hardware register access as
         required by the VT-d spec.
      
       - two patches for AMD IOMMU to force AMD GPUs into translation mode
         when memory encryption is active and disallow using IOMMUv2
         functionality.  This makes the AMDGPU driver work when memory
         encryption is active.
      
       - two more fixes for AMD IOMMU to fix updating the Interrupt Remapping
         Table Entries.
      
       - MAINTAINERS file update for the Qualcom IOMMU driver.
      
      * tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Handle 36bit addressing for x86-32
        iommu/amd: Do not use IOMMUv2 functionality when SME is active
        iommu/amd: Do not force direct mapping when SME is active
        iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
        iommu/amd: Restore IRTE.RemapEn bit after programming IRTE
        iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
        iommu/vt-d: Serialize IOMMU GCMD register modifications
        MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
      2ccdd9f8
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 015b3155
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - more generic entry code ABI fallout
      
       - debug register handling bugfixes
      
       - fix vmalloc mappings on 32-bit kernels
      
       - kprobes instrumentation output fix on 32-bit kernels
      
       - fix over-eager WARN_ON_ONCE() on !SMAP hardware
      
       - NUMA debugging fix
      
       - fix Clang related crash on !RETPOLINE kernels
      
      * tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry: Unbreak 32bit fast syscall
        x86/debug: Allow a single level of #DB recursion
        x86/entry: Fix AC assertion
        tracing/kprobes, x86/ptrace: Fix regs argument order for i386
        x86, fakenuma: Fix invalid starting node ID
        x86/mm/32: Bring back vmalloc faulting on x86_32
        x86/cmdline: Disable jump tables for cmdline.c
      015b3155
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 68beef57
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
       "A small series for fixing a problem with Xen PVH guests when running
        as backends (e.g. as dom0).
      
        Mapping other guests' memory is now working via ZONE_DEVICE, thus not
        requiring to abuse the memory hotplug functionality for that purpose"
      
      * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: add helpers to allocate unpopulated memory
        memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC
        xen/balloon: add header guard
      68beef57
  6. 05 Sep, 2020 12 commits