1. 25 Feb, 2022 18 commits
    • Merge tag 'configfs-5.17-2022-02-25' of git://git.infradead.org/users/hch/configfs · 9137eda5
      Linus Torvalds authored
      Pull configfs fix from Christoph Hellwig:
      
       - fix a race in configfs_{,un}register_subsystem (ChenXiaoSong)
      
      * tag 'configfs-5.17-2022-02-25' of git://git.infradead.org/users/hch/configfs:
        configfs: fix a race in configfs_{,un}register_subsystem()
    • Merge tag 'for-5.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · c0419188
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "This is hopefully the last batch of fixes for defrag that got broken in
        5.16, all stable material.
      
        The remaining reported problem is excessive IO with autodefrag due to
        various conditions in the defrag code not met or missing"
      
      * tag 'for-5.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: reduce extent threshold for autodefrag
        btrfs: autodefrag: only scan one inode once
        btrfs: defrag: don't use merged extent map for their generation check
        btrfs: defrag: bring back the old file extent search behavior
        btrfs: defrag: remove an ambiguous condition for rejection
        btrfs: defrag: don't defrag extents which are already at max capacity
        btrfs: defrag: don't try to merge regular extents with preallocated extents
        btrfs: defrag: allow defrag_one_cluster() to skip large extent which is not a target
        btrfs: prevent copying too big compressed lzo segment
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · ca745723
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
      
       - Older "does not even boot" regression in qib from July
      
       - Bug fixes for error unwind in rtrs
      
       - Avoid a deadlock syzkaller found in srp
      
       - Fix another UAF syzkaller found in cma
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/cma: Do not change route.addr.src_addr outside state checks
        RDMA/ib_srp: Fix a deadlock
        RDMA/rtrs-clt: Move free_permit from free_clt to rtrs_clt_close
        RDMA/rtrs-clt: Fix possible double free in error case
        IB/qib: Fix duplicate sysfs directory name
    • Merge tag 'gpio-fixes-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 115ccd22
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix a bug generating spurious interrupts in gpio-rockchip
      
       - fix a race condition in gpiod_to_irq() called by GPIO consumers
      
      * tag 'gpio-fixes-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: Return EPROBE_DEFER if gc->to_irq is NULL
        gpio: rockchip: Reset int_bothedge when changing trigger
    • RDMA/cma: Do not change route.addr.src_addr outside state checks · 22e9f710
      Jason Gunthorpe authored
      If the state is not idle then resolve_prepare_src() should immediately
      fail and no change to global state should happen. However, it
      unconditionally overwrites the src_addr trying to build a temporary any
      address.
      
      For instance, if the state is already RDMA_CM_LISTEN then this will
      corrupt the src_addr and cause the following test in
      cma_cancel_operation() to give the wrong result:
      
                 if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)
      
      Which would manifest as this trace from syzkaller:
      
        BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
        Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204
      
        CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:79 [inline]
         dump_stack+0x141/0x1d7 lib/dump_stack.c:120
         print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
         __kasan_report mm/kasan/report.c:399 [inline]
         kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
         __list_add_valid+0x93/0xa0 lib/list_debug.c:26
         __list_add include/linux/list.h:67 [inline]
         list_add_tail include/linux/list.h:100 [inline]
         cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline]
         rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751
         ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102
         ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732
         vfs_write+0x28e/0xa30 fs/read_write.c:603
         ksys_write+0x1ee/0x250 fs/read_write.c:658
         do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      This indicates that an rdma_id_private was destroyed without doing
      cma_cancel_listens().
      
      Instead of trying to re-use the src_addr memory to indirectly create an
      any address derived from the dst, build one explicitly on the stack and
      bind to that as any other normal flow would do. rdma_bind_addr() will copy
      it over the src_addr once it knows the state is valid.
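
      The pattern described above can be sketched in plain C as follows. This
      is a minimal userspace illustration, not the kernel patch: struct conn,
      conn_bind() and conn_prepare_src() are hypothetical stand-ins for
      rdma_id_private, rdma_bind_addr() and resolve_prepare_src(), and the
      address is a plain string rather than a sockaddr.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the kernel's data structures. */
enum conn_state { CONN_IDLE, CONN_LISTEN };

struct conn {
    enum conn_state state;
    char src_addr[16];   /* shared state that other code paths inspect */
};

/* Bind only from the idle state; the address is copied in after the check. */
static int conn_bind(struct conn *c, const char *addr)
{
    assert(c != NULL && addr != NULL);
    if (c->state != CONN_IDLE)
        return -1;       /* fail without touching src_addr */
    strncpy(c->src_addr, addr, sizeof(c->src_addr) - 1);
    c->src_addr[sizeof(c->src_addr) - 1] = '\0';
    return 0;
}

/* Build the wildcard address in a local and go through conn_bind(),
 * instead of overwriting c->src_addr before the state is known. */
static int conn_prepare_src(struct conn *c)
{
    char any_addr[16] = "0.0.0.0";   /* temporary lives on the stack */

    return conn_bind(c, any_addr);
}
```

      With this shape, calling conn_prepare_src() on a connection that is
      already listening returns an error and leaves its src_addr unchanged.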
      
      This is similar to commit bc0bdc5a ("RDMA/cma: Do not change
      route.addr.src_addr.ss_family")
      
      Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com
      Cc: stable@vger.kernel.org
      Fixes: 732d41c5 ("RDMA/cma: Make the locking for automatic state transition more clear")
      Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • Merge tag 'spi-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 4b23c6ec
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A few small driver specific fixes"
      
      * tag 'spi-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: rockchip: terminate dma transmission when slave abort
        spi: rockchip: Fix error in getting num-cs property
        spi: spi-zynq-qspi: Fix a NULL pointer dereference in zynq_qspi_exec_mem_op()
    • Merge tag 'regulator-fix-v5.17-rc5' of... · 64b5132b
      Linus Torvalds authored
      Merge tag 'regulator-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fixes from Mark Brown:
       "A series of fixes for the da9121 driver"
      
      * tag 'regulator-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: da9121: Remove surplus DA9141 parameters
        regulator: da9121: Fix DA914x voltage value
        regulator: da9121: Fix DA914x current values
    • Merge tag 'regmap-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 0e9894e6
      Linus Torvalds authored
      Pull regmap fix from Mark Brown:
       "A fix for interrupt controllers which require the explicit
        acknowledgement of interrupts using a different register to the one
        where interrupts are reported.
      
        Urgent for the few devices this affects"
      
      * tag 'regmap-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap-irq: Update interrupt clear register for proper reset
    • Merge tag 'thermal-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · e48cb5c2
      Linus Torvalds authored
      Pull thermal control fix from Rafael Wysocki:
       "Fix a memory leak in the int340x thermal driver's ACPI notify handler
        (Chuansheng Liu)"
      
      * tag 'thermal-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        thermal: int340x: fix memory leak in int3400_notify()
    • Merge tag 'pm-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2800b6d0
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "Fix the throttle IRQ handling during cpufreq initialization on
        Qualcomm platforms (Bjorn Andersson)"
      
      * tag 'pm-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: qcom-hw: Delay enabling throttle_irq
        cpufreq: Reintroduce ready() callback
    • Merge tag 'char-misc-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · c4765831
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a few small driver fixes for 5.17-rc6 for reported issues.
      
        The majority of these are IIO fixes for small things, and the other
        two are an nvmem and mtd core conflict fix.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'char-misc-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        mtd: core: Fix a conflict between MTD and NVMEM on wp-gpios property
        nvmem: core: Fix a conflict between MTD and NVMEM on wp-gpios property
        iio: imu: st_lsm6dsx: wait for settling time in st_lsm6dsx_read_oneshot
        iio: Fix error handling for PM
        iio: addac: ad74413r: correct comparator gpio getters mask usage
        iio: addac: ad74413r: use ngpio size when iterating over mask
        iio: addac: ad74413r: Do not reference negative array offsets
        iio: adc: men_z188_adc: Fix a resource leak in an error handling path
        iio: frequency: admv1013: remove the always true condition
        iio: accel: fxls8962af: add padding to regmap for SPI
        iio:imu:adis16480: fix buffering for devices with no burst mode
        iio: adc: ad7124: fix mask used for setting AIN_BUFP & AIN_BUFM bits
        iio: adc: tsc2046: fix memory corruption by preventing array overflow
    • Merge tag 'driver-core-5.17-rc6' of... · d68ccfdb
      Linus Torvalds authored
      Merge tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fix from Greg KH:
       "Here is a single driver core fix for 5.17-rc6. It resolves a reported
        problem when the DMA map of a device is not properly released.
      
        It has been in linux-next with no reported problems"
      
      * tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        driver core: Free DMA range map when device is released
    • Merge tag 'staging-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · eae9350e
      Linus Torvalds authored
      Pull staging driver fix from Greg KH:
       "Here is a single staging driver fix for 5.17-rc6.
      
        It resolves a reported problem in the fbtft fb_st7789v.c driver that
        could cause the display to be flipped in cold weather.
      
        It has been in linux-next with no reported problems"
      
      * tag 'staging-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: fbtft: fb_st7789v: reset display before initialization
    • Merge tag 'tty-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · d8fc3bb6
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are some small n_gsm and sc16is7xx serial driver fixes for
        5.17-rc6.
      
        The n_gsm fixes are from Siemens as it seems they are using the line
        discipline and fixing up a number of issues they found in their
        testing. The sc16is7xx serial driver fix is for a reported problem
        with that chip.
      
        All of these have been in linux-next with no reported problems"
      
      * tag 'tty-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        sc16is7xx: Fix for incorrect data being transmitted
        tty: n_gsm: fix deadlock in gsmtty_open()
        tty: n_gsm: fix wrong modem processing in convergence layer type 2
        tty: n_gsm: fix wrong tty control line for flow control
        tty: n_gsm: fix NULL pointer access due to DLCI release
        tty: n_gsm: fix proper link termination after failed open
        tty: n_gsm: fix encoding of command/response bit
        tty: n_gsm: fix encoding of control signal octet bit DV
    • Merge tag 'usb-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 548b1af4
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a number of small USB driver fixes for 5.17-rc6 to resolve
        reported problems and add new device ids. They include:
      
         - dwc3:
            - device mapping fix
            - new device ids
            - driver fixes
      
         - xhci driver fixes
      
         - gadget driver fixes
      
         - usb-serial driver device id updates
      
        All of these have been in linux-next with no reported problems"
      
      * tag 'usb-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: gadget: rndis: add spinlock for rndis response list
        usb: dwc3: gadget: Let the interrupt handler disable bottom halves.
        USB: gadget: validate endpoint index for xilinx udc
        USB: serial: option: add Telit LE910R1 compositions
        USB: serial: option: add support for DW5829e
        Revert "USB: serial: ch341: add new Product ID for CH341A"
        usb: dwc2: drd: fix soft connect when gadget is unconfigured
        usb: dwc3: pci: Fix Bay Trail phy GPIO mappings
        tps6598x: clear int mask on probe failure
        xhci: Prevent futile URB re-submissions due to incorrect return value.
        xhci: re-initialize the HC during resume if HCE was set
        usb: dwc3: pci: Add "snps,dis_u2_susphy_quirk" for Intel Bay Trail
        usb: dwc3: pci: add support for the Intel Raptor Lake-S
    • Merge tag 'ata-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · 78081594
      Linus Torvalds authored
      Pull ata fixes from Damien Le Moal:
       "Two fixes for the pata_hpt37x driver, both from Sergey:
      
         - Fix a PCI register access using an incorrect size (8bits instead of
           16bits)
      
         - Make sure to always disable the primary channel as it is unused"
      
      * tag 'ata-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
        ata: pata_hpt37x: disable primary channel on HPT371
        ata: pata_hpt37x: fix PCI clock detection
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 53ab78cd
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A couple driver fixes in the clk subsystem
      
         - Fix a hang due to bad clk parent in the Ingenic jz4725b driver
      
         - Fix SD controllers on Qualcomm MSM8994 SoCs by removing clks that
           shouldn't be touched"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: jz4725b: fix mmc0 clock gating
        clk: qcom: gcc-msm8994: Remove NoC clocks
    • Merge tag 'drm-fixes-2022-02-25' of git://anongit.freedesktop.org/drm/drm · 5ee3d001
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular drm fixes pull, i915, amdgpu and tegra mostly, all pretty
        small.
      
        core:
         - edid: Always set RGB444
      
        tegra:
         - tegra186 suspend/resume fixes
         - syncpoint wait fix
         - build warning fix
         - eDP on older devices fix
      
        amdgpu:
         - Display FP fix
         - PCO powergating fix
         - RDNA2 OEM SKU stability fixes
         - Display PSR fix
         - PCI ASPM fix
         - Display link encoder fix for TEST_COMMIT
         - Raven2 suspend/resume fix
         - Fix a regression in virtual display support
         - GPUVM eviction fix
      
        i915:
         - Fix QGV handling on ADL-P+
         - Fix bw atomic check when switching between SAGV vs. no SAGV
         - Disconnect PHYs left connected by BIOS on disabled ports
         - Fix SAGV to no SAGV transitions on TGL+
         - Print PHY name properly on calibration error (DG2)
      
        imx:
         - dcss: Select GEM CMA helpers
      
        radeon:
         - Fix the type of some variables
      
        vc4:
         - Fix codec cleanup
         - Fix PM reference counting"
      
      * tag 'drm-fixes-2022-02-25' of git://anongit.freedesktop.org/drm/drm: (24 commits)
        drm/amdgpu: check vm ready by amdgpu_vm->evicting flag
        drm/amdgpu: bypass tiling flag check in virtual display case (v2)
        Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()"
        drm/amdgpu: do not enable asic reset for raven2
        drm/amd/display: Fix stream->link_enc unassigned during stream removal
        drm/amd: Check if ASPM is enabled from PCIe subsystem
        drm/edid: Always set RGB444
        drm/tegra: dpaux: Populate AUX bus
        drm/radeon: fix variable type
        drm/amd/display: For vblank_disable_immediate, check PSR is really used
        drm/amd/pm: fix some OEM SKU specific stability issues
        drm/amdgpu: disable MMHUB PG for Picasso
        drm/amd/display: Protect update_bw_bounding_box FPU code.
        drm/i915/dg2: Print PHY name properly on calibration error
        drm/i915: Fix bw atomic check when switching between SAGV vs. no SAGV
        drm/i915: Correctly populate use_sagv_wm for all pipes
        drm/i915: Disconnect PHYs left connected by BIOS on disabled ports
        drm/i915: Widen the QGV point mask
        drm/imx/dcss: i.MX8MQ DCSS select DRM_GEM_CMA_HELPER
        drm/vc4: crtc: Fix runtime_pm reference counting
        ...
2. 24 Feb, 2022 22 commits
    • Merge tag 'perf-tools-fixes-for-v5.17-2022-02-24' of... · 7ee02256
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v5.17-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Fix double free in the error path when opening perf.data from
         multiple files in a directory instead of from a single file
      
       - Sync the msr-index.h copy with the kernel sources
      
       - Fix error when printing 'weight' field in 'perf script'
      
       - Skip failing sigtrap test for arm+aarch64 in 'perf test'
      
       - Fix failure to use a cpu list for uncore events in hybrid systems,
         e.g. Intel Alder Lake
      
      * tag 'perf-tools-fixes-for-v5.17-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf script: Fix error when printing 'weight' field
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf data: Fix double free in perf_session__delete()
        perf evlist: Fix failed to use cpu list for uncore events
        perf test: Skip failing sigtrap test for arm+aarch64
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 1f840c0e
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "x86 host:
      
         - Expose KVM_CAP_ENABLE_CAP since it is supported
      
         - Disable KVM_HC_CLOCK_PAIRING in TSC catchup mode
      
         - Ensure async page fault token is nonzero
      
         - Fix lockdep false negative
      
         - Fix FPU migration regression from the AMX changes
      
        x86 guest:
      
         - Don't use PV TLB/IPI/yield on uniprocessor guests
      
        PPC:
      
         - reserve capability id (topic branch for ppc/kvm)"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled
        KVM: x86/mmu: make apf token non-zero to fix bug
        KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3
        x86/kvm: Don't use pv tlb/ipi/sched_yield if on 1 vCPU
        x86/kvm: Fix compilation warning in non-x86_64 builds
        x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0
        x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
        kvm: x86: Disable KVM_HC_CLOCK_PAIRING if tsc is in always catchup mode
        KVM: Fix lockdep false negative during host resume
        KVM: x86: Add KVM_CAP_ENABLE_CAP to x86
    • Merge tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · d8152cfe
      Linus Torvalds authored
      Pull pci fixes from Bjorn Helgaas:
      
       - Fix a merge error that broke PCI device enumeration on mvebu
         platforms, including Turris Omnia (Armada 385) (Pali Rohár)
      
       - Avoid using ATS on all AMD Navi10 and Navi14 GPUs because some
         VBIOSes don't account for "harvested" (disabled) parts of the chip
         when initializing caches (Alex Deucher)
      
      * tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken
        PCI: mvebu: Fix device enumeration regression
    • Merge tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · f672ff91
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bpf and netfilter.
      
        Current release - regressions:
      
         - bpf: fix crash due to out of bounds access into reg2btf_ids
      
         - mvpp2: always set port pcs ops, avoid null-deref
      
         - eth: marvell: fix driver load from initrd
      
         - eth: intel: revert "Fix reset bw limit when DCB enabled with 1 TC"
      
        Current release - new code bugs:
      
         - mptcp: fix race in overlapping signal events
      
        Previous releases - regressions:
      
         - xen-netback: revert hotplug-status changes causing devices to not
           be configured
      
         - dsa:
            - avoid call to __dev_set_promiscuity() while rtnl_mutex isn't
              held
            - fix panic when removing unoffloaded port from bridge
      
         - dsa: microchip: fix bridging with more than two member ports
      
        Previous releases - always broken:
      
         - bpf:
            - fix crash due to incorrect copy_map_value when both spin lock
              and timer are present in a single value
            - fix a bpf_timer initialization issue with clang
            - do not try bpf_msg_push_data with len 0
            - add schedule points in batch ops
      
         - nf_tables:
            - unregister flowtable hooks on netns exit
            - correct flow offload action array size
            - fix a couple of memory leaks
      
         - vsock: don't check owner in vhost_vsock_stop() while releasing
      
         - gso: do not skip outer ip header in case of ipip and net_failover
      
         - smc: use a mutex for locking "struct smc_pnettable"
      
         - openvswitch: fix setting ipv6 fields causing hw csum failure
      
         - mptcp: fix race in incoming ADD_ADDR option processing
      
         - sysfs: add check for netdevice being present to speed_show
      
         - sched: act_ct: fix flow table lookup after ct clear or switching
           zones
      
         - eth: intel: fixes for SR-IOV forwarding offloads
      
         - eth: broadcom: fixes for selftests and error recovery
      
         - eth: mellanox: flow steering and SR-IOV forwarding fixes
      
        Misc:
      
         - make __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor
           friends not report freed skbs as drops
      
         - force inlining of checksum functions in net/checksum.h"
      
      * tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
        net: mv643xx_eth: process retval from of_get_mac_address
        ping: remove pr_err from ping_lookup
        Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"
        openvswitch: Fix setting ipv6 fields causing hw csum failure
        ipv6: prevent a possible race condition with lifetimes
        net/smc: Use a mutex for locking "struct smc_pnettable"
        bnx2x: fix driver load from initrd
        Revert "xen-netback: Check for hotplug-status existence before watching"
        Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"
        net/mlx5e: Fix VF min/max rate parameters interchange mistake
        net/mlx5e: Add missing increment of count
        net/mlx5e: MPLSoUDP decap, fix check for unsupported matches
        net/mlx5e: Fix MPLSoUDP encap to use MPLS action information
        net/mlx5e: Add feature check for set fec counters
        net/mlx5e: TC, Skip redundant ct clear actions
        net/mlx5e: TC, Reject rules with forward and drop actions
        net/mlx5e: TC, Reject rules with drop and modify hdr action
        net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets
        net/mlx5e: Fix wrong return value on ioctl EEPROM query failure
        net/mlx5: Fix possible deadlock on rule deletion
        ...
    • Merge tag 'drm-intel-fixes-2022-02-24' of... · ecf8a99f
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2022-02-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - Fix QGV handling on ADL-P+ (Ville Syrjälä)
      - Fix bw atomic check when switching between SAGV vs. no SAGV (Ville Syrjälä)
      - Disconnect PHYs left connected by BIOS on disabled ports (Imre Deak)
      - Fix SAGV to no SAGV transitions on TGL+ (Ville Syrjälä)
      - Print PHY name properly on calibration error (DG2) (Matt Roper)
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/YhdyHwRWkOTWwlqi@tursulin-mobl2
    • Merge tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block · 73878e5e
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request:
          - send H2CData PDUs based on MAXH2CDATA (Varun Prakash)
          - fix passthrough to namespaces with unsupported features (Christoph
            Hellwig)
      
       - Clear iocb->private at poll completion (Stefano)
      
      * tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block:
        nvme-tcp: send H2CData PDUs based on MAXH2CDATA
        nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info
        nvme: don't return an error from nvme_configure_metadata
        block: clear iocb->private in blkdev_bio_end_io_async()
    • thermal: int340x: fix memory leak in int3400_notify() · 3abea10e
      Chuansheng Liu authored
      It is easy to hit the memory leak below on my Tiger Lake platform:
      
      unreferenced object 0xffff927c8b91dbc0 (size 32):
        comm "kworker/0:2", pid 112, jiffies 4294893323 (age 83.604s)
        hex dump (first 32 bytes):
          4e 41 4d 45 3d 49 4e 54 33 34 30 30 20 54 68 65  NAME=INT3400 The
          72 6d 61 6c 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  rmal.kkkkkkkkkk.
        backtrace:
          [<ffffffff9c502c3e>] __kmalloc_track_caller+0x2fe/0x4a0
          [<ffffffff9c7b7c15>] kvasprintf+0x65/0xd0
          [<ffffffff9c7b7d6e>] kasprintf+0x4e/0x70
          [<ffffffffc04cb662>] int3400_notify+0x82/0x120 [int3400_thermal]
          [<ffffffff9c8b7358>] acpi_ev_notify_dispatch+0x54/0x71
          [<ffffffff9c88f1a7>] acpi_os_execute_deferred+0x17/0x30
          [<ffffffff9c2c2c0a>] process_one_work+0x21a/0x3f0
          [<ffffffff9c2c2e2a>] worker_thread+0x4a/0x3b0
          [<ffffffff9c2cb4dd>] kthread+0xfd/0x130
          [<ffffffff9c201c1f>] ret_from_fork+0x1f/0x30
      
      Fix it by calling kfree() accordingly.
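
      A minimal userspace sketch of the leak pattern and its fix, assuming a
      notify handler that formats a string, hands it to a consumer, and must
      then release it. notify(), send_event() and the live_allocs counter are
      hypothetical; malloc()/free() stand in for kasprintf()/kfree().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int live_allocs;   /* illustration only: buffers not yet freed */

/* Hypothetical consumer: only reads the strings for the duration of the
 * call and returns how many it saw. */
static int send_event(char *const envp[])
{
    int n = 0;

    while (envp[n] != NULL)
        n++;
    return n;
}

/* Format one "KEY=value" string, pass it on, then release it. */
static int notify(int event)
{
    char *envp[2] = { NULL, NULL };
    int len = snprintf(NULL, 0, "EVENT=%d", event);
    int sent;

    assert(len > 0);
    envp[0] = malloc((size_t)len + 1);
    if (envp[0] == NULL)
        return -1;
    live_allocs++;
    snprintf(envp[0], (size_t)len + 1, "EVENT=%d", event);

    sent = send_event(envp);

    free(envp[0]);        /* the step whose absence causes the leak */
    live_allocs--;
    return sent;
}
```

      Without the free() call, every notification would leave one small
      buffer behind, which matches the shape of the report above.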
      
      Fixes: 38e44da5 ("thermal: int3400_thermal: process "thermal table changed" event")
      Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
      Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • Merge tag 'io_uring-5.17-2022-02-23' of git://git.kernel.dk/linux-block · 3a5f59b1
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Add a conditional schedule point in io_add_buffers() (Eric)
      
       - Fix for a quiesce speedup merged in this release (Dylan)
      
       - Don't convert to jiffies for event timeout waiting, it's way too
         coarse when we accept a timespec as input (me)
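
      To illustrate why a jiffies conversion is coarse, the sketch below
      rounds a nanosecond timeout up to whole scheduler ticks. The helper
      name, the round-up behaviour and the tick rate passed in are
      assumptions chosen for illustration; this is not the io_uring code.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Round a nanosecond timeout up to whole ticks of 1/hz seconds and return
 * the wait that results, again in nanoseconds. */
static uint64_t wait_after_tick_rounding(uint64_t timeout_ns, unsigned int hz)
{
    uint64_t tick_ns;
    uint64_t ticks;

    assert(hz > 0);
    tick_ns = NSEC_PER_SEC / hz;
    ticks = (timeout_ns + tick_ns - 1) / tick_ns;
    return ticks * tick_ns;
}
```

      Assuming a tick rate of 250 per second, one tick is 4 ms, so a
      requested 100 microsecond timeout turns into a 4 ms wait.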
      
      * tag 'io_uring-5.17-2022-02-23' of git://git.kernel.dk/linux-block:
        io_uring: disallow modification of rsrc_data during quiesce
        io_uring: don't convert to jiffies for waiting on timeouts
        io_uring: add a schedule point in io_add_buffers()
    • Merge branch 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm · c5eb92f5
      Rafael J. Wysocki authored
      Pull ARM cpufreq fixes for 5.17-rc6 from Viresh Kumar:
      
      "This fixes issues related to throttle IRQ for Qcom SoCs."
      
      * 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
        cpufreq: qcom-hw: Delay enabling throttle_irq
        cpufreq: Reintroduce ready() callback
    • Merge tag 'platform-drivers-x86-v5.17-4' of... · 6c528f34
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull more x86 platform driver fixes from Hans de Goede:
       "Two more fixes:
      
         - Fix suspend/resume regression on AMD Cezanne APUs in >= 5.16
      
         - Fix Microsoft Surface 3 battery readings"
      
      * tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        surface: surface3_power: Fix battery readings on batteries without a serial number
        platform/x86: amd-pmc: Set QOS during suspend on CZN w/ timer wakeup
    • net: mv643xx_eth: process retval from of_get_mac_address · 42404d8f
      Mauri Sandberg authored
      Obtaining a MAC address may be deferred when, for example, the MAC is
      stored in an NVMEM block that is not ready on the first retrieval
      attempt; of_get_mac_address() then returns EPROBE_DEFER.
      
      It is also possible that a port that does not rely on NVMEM has been
      already created when getting the defer request. Thus, also the resources
      allocated previously must be freed when doing a roll-back.
      
      Fixes: 76723bca ("net: mv643xx_eth: add DT parsing support")
      Signed-off-by: Mauri Sandberg <maukka@ext.kapsi.fi>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220223142337.41757-1-maukka@ext.kapsi.fi
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      42404d8f
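      The roll-back described in the mv643xx_eth commit above can be
      sketched as a small userspace model. This is not the driver code:
      probe_ports() and the get_mac callback are hypothetical names, and
      the sketch only models the deferral case, treating any other
      return value as success.

```python
EPROBE_DEFER = 517  # numeric value of the kernel's EPROBE_DEFER


def probe_ports(ports, get_mac):
    """Create ports in order; undo everything if a MAC lookup is deferred.

    get_mac is a hypothetical callback that returns 0 on success or
    -EPROBE_DEFER when the MAC source is not ready yet.
    """
    created, released = [], []
    for port in ports:
        if get_mac(port) == -EPROBE_DEFER:
            # Free the ports created before the deferral, newest first.
            while created:
                released.append(created.pop())
            return -EPROBE_DEFER, created, released
        created.append(port)
    return 0, created, released


# port1 is assumed to read its MAC from NVMEM that is not ready yet.
not_ready = lambda port: -EPROBE_DEFER if port == "port1" else 0
print(probe_ports(["port0", "port1"], not_ready))

# On a later probe attempt every lookup succeeds.
print(probe_ports(["port0", "port1"], lambda port: 0))
```

      The point of the model is that port0, which never touched NVMEM,
      still has to be released when port1 asks for the probe to be
      retried later.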
    • Maxim Levitsky's avatar
      KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non... · e910a53f
      Maxim Levitsky authored
      KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled
      
      If nested tsc scaling is disabled, MSR_AMD64_TSC_RATIO should
      never have a non-default value.
      
      Due to the way nested tsc scaling support was implemented in qemu,
      it would set this msr to 0 when nested tsc scaling was disabled.
      Ignore that value for now, as it causes no harm.
      
      Fixes: 5228eb96 ("KVM: x86: nSVM: implement nested TSC scaling")
      Cc: stable@vger.kernel.org
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220223115649.319134-1-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e910a53f
    • Liang Zhang's avatar
      KVM: x86/mmu: make apf token non-zero to fix bug · 6f3c1fc5
      Liang Zhang authored
      In the current async page fault logic, when a page is ready, KVM relies
      on kvm_arch_can_dequeue_async_page_present() to determine whether to
      deliver a READY event to the guest. This function tests the token value
      of struct kvm_vcpu_pv_apf_data, which the guest kernel must reset to
      zero once it has finished handling a READY event. A zero value means
      that the previous READY event is done, so KVM can deliver another.
      However, kvm_arch_setup_async_pf() may produce a valid token whose value
      is zero, which is indistinguishable from the "done" state described
      above and may lead to the loss of this READY event.
      
      This bug may cause a task to block forever in the guest:
       INFO: task stress:7532 blocked for more than 1254 seconds.
             Not tainted 5.10.0 #16
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
       task:stress          state:D stack:    0 pid: 7532 ppid:  1409
       flags:0x00000080
       Call Trace:
        __schedule+0x1e7/0x650
        schedule+0x46/0xb0
        kvm_async_pf_task_wait_schedule+0xad/0xe0
        ? exit_to_user_mode_prepare+0x60/0x70
        __kvm_handle_async_pf+0x4f/0xb0
        ? asm_exc_page_fault+0x8/0x30
        exc_page_fault+0x6f/0x110
        ? asm_exc_page_fault+0x8/0x30
        asm_exc_page_fault+0x1e/0x30
       RIP: 0033:0x402d00
       RSP: 002b:00007ffd31912500 EFLAGS: 00010206
       RAX: 0000000000071000 RBX: ffffffffffffffff RCX: 00000000021a32b0
       RDX: 000000000007d011 RSI: 000000000007d000 RDI: 00000000021262b0
       RBP: 00000000021262b0 R08: 0000000000000003 R09: 0000000000000086
       R10: 00000000000000eb R11: 00007fefbdf2baa0 R12: 0000000000000000
       R13: 0000000000000002 R14: 000000000007d000 R15: 0000000000001000
      Signed-off-by: Liang Zhang <zhangliang5@huawei.com>
      Message-Id: <20220222031239.1076682-1-zhangliang5@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6f3c1fc5
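      The token collision described in the apf commit above can be
      modelled in a few lines. This is a toy sketch, not the KVM code:
      the class, its method names and the token layout (a counter
      shifted left by 12 bits, combined with a vCPU id) are assumptions
      made for illustration.

```python
MASK32 = 0xFFFFFFFF  # tokens are modelled as 32-bit values


class ApfTokenAllocator:
    """Toy model of per-vCPU async page fault token allocation.

    Assumed layout (illustration only): (counter << 12) | vcpu_id.
    """

    def __init__(self, vcpu_id, next_id=0):
        self.vcpu_id = vcpu_id
        self.next_id = next_id & MASK32

    def _make_token(self):
        token = ((self.next_id << 12) | self.vcpu_id) & MASK32
        self.next_id = (self.next_id + 1) & MASK32
        return token

    def naive_token(self):
        # Can return 0 when both the shifted counter and vcpu_id are 0.
        return self._make_token()

    def nonzero_token(self):
        # Skip counter values whose shifted form is 0, so the token can
        # never look like the guest's "READY event done" marker.
        if (self.next_id << 12) & MASK32 == 0:
            self.next_id = 1
        return self._make_token()


print(hex(ApfTokenAllocator(vcpu_id=0).naive_token()))
print(hex(ApfTokenAllocator(vcpu_id=0).nonzero_token()))
```

      In this model the naive allocator hands out 0 for vCPU 0 on its
      first call and again whenever the shifted counter wraps, while the
      guarded allocator never does.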
    • Xin Long's avatar
      ping: remove pr_err from ping_lookup · cd33bdcb
      Xin Long authored
      As Jakub noticed, prints should be avoided on the datapath.
      Also, as packets would never come to the else branch in
      ping_lookup(), remove pr_err() from ping_lookup().
      
      Fixes: 35a79e64 ("ping: fix the dif and sdif check in ping_lookup")
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Link: https://lore.kernel.org/r/1ef3f2fcd31bd681a193b1fcf235eee1603819bd.1645674068.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cd33bdcb
    • Mateusz Palczewski's avatar
      Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC" · fe203715
      Mateusz Palczewski authored
      Revert a patch that, instead of fixing an AQ error when trying
      to reset the BW limit, introduced several regressions related to
      creating and managing TCs. Currently there are errors when creating
      a TC on both PF and VF.
      
      Error log:
      [17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
      [17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391
      [17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
      [17428.783259] i40e 0000:3b:00.1: Unable to  configure TC map 0 for VSI 391
      
      This reverts commit 3d250466.
      
      Fixes: 3d250466 (i40e: Fix reset bw limit when DCB enabled with 1 TC)
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20220223175347.1690692-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fe203715
    • Paul Blakey's avatar
      openvswitch: Fix setting ipv6 fields causing hw csum failure · d9b5ae5c
      Paul Blakey authored
      Ipv6 ttl, label and tos fields are modified without first
      pulling/pushing the ipv6 header, which would have updated
      the hw csum (if available). This might cause a csum validation
      failure when sending the packet to the stack, as can be seen in
      the trace below.
      
      Fix this by updating skb->csum if available.
      
      Trace resulting from an ipv6 ttl decrement followed by sending the
      packet to conntrack [actions: set(ipv6(hlimit=63)),ct(zone=99)]:
      [295241.900063] s_pf0vf2: hw csum failure
      [295241.923191] Call Trace:
      [295241.925728]  <IRQ>
      [295241.927836]  dump_stack+0x5c/0x80
      [295241.931240]  __skb_checksum_complete+0xac/0xc0
      [295241.935778]  nf_conntrack_tcp_packet+0x398/0xba0 [nf_conntrack]
      [295241.953030]  nf_conntrack_in+0x498/0x5e0 [nf_conntrack]
      [295241.958344]  __ovs_ct_lookup+0xac/0x860 [openvswitch]
      [295241.968532]  ovs_ct_execute+0x4a7/0x7c0 [openvswitch]
      [295241.979167]  do_execute_actions+0x54a/0xaa0 [openvswitch]
      [295242.001482]  ovs_execute_actions+0x48/0x100 [openvswitch]
      [295242.006966]  ovs_dp_process_packet+0x96/0x1d0 [openvswitch]
      [295242.012626]  ovs_vport_receive+0x6c/0xc0 [openvswitch]
      [295242.028763]  netdev_frame_hook+0xc0/0x180 [openvswitch]
      [295242.034074]  __netif_receive_skb_core+0x2ca/0xcb0
      [295242.047498]  netif_receive_skb_internal+0x3e/0xc0
      [295242.052291]  napi_gro_receive+0xba/0xe0
      [295242.056231]  mlx5e_handle_rx_cqe_mpwrq_rep+0x12b/0x250 [mlx5_core]
      [295242.062513]  mlx5e_poll_rx_cq+0xa0f/0xa30 [mlx5_core]
      [295242.067669]  mlx5e_napi_poll+0xe1/0x6b0 [mlx5_core]
      [295242.077958]  net_rx_action+0x149/0x3b0
      [295242.086762]  __do_softirq+0xd7/0x2d6
      [295242.090427]  irq_exit+0xf7/0x100
      [295242.093748]  do_IRQ+0x7f/0xd0
      [295242.096806]  common_interrupt+0xf/0xf
      [295242.100559]  </IRQ>
      [295242.102750] RIP: 0033:0x7f9022e88cbd
      [295242.125246] RSP: 002b:00007f9022282b20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
      [295242.132900] RAX: 0000000000000005 RBX: 0000000000000010 RCX: 0000000000000000
      [295242.140120] RDX: 00007f9022282ba8 RSI: 00007f9022282a30 RDI: 00007f9014005c30
      [295242.147337] RBP: 00007f9014014d60 R08: 0000000000000020 R09: 00007f90254a8340
      [295242.154557] R10: 00007f9022282a28 R11: 0000000000000246 R12: 0000000000000000
      [295242.161775] R13: 00007f902308c000 R14: 000000000000002b R15: 00007f9022b71f40
      
      Fixes: 3fdbd1ce ("openvswitch: add ipv6 'set' action")
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Link: https://lore.kernel.org/r/20220223163416.24096-1-paulb@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d9b5ae5c
    • Niels Dossche's avatar
      ipv6: prevent a possible race condition with lifetimes · 6c0d8833
      Niels Dossche authored
      valid_lft, prefered_lft and tstamp are always accessed under the lock
      "lock" in other places. Reading these without taking the lock may result
      in inconsistencies regarding the calculation of the valid and preferred
      variables since decisions are taken on these fields for those variables.
      Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Niels Dossche <niels.dossche@ugent.be>
      Link: https://lore.kernel.org/r/20220223131954.6570-1-niels.dossche@ugent.be
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6c0d8833
    • Fabio M. De Francesco's avatar
      net/smc: Use a mutex for locking "struct smc_pnettable" · 7ff57e98
      Fabio M. De Francesco authored
      smc_pnetid_by_table_ib() uses read_lock() and then it calls smc_pnet_apply_ib()
      which, in turn, calls mutex_lock(&smc_ib_devices.mutex).
      
      read_lock() disables preemption. Therefore, the code acquires a mutex while in
      atomic context and it leads to a SAC bug.
      
      Fix this bug by replacing the rwlock with a mutex.
      
      Reported-and-tested-by: syzbot+4f322a6d84e991c38775@syzkaller.appspotmail.com
      Fixes: 64e28b52 ("net/smc: add pnet table namespace support")
      Confirmed-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
      Acked-by: Karsten Graul <kgraul@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220223100252.22562-1-fmdefrancesco@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7ff57e98
    • Manish Chopra's avatar
      bnx2x: fix driver load from initrd · e13ad144
      Manish Chopra authored
      Commit b7a49f73 ("bnx2x: Utilize firmware 7.13.21.0") added
      new firmware support in the driver while maintaining older firmware
      compatibility. However, the older firmware was not added to
      MODULE_FIRMWARE(), so its files were missing from the initrd image,
      leading to driver load failure from initrd. This patch adds
      MODULE_FIRMWARE() for the older firmware version so that its files
      are included in the initrd.
      
      Fixes: b7a49f73 ("bnx2x: Utilize firmware 7.13.21.0")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215627
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Alok Prasad <palok@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Link: https://lore.kernel.org/r/20220223085720.12021-1-manishc@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e13ad144
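      The initrd failure described in the bnx2x commit above can be
      modelled roughly as follows. This is a simplified sketch: it
      assumes the initrd generator copies only firmware that the module
      declares, and the file names and helper functions are hypothetical
      placeholders rather than real bnx2x firmware names or tool APIs.

```python
def build_initrd(declared_firmware, installed_firmware):
    """Copy only firmware that the module declares and that is installed."""
    return set(declared_firmware) & set(installed_firmware)


def driver_loads(initrd_firmware, usable_firmware):
    """The driver loads if any firmware it can use is in the initrd."""
    return any(name in initrd_firmware for name in usable_firmware)


NEW_FW = "fw-new.bin"  # hypothetical file name
OLD_FW = "fw-old.bin"  # hypothetical file name

installed = {OLD_FW}        # system that only ships the older firmware
usable = [NEW_FW, OLD_FW]   # driver accepts either version

# Before the fix: only the new firmware is declared by the module.
initrd_before = build_initrd({NEW_FW}, installed)
# After the fix: the older firmware is declared as well.
initrd_after = build_initrd({NEW_FW, OLD_FW}, installed)

print(driver_loads(initrd_before, usable), driver_loads(initrd_after, usable))
```

      The driver itself could use the older firmware in both cases; the
      difference is only whether the declaration lets the file reach the
      initrd.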
    • Marek Marczykowski-Górecki's avatar
      Revert "xen-netback: Check for hotplug-status existence before watching" · e8240add
      Marek Marczykowski-Górecki authored
      This reverts commit 2afeec08.
      
      The reasoning in the commit was wrong - the code expected to set up the
      watch even if 'hotplug-status' didn't exist. In fact, it relied on the
      watch being fired the first time - to check if maybe 'hotplug-status' is
      already set to 'connected'. Not registering a watch for a non-existing
      path (which is the case if the hotplug script hasn't been executed yet)
      made the backend not wait for the hotplug script to execute. This, in
      turn, made the netfront think the interface was fully operational, while
      in fact it was not (the vif interface on the xen-netback side might not
      be configured yet).
      
      This was a workaround for 'hotplug-status' erroneously being removed.
      But since that is reverted now, the workaround is not necessary either.
      
      More discussion at
      https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u
      
      Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Reviewed-by: Paul Durrant <paul@xen.org>
      Reviewed-by: Michael Brown <mbrown@fensystems.co.uk>
      Link: https://lore.kernel.org/r/20220222001817.2264967-2-marmarek@invisiblethingslab.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e8240add
    • Marek Marczykowski-Górecki's avatar
      Revert "xen-netback: remove 'hotplug-status' once it has served its purpose" · 0f4558ae
      Marek Marczykowski-Górecki authored
      This reverts commit 1f256578.
      
      The 'hotplug-status' node should not be removed as long as the vif
      device remains configured. Otherwise xen-netback would wait for the
      network script to be re-run even if it was already called (in case of
      the frontend re-connecting). But also, it _should_ be removed when the
      vif device is destroyed (for example when unbinding the driver) -
      otherwise the hotplug script would not configure the device whenever it
      reappears.
      
      Moving removal of the 'hotplug-status' node was a workaround for nothing
      calling the network script after the xen-netback module is reloaded. But
      when the vif interface is re-created (on xen-netback unbind/bind for
      example), the script should be called, regardless of who does that -
      currently this case is not handled by the toolstack, and requires a
      manual script call. Keeping hotplug-status=connected to skip the call is
      wrong and leads to an unconfigured interface.
      
      More discussion at
      https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u
      
      Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Reviewed-by: Paul Durrant <paul@xen.org>
      Link: https://lore.kernel.org/r/20220222001817.2264967-1-marmarek@invisiblethingslab.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0f4558ae
    • Qu Wenruo's avatar
      btrfs: reduce extent threshold for autodefrag · 558732df
      Qu Wenruo authored
      There is a big gap between inode_should_defrag() and the autodefrag
      extent size threshold.  inode_should_defrag() has a flexible
      @small_write value: for compressed extents it is 16K, and for
      non-compressed extents it's 64K.
      
      However for autodefrag extent size threshold, it's always fixed to the
      default value (256K).
      
      This means, the following write sequence will trigger autodefrag to
      defrag ranges which didn't trigger autodefrag:
      
        pwrite 0 8k
        sync
        pwrite 8k 128K
        sync
      
      The latter 128K write will also be considered as a defrag target (if
      other conditions are met). While only that 8K write is really
      triggering autodefrag.
      
      Such behavior can cause extra IO for autodefrag.
      
      Close the gap, by copying the @small_write value into inode_defrag, so
      that later autodefrag can use the same @small_write value which
      triggered autodefrag.
      
      With the existing transid value, this allows autodefrag really to scan
      the ranges which triggered autodefrag.
      
      Although this behavior change is mostly reducing the extent_thresh value
      for autodefrag, I believe in the future we should allow users to specify
      the autodefrag extent threshold through mount options, but that's
      another problem to consider in the future.
      
      CC: stable@vger.kernel.org # 5.16+
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      558732df
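      The threshold gap described in the autodefrag commit above can be
      worked through with the write sequence from the commit message.
      This is a sketch under stated assumptions, not the btrfs code: the
      helper names are hypothetical, and an extent is assumed to be a
      defrag target when its size is strictly below the threshold.

```python
KIB = 1024
DEFAULT_EXTENT_THRESH = 256 * KIB  # fixed autodefrag threshold before the change


def small_write_threshold(compressed):
    """@small_write as described above: 16K compressed, 64K otherwise."""
    return 16 * KIB if compressed else 64 * KIB


def defrag_targets(extents, extent_thresh):
    """Return (offset, size) extents below the threshold (assumed strict)."""
    return [(off, size) for off, size in extents if size < extent_thresh]


# pwrite 0 8k; sync; pwrite 8k 128K; sync
extents = [(0, 8 * KIB), (8 * KIB, 128 * KIB)]

before = defrag_targets(extents, DEFAULT_EXTENT_THRESH)
after = defrag_targets(extents, small_write_threshold(compressed=False))

print("before:", before)
print("after: ", after)
```

      With the fixed 256K threshold both extents qualify, so the 128K
      write is rewritten as well. With the 64K @small_write value only
      the 8K extent that actually triggered autodefrag remains a target.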