Commits · fdb6649ab7c142e497539a471e573c2593b9c923 · Kirill Smelkov / linux

20 Sep, 2022 2 commits

x86/asm/bitops: Use __builtin_ctzl() to evaluate constant expressions · fdb6649a

Vincent Mailhol authored Sep 07, 2022

If x is not 0, __ffs(x) is equivalent to:
  (unsigned long)__builtin_ctzl(x)
And if x is not ~0UL, ffz(x) is equivalent to:
  (unsigned long)__builtin_ctzl(~x)
Because __builting_ctzl() returns an int, a cast to (unsigned long) is
necessary to avoid potential warnings on implicit casts.

Concerning the edge cases, __builtin_ctzl(0) is always undefined,
whereas __ffs(0) and ffz(~0UL) may or may not be defined, depending on
the processor. Regardless, for both functions, developers are asked to
check against 0 or ~0UL so replacing __ffs() or ffz() by
__builting_ctzl() is safe.

For x86_64, the current __ffs() and ffz() implementations do not
produce optimized code when called with a constant expression. On the
contrary, the __builtin_ctzl() folds into a single instruction.

However, for non constant expressions, the __ffs() and ffz() asm
versions of the kernel remains slightly better than the code produced
by GCC (it produces a useless instruction to clear eax).

Use __builtin_constant_p() to select between the kernel's
__ffs()/ffz() and the __builtin_ctzl() depending on whether the
argument is constant or not.

** Statistics **

On a allyesconfig, before...:

  $ objdump -d vmlinux.o | grep tzcnt | wc -l
  3607

...and after:

  $ objdump -d vmlinux.o | grep tzcnt | wc -l
  2600

So, roughly 27.9% of the calls to either __ffs() or ffz() were using
constant expressions and could be optimized out.

(tests done on linux v5.18-rc5 x86_64 using GCC 11.2.1)

Note: on x86_64, the BSF instruction produces TZCNT when used with the
REP prefix (which explain the use of `grep tzcnt' instead of `grep bsf'
in above benchmark). c.f. [1]

[1] e26a44a2 ("x86: Use REP BSF unconditionally")

  [ bp: Massage commit message. ]
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20220511160319.1045812-1-mailhol.vincent@wanadoo.fr

fdb6649a

x86/asm/bitops: Use __builtin_ffs() to evaluate constant expressions · 146034fe

Vincent Mailhol authored Sep 07, 2022

For x86_64, the current ffs() implementation does not produce optimized
code when called with a constant expression. On the contrary, the
__builtin_ffs() functions of both GCC and clang are able to fold the
expression into a single instruction.

** Example **

Consider two dummy functions foo() and bar() as below:

  #include <linux/bitops.h>
  #define CONST 0x01000000

  unsigned int foo(void)
  {
  	return ffs(CONST);
  }

  unsigned int bar(void)
  {
  	return __builtin_ffs(CONST);
  }

GCC would produce below assembly code:

  0000000000000000 <foo>:
     0:	ba 00 00 00 01       	mov    $0x1000000,%edx
     5:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
     a:	0f bc c2             	bsf    %edx,%eax
     d:	83 c0 01             	add    $0x1,%eax
    10:	c3                   	ret
  <Instructions after ret and before next function were redacted>

  0000000000000020 <bar>:
    20:	b8 19 00 00 00       	mov    $0x19,%eax
    25:	c3                   	ret

And clang would produce:

  0000000000000000 <foo>:
     0:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
     5:	0f bc 05 00 00 00 00 	bsf    0x0(%rip),%eax        # c <foo+0xc>
     c:	83 c0 01             	add    $0x1,%eax
     f:	c3                   	ret

  0000000000000010 <bar>:
    10:	b8 19 00 00 00       	mov    $0x19,%eax
    15:	c3                   	ret

Both examples clearly demonstrate the benefit of using __builtin_ffs()
instead of the kernel's asm implementation for constant expressions.

However, for non constant expressions, the kernel's ffs() asm version
remains better for x86_64 because, contrary to GCC, it doesn't emit the
CMOV assembly instruction, c.f. [1] (noticeably, clang is able optimize
out the CMOV call).

Use __builtin_constant_p() to select between the kernel's ffs() and
the __builtin_ffs() depending on whether the argument is constant or
not.

As a side benefit, replacing the ffs() function declaration by a macro
also removes below -Wshadow warning:

  ./arch/x86/include/asm/bitops.h:283:28: warning: declaration of 'ffs' shadows a built-in function [-Wshadow]
    283 | static __always_inline int ffs(int x)

** Statistics **

On a allyesconfig, before...:

  $ objdump -d vmlinux.o | grep bsf | wc -l
  1081

...and after:

  $ objdump -d vmlinux.o | grep bsf | wc -l
  792

So, roughly 26.7% of the calls to ffs() were using constant
expressions and could be optimized out.

(tests done on linux v5.18-rc5 x86_64 using GCC 11.2.1)

[1] commit ca3d30cc ("x86_64, asm: Optimise fls(), ffs() and fls64()")

  [ bp: Massage commit message. ]
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20220511160319.1045812-1-mailhol.vincent@wanadoo.fr

146034fe

18 Sep, 2022 5 commits

Linux 6.0-rc6 · 521a547c
Linus Torvalds authored Sep 18, 2022

521a547c

Merge tag 'parisc-for-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 7c18b453

Linus Torvalds authored Sep 18, 2022

Pull parisc architecture fixes from Helge Deller:
 "Some small parisc architecture fixes for 6.0-rc6:

  One patch lightens up a previous commit and thus unbreaks building the
  debian kernel, which tries to configure a 64-bit kernel with the
  ARCH=parisc environment variable set.

  The other patches fixes asm/errno.h includes in the tools directory
  and cleans up memory allocation in the iosapic driver.

  Summary:

   - Allow configuring 64-bit kernel with ARCH=parisc

   - Fix asm/errno.h includes in tools directory for parisc and xtensa

   - Clean up iosapic memory allocation

   - Minor typo and spelling fixes"

* tag 'parisc-for-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Allow CONFIG_64BIT with ARCH=parisc
  parisc: remove obsolete manual allocation aligning in iosapic
  tools/include/uapi: Fix <asm/errno.h> for parisc and xtensa
  Input: hp_sdc: fix spelling typo in comment
  parisc: ccio-dma: Add missing iounmap in error path in ccio_probe()

7c18b453

Merge tag 'io_uring-6.0-2022-09-18' of git://git.kernel.dk/linux · 38eddeed

Linus Torvalds authored Sep 18, 2022

Pull io_uring fixes from Jens Axboe:
 "Nothing really major here, but figured it'd be nicer to just get these
  flushed out for -rc6 so that the 6.1 branch will have them as well.
  That'll make our lives easier going forward in terms of development,
  and avoid trivial conflicts in this area.

   - Simple trace rename so that the returned opcode name is consistent
     with the enum definition (Stefan)

   - Send zc rsrc request vs notification lifetime fix (Pavel)"

* tag 'io_uring-6.0-2022-09-18' of git://git.kernel.dk/linux:
  io_uring/opdef: rename SENDZC_NOTIF to SEND_ZC
  io_uring/net: fix zc fixed buf lifetime

38eddeed

io_uring/opdef: rename SENDZC_NOTIF to SEND_ZC · 9bd3f728

Stefan Metzmacher authored Sep 16, 2022

It's confusing to see the string SENDZC_NOTIF in ftrace output
when using IORING_OP_SEND_ZC.

Fixes: b48c312b ("io_uring/net: simplify zerocopy send user API")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: io-uring@vger.kernel.org
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8e5cd8616919c92b6c3c7b6ea419fdffd5b97f3c.1663363798.git.metze@samba.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

9bd3f728

io_uring/net: fix zc fixed buf lifetime · e3366e02

Pavel Begunkov authored Sep 16, 2022

Notifications usually outlive requests, so we need to pin buffers with
it by assigning a rsrc to it instead of the request.

Fixed: b48c312b ("io_uring/net: simplify zerocopy send user API")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/dd6406ff8a90887f2b36ed6205dac9fda17c1f35.1663366886.git.asml.silence@gmail.comReviewed-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e3366e02

16 Sep, 2022 9 commits

Merge tag 'gpio-fixes-for-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · a335366b

Linus Torvalds authored Sep 16, 2022

Pull gpio fixes from Bartosz Golaszewski:

 - fix the level-low interrupt type support in gpio-mpc8xxx

 - convert another two drivers to using immutable irq chips

 - MAINTAINERS update

* tag 'gpio-fixes-for-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: mt7621: Make the irqchip immutable
  gpio: ixp4xx: Make irqchip immutable
  MAINTAINERS: Update HiSilicon GPIO Driver maintainer
  gpio: mpc8xxx: Fix support for IRQ_TYPE_LEVEL_LOW flow_type in mpc85xx

a335366b

Merge tag 'pinctrl-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 6879c2d3

Linus Torvalds authored Sep 16, 2022

Pull pin control fixes from Linus Walleij:
 "Nothing special, just driver fixes:

   - Fix IRQ wakeup and pins for UFS and SDC2 issues on the Qualcomm
     SC8180x

   - Fix the Rockchip driver to support interrupt on both rising and
     falling edges.

   - Name the Allwinner A100 R_PIO properly

   - Fix several issues with the Ocelot interrupts"

* tag 'pinctrl-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: ocelot: Fix interrupt controller
  pinctrl: sunxi: Fix name for A100 R_PIO
  pinctrl: rockchip: Enhance support for IRQ_TYPE_EDGE_BOTH
  pinctrl: qcom: sc8180x: Fix wrong pin numbers
  pinctrl: qcom: sc8180x: Fix gpio_wakeirq_map

6879c2d3

Merge tag 'block-6.0-2022-09-16' of git://git.kernel.dk/linux-block · 68e777e4

Linus Torvalds authored Sep 16, 2022

Pull block fixes from Jens Axboe:
 "Two fixes for -rc6:

   - Fix a mixup of sectors and bytes in the secure erase ioctl
     (Mikulas)

   - Fix for a bad return value for a non-blocking bio/blk queue enter
     call (me)"

* tag 'block-6.0-2022-09-16' of git://git.kernel.dk/linux-block:
  blk-lib: fix blkdev_issue_secure_erase
  block: blk_queue_enter() / __bio_queue_enter() must return -EAGAIN for nowait

68e777e4

Merge tag 'io_uring-6.0-2022-09-16' of git://git.kernel.dk/linux-block · 0158137d

Linus Torvalds authored Sep 16, 2022

Pull io_uring fixes from Jens Axboe:
 "Two small patches:

   - Fix using an unsigned type for the return value, introduced in this
     release (Pavel)

   - Stable fix for a missing check for a fixed file on put (me)"

* tag 'io_uring-6.0-2022-09-16' of git://git.kernel.dk/linux-block:
  io_uring/msg_ring: check file type before putting
  io_uring/rw: fix error'ed retry return values

0158137d

Merge tag 'drm-fixes-2022-09-16' of git://anongit.freedesktop.org/drm/drm · 5763d7f2

Linus Torvalds authored Sep 16, 2022

Pull drm fixes from Dave Airlie:
 "This is the regular drm fixes pull.

  The i915 and misc fixes are fairly regular, but the amdgpu contains
  fixes for new hw blocks, the dcn314 specific path hookups and also has
  a bunch of fixes for clang stack size warnings which are a bit churny
  but fairly straightforward. This means it looks a little larger than
  usual.

  amdgpu:
   - BACO fixes for some RDNA2 boards
   - PCI AER fixes uncovered by a core PCI change
   - Properly hook up dirtyfb helper
   - RAS fixes for GC 11.x
   - TMR fix
   - DCN 3.2.x fixes
   - DCN 3.1.4 fixes
   - LLVM DML stack size fixes

  i915:
   - Revert a display patch around max DP source rate now that the
     proper WaEdpLinkRateDataReload is in place
   - Fix perf limit reasons bit position
   - Fix unclaimmed mmio registers on suspend flow with GuC
   - A vma_move_to_active fix for a regression with video decoding
   - DP DSP fix

  gma500:
   - Locking and IRQ fixes

  meson:
   - OSD1 display fixes

  panel-edp:
   - Fix Innolux timings

  rockchip:
   - DP/HDMI fixes"

* tag 'drm-fixes-2022-09-16' of git://anongit.freedesktop.org/drm/drm: (42 commits)
  drm/amdgpu: make sure to init common IP before gmc
  drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega
  drm/amdgpu: move nbio ih_doorbell_range() into ih code for vega
  drm/rockchip: Fix return type of cdn_dp_connector_mode_valid
  drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage
  drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()
  drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()
  drm/amd/display: Reduce number of arguments of dml32_CalculatePrefetchSchedule()
  drm/amd/display: Reduce number of arguments of dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
  drm/amd/display: Refactor SubVP calculation to remove FPU
  drm/amd/display: Limit user regamma to a valid value
  drm/amd/display: add workaround for subvp cursor corruption for DCN32/321
  drm/amd/display: SW cursor fallback for SubVP
  drm/amd/display: Round cursor width up for MALL allocation
  drm/amd/display: Correct dram channel width for dcn314
  drm/amd/display: Relax swizzle checks for video non-RGB formats on DCN314
  drm/amd/display: Hook up DCN314 specific dml implementation
  drm/amd/display: Enable dlg and vba compilation for dcn314
  drm/amd/display: Fix compilation errors on DCN314
  drm/amd/display: Fix divide by zero in DML
  ...

5763d7f2

Merge tag '6.0-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 714820c6

Linus Torvalds authored Sep 16, 2022

Pull cifs fixes from Steve French:
 "Four smb3 fixes for stable:

   - important fix to revalidate mapping when doing direct writes

   - missing spinlock

   - two fixes to socket handling

   - trivial change to update internal version number for cifs.ko"

* tag '6.0-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module number
  cifs: add missing spinlock around tcon refcount
  cifs: always initialize struct msghdr smb_msg completely
  cifs: don't send down the destination address to sendmsg for a SOCK_STREAM
  cifs: revalidate mapping when doing direct writes

714820c6

Merge tag 'drm-intel-fixes-2022-09-15' of... · 25100377

Dave Airlie authored Sep 16, 2022

Merge tag 'drm-intel-fixes-2022-09-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Revert a display patch around max DP source rate now
  that the proper WaEdpLinkRateDataReload is in place. (Ville)
- Fix perf limit reasons bit position. (Ashutosh)
- Fix unclaimmed mmio registers on suspend flow with GuC. (Umesh)
- A vma_move_to_active fix for a regression with video decoding. (Nirmoy)
- DP DSP fix. (Ankit)
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YyMtmGMXRLsURoM5@intel.com

25100377

Merge tag 'drm-misc-fixes-2022-09-15' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 87d9862b

Dave Airlie authored Sep 16, 2022

Short summary of fixes pull:

 * gma500: Locking and IRQ fixes
 * meson: OSD1 display fixes
 * panel-edp: Fix Innolux timings
 * rockchip: DP/HDMI fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YyMUpP1w21CPXq+I@linux-uq9g

87d9862b

Merge tag 'amd-drm-fixes-6.0-2022-09-14' of... · e2111ae2

Dave Airlie authored Sep 16, 2022

Merge tag 'amd-drm-fixes-6.0-2022-09-14' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.0-2022-09-14:

amdgpu:
- BACO fixes for some RDNA2 boards
- PCI AER fixes uncovered by a core PCI change
- Properly hook up dirtyfb helper
- RAS fixes for GC 11.x
- TMR fix
- DCN 3.2.x fixes
- DCN 3.1.4 fixes
- LLVM DML stack size fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220914184030.6145-1-alexander.deucher@amd.com

e2111ae2

15 Sep, 2022 4 commits

io_uring/msg_ring: check file type before putting · fc7222c3

Jens Axboe authored Sep 15, 2022

If we're invoked with a fixed file, follow the normal rules of not
calling io_fput_file(). Fixed files are permanently registered to the
ring, and do not need putting separately.

Cc: stable@vger.kernel.org
Fixes: aa184e86 ("io_uring: don't attempt to IOPOLL for MSG_RING requests")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fc7222c3

blk-lib: fix blkdev_issue_secure_erase · c4fa3684

Mikulas Patocka authored Sep 14, 2022

There's a bug in blkdev_issue_secure_erase. The statement
"unsigned int len = min_t(sector_t, nr_sects, max_sectors);"
sets the variable "len" to the length in sectors, but the statement
"bio->bi_iter.bi_size = len" treats it as if it were in bytes.
The statements "sector += len << SECTOR_SHIFT" and "nr_sects -= len <<
SECTOR_SHIFT" are thinko.

This patch fixes it.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v5.19
Fixes: 44abff2c ("block: decouple REQ_OP_SECURE_ERASE from REQ_OP_DISCARD")
Link: https://lore.kernel.org/r/alpine.LRH.2.02.2209141549480.28100@file01.intranet.prod.int.rdu2.redhat.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

c4fa3684

parisc: Allow CONFIG_64BIT with ARCH=parisc · 805ce861

Helge Deller authored Sep 13, 2022

The previous patch triggered a build failure for the debian kernel,
which has CONFIG_64BIT enabled, uses the CROSS_COMPILER environment
variable and uses ARCH=parisc to configure the kernel for 64-bit
support.

This patch weakens the previous patch while keeping the recommended way
to configure the kernel with:
    ARCH=parisc     -> build 32-bit kernel
    ARCH=parisc64   -> build 64-bit kernel
while adding the possibility for debian to configure a 64-bit kernel
even if ARCH=parisc is set (PA8X00 CPU has to be selected and
CONFIG_64BIT needs to be enabled).

The downside of this patch is, that we now have a small window open
again where people may get it wrong: if they enable CONFIG_64BIT and try
to compile with a 32-bit compiler.

Fixes: 3dcfb729 ("parisc: Make CONFIG_64BIT available for ARCH=parisc64 only")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # 5.15+

805ce861

parisc: remove obsolete manual allocation aligning in iosapic · e359b70c

Rolf Eike Beer authored Sep 14, 2022

kmalloc() returns memory with __assume_kmalloc_alignment, which is
__alignof__(unsigned long long) for parisc.
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Helge Deller <deller@gmx.de>

e359b70c

14 Sep, 2022 11 commits

drm/amdgpu: make sure to init common IP before gmc · a8671493

Alex Deucher authored Aug 30, 2022

Move common IP init before GMC init so that HDP gets
remapped before GMC init which uses it.

This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

a8671493

drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega · e3163bc8

Alex Deucher authored Sep 09, 2022

This mirrors what we do for other asics and this way we are
sure the sdma doorbell range is properly initialized.

There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized.  However, the statement says that it applies to
multimedia as well, but the VCN code currently initializes
doorbells after GFX and there are no known issues there.  In my
testing at least I don't see any problems on SDMA.

This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

e3163bc8

drm/amdgpu: move nbio ih_doorbell_range() into ih code for vega · dc1d85cb

Alex Deucher authored Sep 09, 2022

This mirrors what we do for other asics and this way we are
sure the ih doorbell range is properly initialized.

There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized.  In this case IH is initialized before GFX,
so there should be no issue.

This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

dc1d85cb

pinctrl: ocelot: Fix interrupt controller · c297561b

Horatiu Vultur authored Sep 09, 2022

When an external device generated a level based interrupt then the
interrupt controller could miss the interrupt. The reason is that the
interrupt controller can detect only link changes.

In the following example, if there is a PHY that generates an interrupt
then the following would happen. The GPIO detected that the interrupt
line changed, and then the 'ocelot_irq_handler' was called. Here it
detects which GPIO line saw the change and for that will call the
following:
1. irq_mask
2. phy interrupt routine
3. irq_eoi
4. irq_unmask

And this works fine for simple cases, but if the PHY generates many
interrupts, for example when doing PTP timestamping, then the following
could happen. Again the function 'ocelot_irq_handler' will be called
and then from here the following could happen:
1. irq_mask
2. phy interrupt routine
3. irq_eoi
4. irq_unmask

Right before step 3(irq_eoi), the PHY will generate another interrupt.
Now the interrupt controller will acknowledge the change in the
interrupt line. So we miss the interrupt.

A solution will be to use 'handle_level_irq' instead of
'handle_fasteoi_irq', because for this will change routine order of
handling the interrupt.
1. irq_mask
2. irq_ack
3. phy interrupt routine
4. irq_unmask

And now if the PHY will generate a new interrupt before irq_unmask, the
interrupt controller will detect this because it already acknowledge the
change in interrupt line at step 2(irq_ack).

But this is not the full solution because there is another issue. In
case there are 2 PHYs that share the interrupt line. For example phy1
generates an interrupt, then the following can happen:
1.irq_mask
2.irq_ack
3.phy0 interrupt routine
4.phy1 interrupt routine
5.irq_unmask

In case phy0 will generate an interrupt while clearing the interrupt
source in phy1, then the interrupt line will be kept down by phy0. So
the interrupt controller will not see any changes in the interrupt line.
The solution here is to update 'irq_unmask' such that it can detect if
the interrupt line is still active or not. And if it is active then call
again the procedure to clear the interrupts. But we don't want to do it
every time, only if we know that the interrupt controller has not seen
already that the interrupt line has changed.

While at this, add support also for IRQ_TYPE_LEVEL_LOW.

Fixes: be36abb7 ("pinctrl: ocelot: add support for interrupt controller")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20220909145942.844102-1-horatiu.vultur@microchip.comSigned-off-by: Linus Walleij <linus.walleij@linaro.org>

c297561b

gpio: mt7621: Make the irqchip immutable · 09eed5a1

Sergio Paracuellos authored Sep 13, 2022

Commit 6c846d02 ("gpio: Don't fiddle with irqchips marked as
immutable") added a warning to indicate if the gpiolib is altering the
internals of irqchips. Following this change the following warnings
are now observed for the mt7621 driver:

gpio gpiochip0: (1e000600.gpio-bank0): not an immutable chip, please consider fixing it!
gpio gpiochip1: (1e000600.gpio-bank1): not an immutable chip, please consider fixing it!
gpio gpiochip2: (1e000600.gpio-bank2): not an immutable chip, please consider fixing it!

Fix this by making the irqchip in the mt7621 driver immutable.
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>

09eed5a1

Merge tag 'devicetree-fixes-for-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 3245cb65

Linus Torvalds authored Sep 14, 2022

Pull devicetree fixes from Rob Herring:

 - Update some stale binding maintainer emails

 - Fix property name error in apple,aic binding

 - Add missing param to of_dma_configure_id() stub

 - Fix an off-by-one error in unflatten_dt_nodes()

* tag 'devicetree-fixes-for-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: pinctrl: qcom: drop non-working codeaurora.org emails
  dt-bindings: power: qcom,rpmpd: drop non-working codeaurora.org emails
  dt-bindings: apple,aic: Fix required item "apple,fiq-index" in affinity description
  dt-bindings: interconnect: fsl,imx8m-noc: drop Leonard Crestez
  of/device: Fix up of_dma_configure_id() stub
  MAINTAINERS: Update email of Neil Armstrong
  of: fdt: fix off-by-one error in unflatten_dt_nodes()

3245cb65

cifs: update internal module number · 8af8aed9
Steve French authored Sep 13, 2022
```
To 2.39
Signed-off-by: Steve French <stfrench@microsoft.com>
```
8af8aed9

cifs: add missing spinlock around tcon refcount · 621a41ae

Paulo Alcantara authored Sep 13, 2022

Add missing spinlock to protect updates on tcon refcount in
cifs_put_tcon().

Fixes: d7d7a66a ("cifs: avoid use of global locks for high contention data")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

621a41ae

drm/rockchip: Fix return type of cdn_dp_connector_mode_valid · b0b9408f

Nathan Huckleberry authored Sep 13, 2022

The mode_valid field in drm_connector_helper_funcs is expected to be of
type:
enum drm_mode_status (* mode_valid) (struct drm_connector *connector,
				     struct drm_display_mode *mode);

The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition.

The return type of cdn_dp_connector_mode_valid should be changed from
int to enum drm_mode_status.
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Cc: llvm@lists.linux.dev
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913205555.155149-1-nhuck@google.com

b0b9408f

cifs: always initialize struct msghdr smb_msg completely · bedc8f76

Stefan Metzmacher authored Sep 14, 2022

So far we were just lucky because the uninitialized members
of struct msghdr are not used by default on a SOCK_STREAM tcp
socket.

But as new things like msg_ubuf and sg_from_iter where added
recently, we should play on the safe side and avoid potention
problems in future.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

bedc8f76

cifs: don't send down the destination address to sendmsg for a SOCK_STREAM · 17d3df38

Stefan Metzmacher authored Sep 14, 2022

This is ignored anyway by the tcp layer.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Cc: stable@vger.kernel.org
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>

17d3df38

13 Sep, 2022 9 commits

block: blk_queue_enter() / __bio_queue_enter() must return -EAGAIN for nowait · 56f99b8d

Stefan Roesch authored Sep 12, 2022

Today blk_queue_enter() and __bio_queue_enter() return -EBUSY for the
nowait code path. This is not correct: they should return -EAGAIN
instead.

This problem was detected by fio. The following command exposed the
above problem:

t/io_uring -p0 -d128 -b4096 -s32 -c32 -F1 -B0 -R0 -X1 -n24 -P1 -u1 -O0 /dev/ng0n1

By applying the patch, the retry case is handled correctly in the slow
path.
Signed-off-by: Stefan Roesch <shr@fb.com>
Fixes: bfd343aa ("blk-mq: don't wait in blk_mq_queue_enter() if __GFP_WAIT isn't set")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

56f99b8d

drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage · 41012d71

Nathan Chancellor authored Aug 30, 2022

This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
^
1 error generated.

Commit a0f7e7f7 ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.

To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Tested-by: Maíra Canal <mairacanal@riseup.net>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

41012d71

drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule() · 21485d3d

Nathan Chancellor authored Aug 30, 2022

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
      ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Tested-by: Maíra Canal <mairacanal@riseup.net>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

21485d3d

drm/amd/display: Reduce number of arguments of dml31's... · 37934d41

Nathan Chancellor authored Aug 30, 2022

drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
      ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Tested-by: Maíra Canal <mairacanal@riseup.net>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

37934d41

drm/amd/display: Reduce number of arguments of dml32_CalculatePrefetchSchedule() · a3fef74b

Nathan Chancellor authored Aug 30, 2022

Several of the arguments are identical between the two call sites and
they can be accessed through the 'struct vba_vars_st' pointer. This
reduces the total amount of stack space that
dml32_ModeSupportAndSystemConfigurationFull() uses by 208 bytes with
LLVM 16 (1936 -> 1728), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Additionally, while modifying the arguments to
dml32_CalculatePrefetchSchedule(), use 'v' consistently, instead of 'v'
mixed with 'mode_lib->vba'.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Tested-by: Maíra Canal <mairacanal@riseup.net>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a3fef74b

drm/amd/display: Reduce number of arguments of... · c4be0ac9

Nathan Chancellor authored Aug 30, 2022

drm/amd/display: Reduce number of arguments of dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer created at the
top of dml32_ModeSupportAndSystemConfigurationFull(). This reduces the
total amount of stack space that
dml32_ModeSupportAndSystemConfigurationFull() uses by 216 bytes with
LLVM 16 (2152 -> 1936), helping clear up the following clang warning:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Additionally, while modifying the arguments to
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(), use 'v'
consistently, instead of 'v' mixed with 'mode_lib->vba'.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Tested-by: Maíra Canal <mairacanal@riseup.net>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c4be0ac9

drm/amd/display: Refactor SubVP calculation to remove FPU · d978c51f

Alvin Lee authored Aug 06, 2022

Refactor calculation to remove floating point operations from dmub_srv.
To ensure that 32-bit compilation works well, we use the div64 family of
macros to do integer division for SubVP-related timing parameters.

Cc: Maíra Canal <mairacanal@riseup.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Isabella Basso <isabbasso@riseup.net>
Cc: Magali Lemes <magalilemes00@gmail.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Samson Tam <Samson.Tam@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Co-developed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Co-developed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d978c51f

drm/amd/display: Limit user regamma to a valid value · 3601d620

Yao Wang1 authored Aug 22, 2022

[Why]
For HDR mode, we get total 512 tf_point and after switching to SDR mode
we actually get 400 tf_point and the rest of points(401~512) still use
dirty value from HDR mode. We should limit the rest of the points to max
value.

[How]
Limit the value when coordinates_x.x > 1, just like what we do in
translate_from_linear_space for other re-gamma build paths.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com>
Signed-off-by: Yao Wang1 <Yao.Wang1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3601d620

drm/amd/display: add workaround for subvp cursor corruption for DCN32/321 · ceb75600

Aurabindo Pillai authored Aug 25, 2022

[Why&How]
Kernel does not have a means to tell the userspace to use software
cursor. Due to lack of this functionality, reducing the max cursor size
is the only way to ensure that power savings of Subview port feature is
utilized for asics that support it. The workaround could be removed
after cursor caching is fixed while a subviewport config is active.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ceb75600