  1. 06 Mar, 2019 1 commit
    • ipc: Fix building compat mode without sysvipc · 7e89a37c
      Arnd Bergmann authored
      As John Stultz noticed, my y2038 syscall series caused a link
      failure when CONFIG_SYSVIPC is disabled but CONFIG_COMPAT is
      enabled:
      
      arch/arm64/kernel/sys32.o:(.rodata+0x960): undefined reference to `__arm64_compat_sys_old_semctl'
      arch/arm64/kernel/sys32.o:(.rodata+0x980): undefined reference to `__arm64_compat_sys_old_msgctl'
      arch/arm64/kernel/sys32.o:(.rodata+0x9a0): undefined reference to `__arm64_compat_sys_old_shmctl'
      
      Add the missing entries in kernel/sys_ni.c for the new system
      calls.
      
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
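
The linking idea behind this fix can be sketched in user space. The kernel provides weak fallback definitions that return -ENOSYS, so a syscall table still links when the real implementation is compiled out. What follows is only an illustration under assumptions, not the kernel's code: every `sketch_` name is hypothetical, and the example relies on the GCC/Clang `weak` and `alias` attributes on an ELF target.

```c
#include <assert.h>
#include <errno.h>

/* Generic "not implemented" handler, returning -ENOSYS. */
long sketch_sys_ni_syscall(void)
{
    return -ENOSYS;
}

/*
 * Weak alias: if no other object file defines this symbol (for example
 * because the SysV IPC code is not built), the linker resolves it to the
 * stub above instead of failing with "undefined reference".
 */
long sketch_compat_sys_old_semctl(void)
    __attribute__((weak, alias("sketch_sys_ni_syscall")));

/* A table entry that references the symbol, as a syscall table would. */
typedef long (*sketch_syscall_fn)(void);
static const sketch_syscall_fn sketch_table[] = {
    sketch_compat_sys_old_semctl,
};

/* Dispatch through the table; nr must be a valid index. */
long sketch_dispatch(unsigned int nr)
{
    return sketch_table[nr]();
}
```

With no strong definition present, `sketch_dispatch(0)` reaches the stub and reports that the call is not implemented.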
  2. 27 Feb, 2019 1 commit
    • Merge tag 'y2038-syscall-abi' of... · cfbe2716
      Thomas Gleixner authored
      Merge tag 'y2038-syscall-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground into timers/2038
      
      Pull additional syscall ABI cleanup for y2038 from Arnd Bergmann:
      
      This is a follow-up to the y2038 syscall patches already merged in the tip
      tree.  As the final 32-bit RISC-V syscall ABI is still being decided on,
      this is the last chance to make a few corrections to leave out interfaces
      based on 32-bit time_t along with the old off_t and rlimit types.
      
      The series achieves this in a few steps:
      
      - A couple of bug fixes for minor regressions I introduced
        in the original series
      
      - A couple of older patches from Yury Norov that I had never
        merged in the past, these fix up the openat/open_by_handle_at and
        getrlimit/setrlimit syscalls to disallow the old versions of off_t
        and rlimit.
      
      - Hiding the deprecated system calls behind an #ifdef in
        include/uapi/asm-generic/unistd.h
      
      - Change arch/riscv to drop all these ABIs.
      
      Originally, the plan was to also leave these out on C-Sky, but that now
      has a glibc port that uses the older interfaces, so we need to leave
      them in place.
  3. 25 Feb, 2019 1 commit
  4. 19 Feb, 2019 5 commits
    • checksyscalls: fix up mq_timedreceive and stat exceptions · 1d5b8233
      Arnd Bergmann authored
      mq_timedreceive was spelled incorrectly, and we need exceptions
      for new architectures that leave out newstat or stat64, implementing
      only statx() now.
      
      Fixes: 48166e6e ("y2038: add 64-bit time_t syscalls to all 32-bit architectures")
      Fixes: bf4b6a7d ("y2038: Remove stat64 family from default syscall set")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • unicore32: Fix __ARCH_WANT_STAT64 definition · 8e9f51a8
      Arnd Bergmann authored
      The __ARCH_WANT_STAT64 macro must be defined before including
      asm-generic/unistd.h. I got this right for everything except
      unicore32.
      
      Fixes: bf4b6a7d ("y2038: Remove stat64 family from default syscall set")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • asm-generic: Make time32 syscall numbers optional · c8ce48f0
      Arnd Bergmann authored
      We don't want new architectures to even provide the old 32-bit time_t
      based system calls any more, or define the syscall number macros.
      
      Add a new __ARCH_WANT_TIME32_SYSCALLS macro that gets enabled for all
      existing 32-bit architectures using the generic system call table,
      so we don't change any current behavior.
      Since this symbol is evaluated in user space as well, we cannot use
      a Kconfig CONFIG_* macro but have to define it in uapi/asm/unistd.h.
      
      On 64-bit architectures, the same system call numbers mostly refer to
      the system calls we want to keep, as they already pass 64-bit time_t.
      
      As new architectures no longer provide these, we need new exceptions
      in checksyscalls.sh.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
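
The opt-in described in this commit can be pictured as a preprocessor guard. The sketch below is a simplified model under assumptions: the `SKETCH_` macros are hypothetical stand-ins for the architecture header and the generic header, and the legacy number 113 is only an illustrative value (403 is taken from the time64 list in this log).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an existing 32-bit architecture's unistd.h. */
#define SKETCH_BITS_PER_LONG 32
#define SKETCH_ARCH_WANT_TIME32_SYSCALLS

/* Hypothetical stand-in for the generic header included afterwards. */
#if defined(SKETCH_ARCH_WANT_TIME32_SYSCALLS) || SKETCH_BITS_PER_LONG != 32
#define SKETCH_NR_clock_gettime 113   /* legacy number, illustrative value */
#endif
#define SKETCH_NR_clock_gettime64 403 /* time64 variant */

/* Reports whether the legacy syscall number is visible to this build. */
static bool sketch_has_legacy_clock_gettime(void)
{
#ifdef SKETCH_NR_clock_gettime
    return true;
#else
    return false;
#endif
}
```

A new 32-bit architecture would leave the opt-in macro undefined, so code that still refers to the legacy number fails at compile time rather than at run time.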
    • asm-generic: Drop getrlimit and setrlimit syscalls from default list · 80d7da1c
      Yury Norov authored
      The newer prlimit64 syscall provides all the functionality of getrlimit
      and setrlimit syscalls and adds the pid of target process, so future
      architectures won't need to include getrlimit and setrlimit.
      
      Therefore drop getrlimit and setrlimit syscalls from the generic syscall
      list unless __ARCH_WANT_SET_GET_RLIMIT is defined by the architecture's
      unistd.h prior to including asm-generic/unistd.h, and adjust all
      architectures using the generic syscall list to define it so that no
      in-tree architectures are affected.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-hexagon@vger.kernel.org
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: James Hogan <james.hogan@imgtec.com> [metag]
      Acked-by: Ley Foon Tan <lftan@altera.com> [nios2]
      Acked-by: Stafford Horne <shorne@gmail.com> [openrisc]
      Acked-by: Will Deacon <will.deacon@arm.com> [arm64]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> #arch/arc bits
      Signed-off-by: Yury Norov <ynorov@marvell.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
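
As a rough illustration of why prlimit64 makes the two older calls redundant, a C library could implement both on top of it. This is a minimal sketch for Linux, assuming `SYS_prlimit64` is provided by `<sys/syscall.h>`; the `sketch_` names and the struct are hypothetical, and a real libc would also translate between its own `struct rlimit` and the 64-bit layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Layout used by prlimit64: two 64-bit values on every ABI. */
struct sketch_rlimit64 {
    uint64_t cur;
    uint64_t max;
};

/* getrlimit() expressed through prlimit64; pid 0 is the calling process. */
static int sketch_getrlimit(int resource, struct sketch_rlimit64 *out)
{
    return (int)syscall(SYS_prlimit64, (long)0, (long)resource, NULL, out);
}

/* setrlimit() expressed the same way; the old limit is not requested. */
static int sketch_setrlimit(int resource, const struct sketch_rlimit64 *in)
{
    return (int)syscall(SYS_prlimit64, (long)0, (long)resource, in, NULL);
}
```

Reading a limit with `sketch_getrlimit(RLIMIT_CORE, &lim)` and writing the same value back with `sketch_setrlimit()` should both succeed for the calling process.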
    • 32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option · 942fa985
      Yury Norov authored
      All new 32-bit architectures should have a 64-bit userspace off_t type, but
      existing architectures have 32-bit ones.
      
      To enforce the rule, a new config option is added to arch/Kconfig that defaults
      ARCH_32BIT_OFF_T to be disabled for new 32-bit architectures. All existing
      32-bit architectures enable it explicitly.
      
      The new option affects force_o_largefile() behaviour. Namely, if userspace
      off_t is 64 bits long, we have no reason to prevent the user from opening big files.
      
      Note that even if an architecture has only 64-bit off_t in the kernel
      (arc, c6x, h8300, hexagon, nios2, openrisc, and unicore32),
      a libc may use 32-bit off_t, and therefore want to limit the file size
      to 4GB unless specified differently in the open flags.
      Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Yury Norov <ynorov@marvell.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
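
The effect on force_o_largefile() can be modelled in a few lines. This is a simplified user-space sketch, not the kernel's definition: the flag value and all `sketch_` names are hypothetical, and the boolean parameter stands in for the ARCH_32BIT_OFF_T config option.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value; the real O_LARGEFILE is architecture-specific. */
#define SKETCH_O_LARGEFILE 0x8000

/* Large files are forced on unless the architecture opted into 32-bit off_t. */
static bool sketch_force_o_largefile(bool arch_32bit_off_t)
{
    return !arch_32bit_off_t;
}

/* Flags the open path would end up using for a given user request. */
static int sketch_open_flags(int user_flags, bool arch_32bit_off_t)
{
    if (sketch_force_o_largefile(arch_32bit_off_t))
        user_flags |= SKETCH_O_LARGEFILE;
    return user_flags;
}
```

On an architecture that keeps the option enabled, the flag is only set when user space asks for it, which preserves the size limit that a libc with 32-bit off_t expects.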
  5. 18 Feb, 2019 1 commit
  6. 10 Feb, 2019 2 commits
    • Merge tag 'y2038-new-syscalls' of... · 41ea3910
      Thomas Gleixner authored
      Merge tag 'y2038-new-syscalls' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038
      
      Pull y2038 - time64 system calls from Arnd Bergmann:
      
      This series finally gets us to the point of having system calls with 64-bit
      time_t on all architectures, after a long time of incremental preparation
      patches.
      
      There was actually one conversion that I missed during the summer,
      i.e. Deepa's timex series, which I now updated based on the 5.0-rc1 changes
      and review comments.
      
      The following system calls are now added on all 32-bit architectures using
      the same system call numbers:
      
      403 clock_gettime64
      404 clock_settime64
      405 clock_adjtime64
      406 clock_getres_time64
      407 clock_nanosleep_time64
      408 timer_gettime64
      409 timer_settime64
      410 timerfd_gettime64
      411 timerfd_settime64
      412 utimensat_time64
      413 pselect6_time64
      414 ppoll_time64
      416 io_pgetevents_time64
      417 recvmmsg_time64
      418 mq_timedsend_time64
      419 mq_timedreceive_time64
      420 semtimedop_time64
      421 rt_sigtimedwait_time64
      422 futex_time64
      423 sched_rr_get_interval_time64
      
      Each one of these corresponds directly to an existing system call that
      includes a 'struct timespec' argument, or a structure containing a timespec
      or (in case of clock_adjtime) timeval. Not included here are new versions
      of getitimer/setitimer and getrusage/waitid, which are planned for the
      future but only needed to make a consistent API rather than for correct
      operation beyond y2038. These four system calls are based on 'timeval', and
      it has not been finally decided what the replacement kernel interface will
      use instead.
      
      So far, I have done a lot of build testing across most architectures, which
      has found a number of bugs. Runtime testing so far included testing LTP on
      32-bit ARM with the existing system calls, to ensure we do not regress for
      existing binaries, and a test with a 32-bit x86 build of LTP against a
      modified version of the musl C library that has been adapted to the new
      system call interface [3].  This library can be used for testing on all
      architectures supported by musl-1.1.21, but it is not how the support is
      getting integrated into the official musl release. Official musl support is
      planned but will require more invasive changes to the library.
      
      Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
      Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
      Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [3]
    • Merge tag 'y2038-syscall-cleanup' of... · fd659cc0
      Thomas Gleixner authored
      Merge tag 'y2038-syscall-cleanup' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038
      
      Pull preparatory work for y2038 changes from Arnd Bergmann:
      
      System call unification and cleanup
      
      The system call tables have diverged a bit over the years, and a number of
      the recent additions never made it into all architectures, for one reason
      or another.
      
      This is an attempt to clean it up as far as we can without breaking
      compatibility, doing a number of steps:
      
       - Add system calls that have not yet been integrated into all architectures
         but that we definitely want there. This includes {,f}statfs64() and
         get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally.
      
       - The s390 compat syscall handling is cleaned up to be more like what we
         do on other architectures, while keeping the 31-bit pointer
         extension. This was merged as a shared branch by the s390 maintainers
         and is included here in order to base the other patches on top.
      
       - Add the separate ipc syscalls on all architectures that traditionally
         only had sys_ipc(). This version is done without support for IPC_OLD
         that we have in sys_ipc. The new semtimedop_time64 syscall will only
         be added here, not in sys_ipc.
      
       - Add syscall numbers for a couple of syscalls that we probably don't need
         everywhere, in particular pkey_* and rseq, for the purpose of symmetry:
         if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I
         expect that any future system calls will get assigned on all platforms
         together, even when they appear to be specific to a single architecture.
      
       - Prepare for having the same system call numbers for any future calls. In
         combination with the generated tables, this hopefully makes it easier to
         add new calls across all architectures together.
      
      All of the above are technically separate from the y2038 work, but are done
      as preparation before we add the new 64-bit time_t system calls everywhere,
      providing a common baseline set of system calls.
      
      I expect that glibc and other libraries that want to use 64-bit time_t will
      require linux-5.1 kernel headers for building in the future, and at a much
      later point may also require linux-5.1 or a later version as the minimum
      kernel at runtime. Having a common baseline then allows the removal of many
      architecture or kernel version specific workarounds.
  7. 07 Feb, 2019 13 commits
  8. 06 Feb, 2019 16 commits
    • y2038: add 64-bit time_t syscalls to all 32-bit architectures · 48166e6e
      Arnd Bergmann authored
      This adds 21 new system calls on each ABI that has 32-bit time_t
      today. All of these have the exact same semantics as their existing
      counterparts, and the new ones all have macro names that end in 'time64'
      for clarification.
      
      This gets us to the point of being able to safely use a C library
      that has 64-bit time_t in user space. There are still a couple of
      loose ends to tie up in various areas of the code, but this is the
      big one, and should be entirely uncontroversial at this point.
      
      In particular, there are four system calls (getitimer, setitimer,
      waitid, and getrusage) that don't have a 64-bit counterpart yet,
      but these can all be safely implemented in the C library by wrapping
      around the existing system calls because the 32-bit time_t they
      pass only counts elapsed time, not time since the epoch. They
      will be dealt with later.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    • y2038: rename old time and utime syscalls · d33c577c
      Arnd Bergmann authored
      The time, stime, utime, utimes, and futimesat system calls are only
      used on older architectures, and we do not provide y2038 safe variants
      of them, as they are replaced by clock_gettime64, clock_settime64,
      and utimensat_time64.
      
      However, for consistency it seems better to have the 32-bit architectures
      that still use them call the "time32" entry points (leaving the
      traditional handlers for the 64-bit architectures), like we do for system
      calls that now require two versions.
      
      Note: We used to always define __ARCH_WANT_SYS_TIME and
      __ARCH_WANT_SYS_UTIME and only set __ARCH_WANT_COMPAT_SYS_TIME and
      __ARCH_WANT_SYS_UTIME32 for compat mode on 64-bit kernels. Now this is
      reversed: only 64-bit architectures set __ARCH_WANT_SYS_TIME/UTIME, while
      we need __ARCH_WANT_SYS_TIME32/UTIME32 for 32-bit architectures and compat
      mode. The resulting asm/unistd.h changes look a bit counterintuitive.
      
      This is only a cleanup patch and it should not change any behavior.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    • y2038: remove struct definition redirects · c70a772f
      Arnd Bergmann authored
      We now use 64-bit time_t on all architectures, so the __kernel_timex,
      __kernel_timeval and __kernel_timespec redirects can be removed
      after having served their purpose.
      
      This makes it all much less confusing, as the __kernel_* types
      now always refer to the same layout based on 64-bit time_t across
      all 32-bit and 64-bit architectures.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • y2038: use time32 syscall names on 32-bit · 00bf25d6
      Arnd Bergmann authored
      This is the big flip, where all 32-bit architectures set COMPAT_32BIT_TIME
      and use the _time32 system calls from the former compat layer instead
      of the system calls that take __kernel_timespec and similar arguments.
      
      The temporary redirects for __kernel_timespec, __kernel_itimerspec
      and __kernel_timex can get removed with this.
      
      It would be easy to split this commit by architecture, but with the new
      generated system call tables, it's easy enough to do it all at once,
      which makes it a little easier to check that the changes are the same
      in each table.
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • syscalls: remove obsolete __IGNORE_ macros · 805089c2
      Arnd Bergmann authored
      These are all for ignoring the lack of obsolete system calls,
      which have been marked the same way in scripts/checksyscalls.sh,
      so these can be removed.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    • y2038: syscalls: rename y2038 compat syscalls · 8dabe724
      Arnd Bergmann authored
      A lot of system calls that pass a time_t somewhere have an implementation
      using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
      been reworked so that this implementation can now be used on 32-bit
      architectures as well.
      
      The missing step is to redefine them using the regular SYSCALL_DEFINEx()
      to get them out of the compat namespace and make it possible to build them
      on 32-bit architectures.
      
      Any system call that ends in 'time' gets a '32' suffix on its name for
      that version, while the others get a '_time32' suffix, to distinguish
      them from the normal version, which takes a 64-bit time argument in the
      future.
      
      In this step, only 64-bit architectures are changed; doing this rename
      first lets us avoid touching the 32-bit architectures twice.
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
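
The naming rule stated in this commit message is mechanical enough to write down. The helper below is a hypothetical illustration of that rule only; it is not a function from the kernel tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the name of the 32-bit time_t variant of a system call:
 * names ending in "time" get "32" appended, all others get "_time32".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int sketch_time32_name(const char *name, char *buf, size_t len)
{
    size_t n = strlen(name);
    int ends_in_time = n >= 4 && strcmp(name + n - 4, "time") == 0;
    int written = snprintf(buf, len, "%s%s", name,
                           ends_in_time ? "32" : "_time32");

    return (written < 0 || (size_t)written >= len) ? -1 : 0;
}
```

Under this rule `clock_gettime` becomes `clock_gettime32`, while `futex` becomes `futex_time32`.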
    • x86/x32: use time64 versions of sigtimedwait and recvmmsg · 7948450d
      Arnd Bergmann authored
      x32 has always followed the time64 calling conventions of these
      syscalls, which required a special hack in compat_get_timespec
      aka get_old_timespec32 to continue working.
      
      Since we now have the time64 syscalls, use those explicitly.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • timex: change syscalls to use struct __kernel_timex · 3876ced4
      Deepa Dinamani authored
      struct timex is not y2038 safe.
      Switch all the syscall apis to use y2038 safe __kernel_timex.
      
      Note that sys_adjtimex() does not have a y2038 safe solution.  C libraries
      can implement it by calling clock_adjtime(CLOCK_REALTIME, ...).
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • timex: use __kernel_timex internally · ead25417
      Deepa Dinamani authored
      struct timex is not y2038 safe.
      Replace all uses of timex with y2038 safe __kernel_timex.
      
      Note that struct __kernel_timex is an ABI interface definition.
      We could define a new structure based on __kernel_timex that
      is only available internally instead. Right now, there isn't
      a strong motivation for this as the structure is isolated to
      a few defined struct timex interfaces and such a structure would
      be exactly the same as struct timex.
      
      The patch was generated by the following coccinelle script:
      
      virtual patch
      
      @depends on patch forall@
      identifier ts;
      expression e;
      @@
      (
      - struct timex ts;
      + struct __kernel_timex ts;
      |
      - struct timex ts = {};
      + struct __kernel_timex ts = {};
      |
      - struct timex ts = e;
      + struct __kernel_timex ts = e;
      |
      - struct timex *ts;
      + struct __kernel_timex *ts;
      |
      (memset \| copy_from_user \| copy_to_user \)(...,
      - sizeof(struct timex))
      + sizeof(struct __kernel_timex))
      )
      
      @depends on patch forall@
      identifier ts;
      identifier fn;
      @@
      fn(...,
      - struct timex *ts,
      + struct __kernel_timex *ts,
      ...) {
      ...
      }
      
      @depends on patch forall@
      identifier ts;
      identifier fn;
      @@
      fn(...,
      - struct timex *ts) {
      + struct __kernel_timex *ts) {
      ...
      }
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Cc: linux-alpha@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • sparc64: add custom adjtimex/clock_adjtime functions · 1a596398
      Arnd Bergmann authored
      sparc64 is the only architecture on Linux that has a 'timeval'
      definition with a 32-bit tv_usec but a 64-bit tv_sec. This causes
      problems for sparc32 compat mode when we convert it to use the
      new __kernel_timex type that has the same layout as all other
      64-bit architectures.
      
      To avoid adding sparc64 specific code into the generic adjtimex
      implementation, this adds a wrapper in the sparc64 system call handling
      that converts the sparc64 'timex' into the new '__kernel_timex'.
      
      At this point, the two structures are defined to be identical,
      but that will change in the next step once we convert sparc32.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • time: fix sys_timer_settime prototype · 50b93f30
      Arnd Bergmann authored
      A small typo has crept into the y2038 conversion of the timer_settime
      system call. So far this was completely harmless, but once we start
      using the new version, this has to be fixed.
      
      Fixes: 6ff84735 ("time: Change types to new y2038 safe __kernel_itimerspec")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • time: Add struct __kernel_timex · 2c620ff9
      Deepa Dinamani authored
      struct timex uses struct timeval internally.
      struct timeval is not y2038 safe.
      Introduce a new UAPI type struct __kernel_timex
      that is y2038 safe.
      
      struct __kernel_timex uses a timeval type that is
      similar to struct __kernel_timespec which preserves the
      same structure size across 32 bit and 64 bit ABIs.
      struct __kernel_timex also restructures other members of the
      structure to make the structure the same on 64 bit and 32 bit
      architectures.
      Note that struct __kernel_timex is the same as struct timex
      on a 64 bit architecture.
      
      The above solution is similar to other new y2038 syscalls
      that are being introduced: both 32 bit and 64 bit ABIs
      have a common entry, and the compat entry supports the old 32 bit
      syscall interface.
      
      Alternatives considered were:
      1. Add new time type to struct timex that makes use of padded
         bits. This time type could be based on the struct __kernel_timespec.
         modes will use a flag to notify which time structure should be
         used internally.
         This needs some application level changes on both 64 bit and 32 bit
         architectures. Although 64 bit machines could continue to use the
         older timeval structure without any changes.
      
      2. Add a new u8 type to struct timex that makes use of padded bits. This
         can be used to save higher order tv_sec bits. modes will use a flag to
         notify presence of such a type.
         This will need some application level changes on 32 bit architectures.
      
      3. Add a new compat_timex structure that differs in only the size of the
         time type; keep rest of struct timex the same.
         This requires extra syscalls to manage all 3 cases on 64 bit
         architectures. This will not need any application level changes but will
         add more complexity from kernel side.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
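
The property this commit relies on, a time value whose layout is the same on 32-bit and 64-bit ABIs, can be checked at compile time. The structs below are hypothetical illustrations of the idea with fixed-width fields; they are not copies of the UAPI definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Both fields are 64 bits wide, so the layout is independent of sizeof(long). */
struct sketch_timex_timeval {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* The compiler enforces the ABI-independent size and field offset. */
_Static_assert(sizeof(struct sketch_timex_timeval) == 16,
               "timeval must be 16 bytes on every ABI");
_Static_assert(offsetof(struct sketch_timex_timeval, tv_usec) == 8,
               "tv_usec must follow tv_sec without padding");

/* Traditional layout for comparison: its size follows the width of long. */
struct sketch_old_timeval {
    long tv_sec;
    long tv_usec;
};
```

On a 64-bit build the two structs have the same size, which mirrors the statement that the new type matches struct timex there; on a 32-bit build only the fixed-width one stays at 16 bytes.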
    • time: make adjtime compat handling available for 32 bit · 4d5f007e
      Arnd Bergmann authored
      We want to reuse the compat_timex handling on 32-bit architectures the
      same way we are using the compat handling for timespec when moving to
      64-bit time_t.
      
      Move all definitions related to compat_timex out of the compat code
      into the normal timekeeping code, along with a rename to old_timex32,
      corresponding to the timespec/timeval structures, and make it controlled
      by CONFIG_COMPAT_32BIT_TIME, which 32-bit architectures will then select.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • dm: don't use bio_trim() afterall · fa8db494
      Mike Snitzer authored
      bio_trim() has an early return, which makes it _not_ idempotent, if the
      offset is 0 and the bio's bi_size already matches the requested size.
      Prior to DM, all users of bio_trim() were fine with this.  But DM has
      exposed the fact that bio_trim()'s early return is incompatible with a
      cloned bio whose integrity payload must be trimmed via
      bio_integrity_trim().
      
      Fix this by reverting DM back to doing the equivalent of bio_trim() but
      in an idempotent manner (so bio_integrity_trim is always performed).
      
      Follow-on work is needed to assess what benefit bio_trim()'s early
      return is providing to its existing callers.
      Reported-by: Milan Broz <gmazyland@gmail.com>
      Fixes: 57c36519 ("dm: fix clone_bio() to trigger blk_recount_segments()")
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: add memory barrier before waitqueue_active · 645efa84
      Mikulas Patocka authored
      Block core changes to switch bio-based IO accounting to be percpu had a
      side-effect of altering DM core to now rely on calling waitqueue_active
      (in both bio-based and request-based) to check if another task is in
      dm_wait_for_completion().
      
      A memory barrier is needed before calling waitqueue_active().  DM core
      doesn't piggyback on a preceding memory barrier so it must explicitly
      use its own.
      
      For more details on why using waitqueue_active() without a preceding
      barrier is unsafe, please see the comment before the waitqueue_active()
      definition in include/linux/wait.h.
      
      Add the missing memory barrier by switching to using wq_has_sleeper().
      
      Fixes: 6f757231 ("dm: remove the pending IO accounting")
      Fixes: c4576aed ("dm: fix request-based dm's use of dm_wait_for_completion")
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
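
The difference between the two checks can be modelled with C11 atomics. This is a user-space analogy under assumptions, not kernel code: the wait queue is reduced to a hypothetical waiter counter, and a sequentially consistent fence stands in for the kernel's full memory barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a wait queue: just a count of sleeping waiters. */
struct sketch_waitqueue {
    atomic_int nr_waiters;
};

/* Unsafe alone: nothing orders the caller's earlier stores before this load. */
static bool sketch_waitqueue_active(struct sketch_waitqueue *wq)
{
    return atomic_load_explicit(&wq->nr_waiters, memory_order_relaxed) > 0;
}

/* Safe pattern: a full barrier first, then the same check. */
static bool sketch_wq_has_sleeper(struct sketch_waitqueue *wq)
{
    atomic_thread_fence(memory_order_seq_cst);
    return sketch_waitqueue_active(wq);
}
```

A waker that updates its condition and then calls `sketch_wq_has_sleeper()` cannot have the check reordered ahead of the update, which is the guarantee the bare check lacks.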
    • svcrdma: Remove max_sge check at connect time · e248aa7b
      Chuck Lever authored
      Two and a half years ago, the client was changed to use gathered
      Send for larger inline messages, in commit 655fec69 ("xprtrdma:
      Use gathered Send for large inline messages"). Several fixes were
      required because there are a few in-kernel device drivers whose
      max_sge is 3, and these were broken by the change.
      
      Apparently my memory is going, because some time later, I submitted
      commit 25fd86ec ("svcrdma: Don't overrun the SGE array in
      svc_rdma_send_ctxt"), and after that, commit f3c1fd0e ("svcrdma:
      Reduce max_send_sges"). These too incorrectly assumed in-kernel
      device drivers would have more than a few Send SGEs available.
      
      The fix for the server side is not the same. This is because the
      fundamental problem on the server is that, whether or not the client
      has provisioned a chunk for the RPC reply, the server must squeeze
      even the most complex RPC replies into a single RDMA Send. Failing
      in the send path because of Send SGE exhaustion should never be an
      option.
      
      Therefore, instead of failing when the send path runs out of SGEs,
      switch to using a bounce buffer mechanism to handle RPC replies that
      are too complex for the device to send directly. That allows us to
      remove the max_sge check to enable drivers with small max_sge to
      work again.
      Reported-by: Don Dutile <ddutile@redhat.com>
      Fixes: 25fd86ec ("svcrdma: Don't overrun the SGE array in ...")
      Cc: stable@vger.kernel.org
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>