1. 22 Jan, 2019 9 commits
    • Christian Brauner's avatar
      binderfs: switch from d_add() to d_instantiate() · 01684db9
      Christian Brauner authored
      In a previous commit we switched from a d_alloc_name() + d_lookup()
      combination to setup a new dentry and find potential duplicates to the more
      idiomatic lookup_one_len(). As far as I understand, this also means we need
      to switch from d_add() to d_instantiate() since lookup_one_len() will
      create a new dentry when it doesn't find an existing one and add the new
      dentry to the hash queues. So we only need to call d_instantiate() to
      connect the dentry to the inode and turn it into a positive dentry.
      
      If we were to use d_add() we sure see stack traces like the following
      indicating that adding the same dentry twice over the same inode:
      
      [  744.441889] CPU: 4 PID: 2849 Comm: landscape-sysin Not tainted 5.0.0-rc1-brauner-binderfs #243
      [  744.441889] Hardware name: Dell      DCS XS24-SC2          /XS24-SC2              , BIOS S59_3C20 04/07/2011
      [  744.441889] RIP: 0010:__d_lookup_rcu+0x76/0x190
      [  744.441889] Code: 89 75 c0 49 c1 e9 20 49 89 fd 45 89 ce 41 83 e6 07 42 8d 04 f5 00 00 00 00 89 45 c8 eb 0c 48 8b 1b 48 85 db 0f 84 81 00 00 00 <44> 8b 63 fc 4c 3b 6b 10 75 ea 48 83 7b 08 00 74 e3 41 83 e4 fe 41
      [  744.441889] RSP: 0018:ffffb8c984e27ad0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
      [  744.441889] RAX: 0000000000000038 RBX: ffff9407ef770c08 RCX: ffffb8c980011000
      [  744.441889] RDX: ffffb8c984e27b54 RSI: ffffb8c984e27ce0 RDI: ffff9407e6689600
      [  744.441889] RBP: ffffb8c984e27b28 R08: ffffb8c984e27ba4 R09: 0000000000000007
      [  744.441889] R10: ffff9407e5c4f05c R11: 973f3eb9d84a94e5 R12: 0000000000000002
      [  744.441889] R13: ffff9407e6689600 R14: 0000000000000007 R15: 00000007bfef7a13
      [  744.441889] FS:  00007f0db13bb740(0000) GS:ffff9407f3b00000(0000) knlGS:0000000000000000
      [  744.441889] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  744.441889] CR2: 00007f0dacc51024 CR3: 000000032961a000 CR4: 00000000000006e0
      [  744.441889] Call Trace:
      [  744.441889]  lookup_fast+0x53/0x300
      [  744.441889]  walk_component+0x49/0x350
      [  744.441889]  ? inode_permission+0x63/0x1a0
      [  744.441889]  link_path_walk.part.33+0x1bc/0x5a0
      [  744.441889]  ? path_init+0x190/0x310
      [  744.441889]  path_lookupat+0x95/0x210
      [  744.441889]  filename_lookup+0xb6/0x190
      [  744.441889]  ? __check_object_size+0xb8/0x1b0
      [  744.441889]  ? strncpy_from_user+0x50/0x1a0
      [  744.441889]  user_path_at_empty+0x36/0x40
      [  744.441889]  ? user_path_at_empty+0x36/0x40
      [  744.441889]  vfs_statx+0x76/0xe0
      [  744.441889]  __do_sys_newstat+0x3d/0x70
      [  744.441889]  __x64_sys_newstat+0x16/0x20
      [  744.441889]  do_syscall_64+0x5a/0x120
      [  744.441889]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  744.441889] RIP: 0033:0x7f0db0ec2775
      [  744.441889] Code: 00 00 00 75 05 48 83 c4 18 c3 e8 26 55 02 00 66 0f 1f 44 00 00 83 ff 01 48 89 f0 77 30 48 89 c7 48 89 d6 b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 03 f3 c3 90 48 8b 15 e1 b6 2d 00 f7 d8 64 89
      [  744.441889] RSP: 002b:00007ffc36bc9388 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
      [  744.441889] RAX: ffffffffffffffda RBX: 00007ffc36bc9300 RCX: 00007f0db0ec2775
      [  744.441889] RDX: 00007ffc36bc9400 RSI: 00007ffc36bc9400 RDI: 00007f0dad26f050
      [  744.441889] RBP: 0000000000c0bc60 R08: 0000000000000000 R09: 0000000000000001
      [  744.441889] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc36bc9400
      [  744.441889] R13: 0000000000000001 R14: 00000000ffffff9c R15: 0000000000c0bc60
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01684db9
    • Christian Brauner's avatar
      binderfs: drop lock in binderfs_binder_ctl_create · 29ef1c8e
      Christian Brauner authored
      The binderfs_binder_ctl_create() call is a no-op on subsequent calls and
      the first call is done before we unlock the suberblock. Hence, there is no
      need to take inode_lock() in there. Let's remove it.
      Suggested-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      29ef1c8e
    • Christian Brauner's avatar
      binderfs: kill_litter_super() before cleanup · 41984795
      Christian Brauner authored
      Al pointed out that first calling kill_litter_super() before cleaning up
      info is more correct since destroying info doesn't depend on the state of
      the dentries and inodes. That the opposite remains true is not guaranteed.
      Suggested-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      41984795
    • Christian Brauner's avatar
      binderfs: rework binderfs_binder_device_create() · 01b3f1fc
      Christian Brauner authored
      - switch from d_alloc_name() + d_lookup() to lookup_one_len():
        Instead of using d_alloc_name() and then doing a d_lookup() with the
        allocated dentry to find whether a device with the name we're trying to
        create already exists switch to using lookup_one_len().  The latter will
        either return the existing dentry or a new one.
      
      - switch from kmalloc() + strscpy() to kmemdup():
        Use a more idiomatic way to copy the name for the new dentry that
        userspace gave us.
      Suggested-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01b3f1fc
    • Christian Brauner's avatar
      binderfs: rework binderfs_fill_super() · 36975fc3
      Christian Brauner authored
      Al pointed out that on binderfs_fill_super() error
      deactivate_locked_super() will call binderfs_kill_super() so all of the
      freeing and putting we currently do in binderfs_fill_super() is unnecessary
      and buggy. Let's simply return errors and let binderfs_fill_super() take
      care of cleaning up on error.
      Suggested-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36975fc3
    • Christian Brauner's avatar
      binderfs: prevent renaming the control dentry · e98e6fa1
      Christian Brauner authored
      - make binderfs control dentry immutable:
        We don't allow to unlink it since it is crucial for binderfs to be
        useable but if we allow to rename it we make the unlink trivial to
        bypass. So prevent renaming too and simply treat the control dentry as
        immutable.
      
      - add is_binderfs_control_device() helper:
        Take the opportunity and turn the check for the control dentry into a
        separate helper is_binderfs_control_device() since it's now used in two
        places.
      
      - simplify binderfs_rename():
        Instead of hand-rolling our custom version of simple_rename() just dumb
        the whole function down to first check whether we're trying to rename the
        control dentry. If we do EPERM the caller and if not call simple_rename().
      Suggested-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e98e6fa1
    • Christian Brauner's avatar
      binderfs: remove outdated comment · 7c4d08fc
      Christian Brauner authored
      The comment stems from an early version of that patchset and is just
      confusing now.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c4d08fc
    • Christian Brauner's avatar
      binderfs: use __u32 for device numbers · 7d017406
      Christian Brauner authored
      We allow more then 255 binderfs binder devices to be created since there
      are workloads that require more than that. If we use __u8 we'll overflow
      after 255. So let's use a __u32.
      Note that there's no released kernel with binderfs out there so this is
      not a regression.
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d017406
    • Christian Brauner's avatar
      binderfs: use correct include guards in header · 6fc23b6e
      Christian Brauner authored
      When we switched over from binder_ctl.h to binderfs.h we forgot to change
      the include guards. It's minor but it's obviously correct.
      Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6fc23b6e
  2. 18 Jan, 2019 4 commits
  3. 13 Jan, 2019 3 commits
  4. 12 Jan, 2019 1 commit
  5. 11 Jan, 2019 4 commits
  6. 10 Jan, 2019 1 commit
    • Dexuan Cui's avatar
      vmbus: fix subchannel removal · b5679ceb
      Dexuan Cui authored
      The changes to split ring allocation from open/close, broke
      the cleanup of subchannels. This resulted in problems using
      uio on network devices because the subchannel was left behind
      when the network device was unbound.
      
      The cause was in the disconnect logic which used list splice
      to move the subchannel list into a local variable. This won't
      work because the subchannel list is needed later during the
      process of the rescind messages (relid2channel).
      
      The fix is to just leave the subchannel list in place
      which is what the original code did. The list is cleaned
      up later when the host rescind is processed.
      
      Without the fix, we have a lot of "hang" issues in netvsc when we
      try to change the NIC's MTU, set the number of channels, etc.
      
      Fixes: ae6935ed ("vmbus: split ring buffer allocation from open")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarStephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: default avatarDexuan Cui <decui@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b5679ceb
  7. 09 Jan, 2019 2 commits
    • Vitaly Kuznetsov's avatar
      hv_balloon: avoid touching uninitialized struct page during tail onlining · da8ced36
      Vitaly Kuznetsov authored
      Hyper-V memory hotplug protocol has 2M granularity and in Linux x86 we use
      128M. To deal with it we implement partial section onlining by registering
      custom page onlining callback (hv_online_page()). Later, when more memory
      arrives we try to online the 'tail' (see hv_bring_pgs_online()).
      
      It was found that in some cases this 'tail' onlining causes issues:
      
       BUG: Bad page state in process kworker/0:2  pfn:109e3a
       page:ffffe08344278e80 count:0 mapcount:1 mapping:0000000000000000 index:0x0
       flags: 0xfffff80000000()
       raw: 000fffff80000000 dead000000000100 dead000000000200 0000000000000000
       raw: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
       page dumped because: nonzero mapcount
       ...
       Workqueue: events hot_add_req [hv_balloon]
       Call Trace:
        dump_stack+0x5c/0x80
        bad_page.cold.112+0x7f/0xb2
        free_pcppages_bulk+0x4b8/0x690
        free_unref_page+0x54/0x70
        hv_page_online_one+0x5c/0x80 [hv_balloon]
        hot_add_req.cold.24+0x182/0x835 [hv_balloon]
        ...
      
      Turns out that we now have deferred struct page initialization for memory
      hotplug so e.g. memory_block_action() in drivers/base/memory.c does
      pages_correctly_probed() check and in that check it avoids inspecting
      struct pages and checks sections instead. But in Hyper-V balloon driver we
      do PageReserved(pfn_to_page()) check and this is now wrong.
      
      Switch to checking online_section_nr() instead.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      da8ced36
    • Dexuan Cui's avatar
      Drivers: hv: vmbus: Check for ring when getting debug info · ba50bf1c
      Dexuan Cui authored
      fc96df16 is good and can already fix the "return stack garbage" issue,
      but let's also improve hv_ringbuffer_get_debuginfo(), which would silently
      return stack garbage, if people forget to check channel->state or
      ring_info->ring_buffer, when using the function in the future.
      
      Having an error check in the function would eliminate the potential risk.
      
      Add a Fixes tag to indicate the patch depdendency.
      
      Fixes: fc96df16 ("Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels")
      Cc: stable@vger.kernel.org
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: default avatarStephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: default avatarDexuan Cui <decui@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ba50bf1c
  8. 08 Jan, 2019 2 commits
    • Christian Brauner's avatar
      binderfs: make each binderfs mount a new instance · b6c770d7
      Christian Brauner authored
      When currently mounting binderfs in the same ipc namespace twice:
      
      mount -t binder binder /A
      mount -t binder binder /B
      
      then the binderfs instances mounted on /A and /B will be the same, i.e.
      they will have the same superblock. This was the first approach that seemed
      reasonable. However, this leads to some problems and inconsistencies:
      
      /* private binderfs instance in same ipc namespace */
      There is no way for a user to request a private binderfs instance in the
      same ipc namespace.
      This request has been made in a private mail to me by two independent
      people.
      
      /* bind-mounts */
      If users want the same binderfs instance to appear in multiple places they
      can use bind mounts. So there is no value in having a request for a new
      binderfs mount giving them the same instance.
      
      /* unexpected behavior */
      It's surprising that request to mount binderfs is not giving the user a new
      instance like tmpfs, devpts, ramfs, and others do.
      
      /* past mistakes */
      Other pseudo-filesystems once made the same mistakes of giving back the
      same superblock when actually requesting a new mount (cf. devpts's
      deprecated "newinstance" option).
      We should not make the same mistake. Once we've committed to always giving
      back the same superblock in the same IPC namespace with the next kernel
      release we will not be able to make that change so better to do it now.
      
      /* kdbusfs */
      It was pointed out to me that kdbusfs - which is conceptually closely
      related to binderfs - also allowed users to get a private kdbusfs instance
      in the same IPC namespace by making each mount of kdbusfs a separate
      instance. I think that makes a lot of sense.
      Signed-off-by: default avatarChristian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b6c770d7
    • Christian Brauner's avatar
      binderfs: remove wrong kern_mount() call · 3fdd94ac
      Christian Brauner authored
      The binderfs filesystem never needs to be mounted by the kernel itself.
      This is conceptually wrong and should never have been done in the first
      place.
      
      Fixes: 3ad20fe3 ("binder: implement binderfs")
      Signed-off-by: default avatarChristian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3fdd94ac
  9. 07 Jan, 2019 3 commits
    • Linus Torvalds's avatar
      Linux 5.0-rc1 · bfeffd15
      Linus Torvalds authored
      bfeffd15
    • Linus Torvalds's avatar
      Merge tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 85e1ffbd
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - improve boolinit.cocci and use_after_iter.cocci semantic patches
      
       - fix alignment for kallsyms
      
       - move 'asm goto' compiler test to Kconfig and clean up jump_label
         CONFIG option
      
       - generate asm-generic wrappers automatically if arch does not
         implement mandatory UAPI headers
      
       - remove redundant generic-y defines
      
       - misc cleanups
      
      * tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kconfig: rename generated .*conf-cfg to *conf-cfg
        kbuild: remove unnecessary stubs for archheader and archscripts
        kbuild: use assignment instead of define ... endef for filechk_* rules
        arch: remove redundant UAPI generic-y defines
        kbuild: generate asm-generic wrappers if mandatory headers are missing
        arch: remove stale comments "UAPI Header export list"
        riscv: remove redundant kernel-space generic-y
        kbuild: change filechk to surround the given command with { }
        kbuild: remove redundant target cleaning on failure
        kbuild: clean up rule_dtc_dt_yaml
        kbuild: remove UIMAGE_IN and UIMAGE_OUT
        jump_label: move 'asm goto' support test to Kconfig
        kallsyms: lower alignment on ARM
        scripts: coccinelle: boolinit: drop warnings on named constants
        scripts: coccinelle: check for redeclaration
        kconfig: remove unused "file" field of yylval union
        nds32: remove redundant kernel-space generic-y
        nios2: remove unneeded HAS_DMA define
      85e1ffbd
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ac5eed2b
      Linus Torvalds authored
      Pull perf tooling updates form Ingo Molnar:
       "A final batch of perf tooling changes: mostly fixes and small
        improvements"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
        perf session: Add comment for perf_session__register_idle_thread()
        perf thread-stack: Fix thread stack processing for the idle task
        perf thread-stack: Allocate an array of thread stacks
        perf thread-stack: Factor out thread_stack__init()
        perf thread-stack: Allow for a thread stack array
        perf thread-stack: Avoid direct reference to the thread's stack
        perf thread-stack: Tidy thread_stack__bottom() usage
        perf thread-stack: Simplify some code in thread_stack__process()
        tools gpio: Allow overriding CFLAGS
        tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command
        tools thermal tmon: Allow overriding CFLAGS assignments
        tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command
        perf c2c: Increase the HITM ratio limit for displayed cachelines
        perf c2c: Change the default coalesce setup
        perf trace beauty ioctl: Beautify USBDEVFS_ commands
        perf trace beauty: Export function to get the files for a thread
        perf trace: Wire up ioctl's USBDEBFS_ cmd table generator
        perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
        tools headers uapi: Grab a copy of usbdevice_fs.h
        perf trace: Store the major number for a file when storing its pathname
        ...
      ac5eed2b
  10. 06 Jan, 2019 11 commits
    • Linus Torvalds's avatar
      Change mincore() to count "mapped" pages rather than "cached" pages · 574823bf
      Linus Torvalds authored
      The semantics of what "in core" means for the mincore() system call are
      somewhat unclear, but Linux has always (since 2.3.52, which is when
      mincore() was initially done) treated it as "page is available in page
      cache" rather than "page is mapped in the mapping".
      
      The problem with that traditional semantic is that it exposes a lot of
      system cache state that it really probably shouldn't, and that users
      shouldn't really even care about.
      
      So let's try to avoid that information leak by simply changing the
      semantics to be that mincore() counts actual mapped pages, not pages
      that might be cheaply mapped if they were faulted (note the "might be"
      part of the old semantics: being in the cache doesn't actually guarantee
      that you can access them without IO anyway, since things like network
      filesystems may have to revalidate the cache before use).
      
      In many ways the old semantics were somewhat insane even aside from the
      information leak issue.  From the very beginning (and that beginning is
      a long time ago: 2.3.52 was released in March 2000, I think), the code
      had a comment saying
      
        Later we can get more picky about what "in core" means precisely.
      
      and this is that "later".  Admittedly it is much later than is really
      comfortable.
      
      NOTE! This is a real semantic change, and it is for example known to
      change the output of "fincore", since that program literally does a
      mmmap without populating it, and then doing "mincore()" on that mapping
      that doesn't actually have any pages in it.
      
      I'm hoping that nobody actually has any workflow that cares, and the
      info leak is real.
      
      We may have to do something different if it turns out that people have
      valid reasons to want the old semantics, and if we can limit the
      information leak sanely.
      
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Masatake YAMATO <yamato@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      574823bf
    • Linus Torvalds's avatar
      Fix 'acccess_ok()' on alpha and SH · 94bd8a05
      Linus Torvalds authored
      Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'")
      broke both alpha and SH booting in qemu, as noticed by Guenter Roeck.
      
      It turns out that the bug wasn't actually in that commit itself (which
      would have been surprising: it was mostly a no-op), but in how the
      addition of access_ok() to the strncpy_from_user() and strnlen_user()
      functions now triggered the case where those functions would test the
      access of the very last byte of the user address space.
      
      The string functions actually did that user range test before too, but
      they did it manually by just comparing against user_addr_max().  But
      with user_access_begin() doing the check (using "access_ok()"), it now
      exposed problems in the architecture implementations of that function.
      
      For example, on alpha, the access_ok() helper macro looked like this:
      
        #define __access_ok(addr, size) \
              ((get_fs().seg & (addr | size | (addr+size))) == 0)
      
      and what it basically tests is of any of the high bits get set (the
      USER_DS masking value is 0xfffffc0000000000).
      
      And that's completely wrong for the "addr+size" check.  Because it's
      off-by-one for the case where we check to the very end of the user
      address space, which is exactly what the strn*_user() functions do.
      
      Why? Because "addr+size" will be exactly the size of the address space,
      so trying to access the last byte of the user address space will fail
      the __access_ok() check, even though it shouldn't.  As a result, the
      user string accessor functions failed consistently - because they
      literally don't know how long the string is going to be, and the max
      access is going to be that last byte of the user address space.
      
      Side note: that alpha macro is buggy for another reason too - it re-uses
      the arguments twice.
      
      And SH has another version of almost the exact same bug:
      
        #define __addr_ok(addr) \
              ((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
      
      so far so good: yes, a user address must be below the limit.  But then:
      
        #define __access_ok(addr, size)         \
              (__addr_ok((addr) + (size)))
      
      is wrong with the exact same off-by-one case: the case when "addr+size"
      is exactly _equal_ to the limit is actually perfectly fine (think "one
      byte access at the last address of the user address space")
      
      The SH version is actually seriously buggy in another way: it doesn't
      actually check for overflow, even though it did copy the _comment_ that
      talks about overflow.
      
      So it turns out that both SH and alpha actually have completely buggy
      implementations of access_ok(), but they happened to work in practice
      (although the SH overflow one is a serious serious security bug, not
      that anybody likely cares about SH security).
      
      This fixes the problems by using a similar macro on both alpha and SH.
      It isn't trying to be clever, the end address is based on this logic:
      
              unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;
      
      which basically says "add start and length, and then subtract one unless
      the length was zero".  We can't subtract one for a zero length, or we'd
      just hit an underflow instead.
      
      For a lot of access_ok() users the length is a constant, so this isn't
      actually as expensive as it initially looks.
      Reported-and-tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      94bd8a05
    • Linus Torvalds's avatar
      Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt · baa67073
      Linus Torvalds authored
      Pull fscrypt updates from Ted Ts'o:
       "Add Adiantum support for fscrypt"
      
      * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
        fscrypt: add Adiantum support
      baa67073
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 21524046
      Linus Torvalds authored
      Pull ext4 bug fixes from Ted Ts'o:
       "Fix a number of ext4 bugs"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix special inode number checks in __ext4_iget()
        ext4: track writeback errors using the generic tracking infrastructure
        ext4: use ext4_write_inode() when fsyncing w/o a journal
        ext4: avoid kernel warning when writing the superblock to a dead device
        ext4: fix a potential fiemap/page fault deadlock w/ inline_data
        ext4: make sure enough credits are reserved for dioread_nolock writes
      21524046
    • Linus Torvalds's avatar
      Merge tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mapping · e2b745f4
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
       "Fix various regressions introduced in this cycles:
      
         - fix dma-debug tracking for the map_page / map_single
           consolidatation
      
         - properly stub out DMA mapping symbols for !HAS_DMA builds to avoid
           link failures
      
         - fix AMD Gart direct mappings
      
         - setup the dma address for no kernel mappings using the remap
           allocator"
      
      * tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mapping:
        dma-direct: fix DMA_ATTR_NO_KERNEL_MAPPING for remapped allocations
        x86/amd_gart: fix unmapping of non-GART mappings
        dma-mapping: remove a few unused exports
        dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA
        dma-mapping: remove dmam_{declare,release}_coherent_memory
        dma-mapping: implement dmam_alloc_coherent using dmam_alloc_attrs
        dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs
      e2b745f4
    • Linus Torvalds's avatar
      Merge tag 'tag-chrome-platform-for-v4.21' of... · 12133258
      Linus Torvalds authored
      Merge tag 'tag-chrome-platform-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
      
      Pull chrome platform updates from Benson Leung:
      
       - Changes for EC_MKBP_EVENT_SENSOR_FIFO handling.
      
       - Also, maintainership changes. Olofj out, Enric balletbo in.
      
      * tag 'tag-chrome-platform-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
        MAINTAINERS: add maintainers for ChromeOS EC sub-drivers
        MAINTAINERS: platform/chrome: Add Enric as a maintainer
        MAINTAINERS: platform/chrome: remove myself as maintainer
        platform/chrome: don't report EC_MKBP_EVENT_SENSOR_FIFO as wakeup
        platform/chrome: straighten out cros_ec_get_{next,host}_event() error codes
      12133258
    • Linus Torvalds's avatar
      Merge tag 'hwlock-v4.21' of git://github.com/andersson/remoteproc · 66e012f6
      Linus Torvalds authored
      Pull hwspinlock updates from Bjorn Andersson:
       "This adds support for the hardware semaphores found in STM32MP1"
      
      * tag 'hwlock-v4.21' of git://github.com/andersson/remoteproc:
        hwspinlock: fix return value check in stm32_hwspinlock_probe()
        hwspinlock: add STM32 hwspinlock device
        dt-bindings: hwlock: Document STM32 hwspinlock bindings
      66e012f6
    • Eric Biggers's avatar
      fscrypt: add Adiantum support · 8094c3ce
      Eric Biggers authored
      Add support for the Adiantum encryption mode to fscrypt.  Adiantum is a
      tweakable, length-preserving encryption mode with security provably
      reducible to that of XChaCha12 and AES-256, subject to a security bound.
      It's also a true wide-block mode, unlike XTS.  See the paper
      "Adiantum: length-preserving encryption for entry-level processors"
      (https://eprint.iacr.org/2018/720.pdf) for more details.  Also see
      commit 059c2a4d ("crypto: adiantum - add Adiantum support").
      
      On sufficiently long messages, Adiantum's bottlenecks are XChaCha12 and
      the NH hash function.  These algorithms are fast even on processors
      without dedicated crypto instructions.  Adiantum makes it feasible to
      enable storage encryption on low-end mobile devices that lack AES
      instructions; currently such devices are unencrypted.  On ARM Cortex-A7,
      on 4096-byte messages Adiantum encryption is about 4 times faster than
      AES-256-XTS encryption; decryption is about 5 times faster.
      
      In fscrypt, Adiantum is suitable for encrypting both file contents and
      names.  With filenames, it fixes a known weakness: when two filenames in
      a directory share a common prefix of >= 16 bytes, with CTS-CBC their
      encrypted filenames share a common prefix too, leaking information.
      Adiantum does not have this problem.
      
      Since Adiantum also accepts long tweaks (IVs), it's also safe to use the
      master key directly for Adiantum encryption rather than deriving
      per-file keys, provided that the per-file nonce is included in the IVs
      and the master key isn't used for any other encryption mode.  This
      configuration saves memory and improves performance.  A new fscrypt
      policy flag is added to allow users to opt-in to this configuration.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      8094c3ce
    • Linus Torvalds's avatar
      Merge tag 'docs-5.0-fixes' of git://git.lwn.net/linux · b5aef86e
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A handful of late-arriving documentation fixes"
      
      * tag 'docs-5.0-fixes' of git://git.lwn.net/linux:
        doc: filesystems: fix bad references to nonexistent ext4.rst file
        Documentation/admin-guide: update URL of LKML information link
        Docs/kernel-api.rst: Remove blk-tag.c reference
      b5aef86e
    • Linus Torvalds's avatar
      Merge tag 'firewire-update' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 · 15b215e5
      Linus Torvalds authored
      Pull firewire fixlet from Stefan Richter:
       "Remove an explicit dependency in Kconfig which is implied by another
        dependency"
      
      * tag 'firewire-update' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
        firewire: Remove depends on HAS_DMA in case of platform dependency
      15b215e5
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20190104' of git://git.kernel.dk/linux-block · d7252d0d
      Linus Torvalds authored
      Pull block updates and fixes from Jens Axboe:
      
       - Pulled in MD changes that Shaohua had queued up for 4.21.
      
         Unfortunately we lost Shaohua late 2018, I'm sending these in on his
         behalf.
      
       - In conjunction with the above, I added a CREDITS entry for Shaoua.
      
       - sunvdc queue restart fix (Ming)
      
      * tag 'for-linus-20190104' of git://git.kernel.dk/linux-block:
        Add CREDITS entry for Shaohua Li
        block: sunvdc: don't run hw queue synchronously from irq context
        md: fix raid10 hang issue caused by barrier
        raid10: refactor common wait code from regular read/write request
        md: remvoe redundant condition check
        lib/raid6: add option to skip algo benchmarking
        lib/raid6: sort algos in rough performance order
        lib/raid6: check for assembler SSSE3 support
        lib/raid6: avoid __attribute_const__ redefinition
        lib/raid6: add missing include for raid6test
        md: remove set but not used variable 'bi_rdev'
      d7252d0d