1. 30 Nov, 2019 1 commit
  2. 21 Nov, 2019 1 commit
  3. 19 Nov, 2019 1 commit
  4. 12 Nov, 2019 4 commits
  5. 11 Nov, 2019 1 commit
    • Al Viro's avatar
      race in exportfs_decode_fh() · 581ae686
      Al Viro authored
      On Sat, Nov 02, 2019 at 06:08:42PM +0000, Al Viro wrote:
      
      > It is converging to a reasonably small and understandable surface, actually,
      > most of that being in core pathname resolution.  Two big piles of nightmares
      > left to review - overlayfs and (somewhat surprisingly) setxattr call chains,
      > the latter due to IMA/EVM/LSM insanity...
      
      Oh, lovely - in exportfs_decode_fh() we have this:
                      err = exportfs_get_name(mnt, target_dir, nbuf, result);
                      if (!err) {
                              inode_lock(target_dir->d_inode);
                              nresult = lookup_one_len(nbuf, target_dir,
                                                       strlen(nbuf));
                              inode_unlock(target_dir->d_inode);
                              if (!IS_ERR(nresult)) {
                                      if (nresult->d_inode) {
                                              dput(result);
                                              result = nresult;
                                      } else
                                              dput(nresult);
                              }
                      }
      We have derived the parent from fhandle, we have a disconnected dentry for child,
      we go look for the name.  We even find it.  Now, we want to look it up.  And
      some bastard goes and unlinks it, just as we are trying to lock the parent.
      We do a lookup, and get a negative dentry.  Then we unlock the parent... and
      some other bastard does e.g. mkdir with the same name.  OK, nresult->d_inode
      is not NULL (anymore).  It has fuck-all to do with the original fhandle
      (different inumber, etc.) but we happily accept it.
      
      Even better, we have no barriers between our check and nresult becoming positive.
      IOW, having observed non-NULL ->d_inode doesn't give us enough - e.g. we might
      still see the old ->d_flags value, from back when ->d_inode used to be NULL.
      On something like alpha we also have no promises that we'll observe anything
      about the fields of nresult->d_inode, but ->d_flags alone is enough for fun.
      The callers can't e.g. expect d_is_reg() et.al. to match the reality.
      
      This is obviously bogus.  And the fix is obvious: check that nresult->d_inode is
      equal to result->d_inode before unlocking the parent.  Note that we'd *already* had
      the original result and all of its aliases rejected by the 'acceptable' predicate,
      so if nresult doesn't supply us a better alias, we are SOL.
      
      Does anyone see objections to the following patch?  Christoph, that seems to
      be your code; am I missing something subtle here?  AFAICS, that goes back to
      2007 or so...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      581ae686
  6. 08 Nov, 2019 5 commits
  7. 30 Oct, 2019 2 commits
    • Chuck Lever's avatar
      SUNRPC: Fix svcauth_gss_proxy_init() · 5866efa8
      Chuck Lever authored
      gss_read_proxy_verf() assumes things about the XDR buffer containing
      the RPC Call that are not true for buffers generated by
      svc_rdma_recv().
      
      RDMA's buffers look more like what the upper layer generates for
      sending: head is a kmalloc'd buffer; it does not point to a page
      whose contents are contiguous with the first page in the buffers'
      page array. The result is that ACCEPT_SEC_CONTEXT via RPC/RDMA has
      stopped working on Linux NFS servers that use gssproxy.
      
      This does not affect clients that use only TCP to send their
      ACCEPT_SEC_CONTEXT operation (that's all Linux clients). Other
      clients, like Solaris NFS clients, send ACCEPT_SEC_CONTEXT on the
      same transport as they send all other NFS operations. Such clients
      can send ACCEPT_SEC_CONTEXT via RPC/RDMA.
      
      I thought I had found every direct reference in the server RPC code
      to the rqstp->rq_pages field.
      
      Bug found at the 2019 Westford NFS bake-a-thon.
      
      Fixes: 3316f063 ("svcrdma: Persistently allocate and DMA- ... ")
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Tested-by: default avatarBill Baker <bill.baker@oracle.com>
      Reviewed-by: default avatarSimo Sorce <simo@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      5866efa8
    • Chuck Lever's avatar
      SUNRPC: Trace gssproxy upcall results · ff27e9f7
      Chuck Lever authored
      Record results of a GSS proxy ACCEPT_SEC_CONTEXT upcall and the
      svc_authenticate() function to make field debugging of NFS server
      Kerberos issues easier.
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Reviewed-by: default avatarBill Baker <bill.baker@oracle.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      ff27e9f7
  8. 11 Oct, 2019 2 commits
    • Pavel Tikhomirov's avatar
      sunrpc: fix crash when cache_head become valid before update · 5fcaf698
      Pavel Tikhomirov authored
      I was investigating a crash in our Virtuozzo7 kernel which happened in
      in svcauth_unix_set_client. I found out that we access m_client field
      in ip_map structure, which was received from sunrpc_cache_lookup (we
      have a bit older kernel, now the code is in sunrpc_cache_add_entry), and
      these field looks uninitialized (m_client == 0x74 don't look like a
      pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID.
      
      It looks like the problem appeared from our previous fix to sunrpc (1):
      commit 4ecd55ea ("sunrpc: fix cache_head leak due to queued
      request")
      
      And we've also found a patch already fixing our patch (2):
      commit d58431ea ("sunrpc: don't mark uninitialised items as VALID.")
      
      Though the crash is eliminated, I think the core of the problem is not
      completely fixed:
      
      Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before
      cache_fresh_locked which was added in (1) to fix crash. These way
      cache_is_valid won't say the cache is valid anymore and in
      svcauth_unix_set_client the function cache_check will return error
      instead of 0, and we don't count entry as initialized.
      
      But it looks like we need to remove cache_fresh_locked completely in
      sunrpc_cache_lookup:
      
      In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so
      that cache_requests with no readers also release corresponding
      cache_head, to fix their leak.  We with Vasily were not sure if
      cache_fresh_locked and cache_fresh_unlocked should be used in pair or
      not, so we've guessed to use them in pair.
      
      Now we see that we don't want the CACHE_VALID bit set here by
      cache_fresh_locked, as "valid" means "initialized" and there is no
      initialization in sunrpc_cache_add_entry. Both expiry_time and
      last_refresh are not used in cache_fresh_unlocked code-path and also not
      required for the initial fix.
      
      So to conclude cache_fresh_locked was called by mistake, and we can just
      safely remove it instead of crutching it with CACHE_NEGATIVE. It looks
      ideologically better for me. Hope I don't miss something here.
      
      Here is our crash backtrace:
      [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 0000000000000074
      [13108726.326365] IP: [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.326448] PGD 0
      [13108726.326468] Oops: 0002 [#1] SMP
      [13108726.326497] Modules linked in: nbd isofs xfs loop kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4
      [13108726.327173]  dns_resolver cls_u32 binfmt_misc arptable_filter arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet ip6_vznetstat ip_vznetstat
      [13108726.327817]  ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kpatch_cumulative_82_0_r1]
      [13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded Tainted: G        W  O   ------------   3.10.0-862.20.2.vz7.73.29 #1 73.29
      [13108726.328491] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/02/2018
      [13108726.328554] task: ffffa0a6a41b1160 ti: ffffa0c2a74bc000 task.ti: ffffa0c2a74bc000
      [13108726.328610] RIP: 0010:[<ffffffffc01f79eb>]  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.328706] RSP: 0018:ffffa0c2a74bfd80  EFLAGS: 00010246
      [13108726.328750] RAX: 0000000000000001 RBX: ffffa0a6183ae000 RCX: 0000000000000000
      [13108726.328811] RDX: 0000000000000074 RSI: 0000000000000286 RDI: ffffa0c2a74bfcf0
      [13108726.328864] RBP: ffffa0c2a74bfe00 R08: ffffa0bab8c22960 R09: 0000000000000001
      [13108726.328916] R10: 0000000000000001 R11: 0000000000000001 R12: ffffa0a32aa7f000
      [13108726.328969] R13: ffffa0a6183afac0 R14: ffffa0c233d88d00 R15: ffffa0c2a74bfdb4
      [13108726.329022] FS:  0000000000000000(0000) GS:ffffa0e17f9c0000(0000) knlGS:0000000000000000
      [13108726.329081] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13108726.332311] CR2: 0000000000000074 CR3: 00000026a1b28000 CR4: 00000000007607e0
      [13108726.334606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [13108726.336754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [13108726.338908] PKRU: 00000000
      [13108726.341047] Call Trace:
      [13108726.343074]  [<ffffffff8a2c78b4>] ? groups_alloc+0x34/0x110
      [13108726.344837]  [<ffffffffc01f5eb4>] svc_set_client+0x24/0x30 [sunrpc]
      [13108726.346631]  [<ffffffffc01f2ac1>] svc_process_common+0x241/0x710 [sunrpc]
      [13108726.348332]  [<ffffffffc01f3093>] svc_process+0x103/0x190 [sunrpc]
      [13108726.350016]  [<ffffffffc07d605f>] nfsd+0xdf/0x150 [nfsd]
      [13108726.351735]  [<ffffffffc07d5f80>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [13108726.353459]  [<ffffffff8a2bf741>] kthread+0xd1/0xe0
      [13108726.355195]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
      [13108726.356896]  [<ffffffff8a9556dd>] ret_from_fork_nospec_begin+0x7/0x21
      [13108726.358577]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
      [13108726.360240] Code: 4c 8b 45 98 0f 8e 2e 01 00 00 83 f8 fe 0f 84 76 fe ff ff 85 c0 0f 85 2b 01 00 00 49 8b 50 40 b8 01 00 00 00 48 89 93 d0 1a 00 00 <f0> 0f c1 02 83 c0 01 83 f8 01 0f 8e 53 02 00 00 49 8b 44 24 38
      [13108726.363769] RIP  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.365530]  RSP <ffffa0c2a74bfd80>
      [13108726.367179] CR2: 0000000000000074
      
      Fixes: d58431ea ("sunrpc: don't mark uninitialised items as VALID.")
      Signed-off-by: default avatarPavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Acked-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      5fcaf698
    • Andy Shevchenko's avatar
      nfsd: remove private bin2hex implementation · 12b4157b
      Andy Shevchenko authored
      Calling sprintf in a loop is not very efficient, and in any case,
      we already have an implementation of bin-to-hex conversion in lib/
      which we might as well use.
      
      Note that original code used to nul-terminate the destination while
      bin2hex doesn't. That's why replace kmalloc() with kzalloc().
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      12b4157b
  9. 09 Oct, 2019 1 commit
    • Scott Mayhew's avatar
      nfsd4: fix up replay_matches_cache() · 6e73e92b
      Scott Mayhew authored
      When running an nfs stress test, I see quite a few cached replies that
      don't match up with the actual request.  The first comment in
      replay_matches_cache() makes sense, but the code doesn't seem to
      match... fix it.
      
      This isn't exactly a bugfix, as the server isn't required to catch every
      case of a false retry.  So, we may as well do this, but if this is
      fixing a problem then that suggests there's a client bug.
      
      Fixes: 53da6a53 ("nfsd4: catch some false session retries")
      Signed-off-by: default avatarScott Mayhew <smayhew@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      6e73e92b
  10. 08 Oct, 2019 3 commits
  11. 06 Oct, 2019 4 commits
    • Linus Torvalds's avatar
      Linux 5.4-rc2 · da0c9ea1
      Linus Torvalds authored
      da0c9ea1
    • Linus Torvalds's avatar
      elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings · b212921b
      Linus Torvalds authored
      In commit 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") we
      changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the
      executable mappings.
      
      Then, people reported that it broke some binaries that had overlapping
      segments from the same file, and commit ad55eac7 ("elf: enforce
      MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some
      overlaying elf segment cases.  But only some - despite the summary line
      of that commit, it only did it when it also does a temporary brk vma for
      one obvious overlapping case.
      
      Now Russell King reports another overlapping case with old 32-bit x86
      binaries, which doesn't trigger that limited case.  End result: we had
      better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED.
      
      Yes, it's a sign of old binaries generated with old tool-chains, but we
      do pride ourselves on not breaking existing setups.
      
      This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp()
      and the old load_elf_library() use-cases, because nobody has reported
      breakage for those. Yet.
      
      Note that in all the cases seen so far, the overlapping elf sections
      seem to be just re-mapping of the same executable with different section
      attributes.  We could possibly introduce a new MAP_FIXED_NOFILECHANGE
      flag or similar, which acts like NOREPLACE, but allows just remapping
      the same executable file using different protection flags.
      
      It's not clear that would make a huge difference to anything, but if
      people really hate that "elf remaps over previous maps" behavior, maybe
      at least a more limited form of remapping would alleviate some concerns.
      
      Alternatively, we should take a look at our elf_map() logic to see if we
      end up not mapping things properly the first time.
      
      In the meantime, this is the minimal "don't do that then" patch while
      people hopefully think about it more.
      Reported-by: default avatarRussell King <linux@armlinux.org.uk>
      Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map")
      Fixes: ad55eac7 ("elf: enforce  MAP_FIXED on overlaying elf segments")
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b212921b
    • Linus Torvalds's avatar
      Merge tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping · 7cdb85df
      Linus Torvalds authored
      Pull dma-mapping regression fix from Christoph Hellwig:
       "Revert an incorret hunk from a patch that caused problems on various
        arm boards (Andrey Smirnov)"
      
      * tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: fix false positive warnings in dma_common_free_remap()
      7cdb85df
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 43b815c6
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A few fixes this time around:
      
         - Fixup of some clock specifications for DRA7 (device-tree fix)
      
         - Removal of some dead/legacy CPU OPP/PM code for OMAP that throws
           warnings at boot
      
         - A few more minor fixups for OMAPs, most around display
      
         - Enable STM32 QSPI as =y since their rootfs sometimes comes from
           there
      
         - Switch CONFIG_REMOTEPROC to =y since it went from tristate to bool
      
         - Fix of thermal zone definition for ux500 (5.4 regression)"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        ARM: multi_v7_defconfig: Fix SPI_STM32_QSPI support
        ARM: dts: ux500: Fix up the CPU thermal zone
        arm64/ARM: configs: Change CONFIG_REMOTEPROC from m to y
        ARM: dts: am4372: Set memory bandwidth limit for DISPC
        ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage()
        ARM: OMAP2+: Add missing LCDC midlemode for am335x
        ARM: OMAP2+: Fix missing reset done flag for am3 and am43
        ARM: dts: Fix gpio0 flags for am335x-icev2
        ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules
        ARM: omap2plus_defconfig: Enable DRM_TI_TFP410
        DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again
        ARM: dts: Fix wrong clocks for dra7 mcasp
        clk: ti: dra7: Fix mcasp8 clock bits
      43b815c6
  12. 05 Oct, 2019 15 commits
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v5.4' of... · 2d00aee2
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - remove unneeded ar-option and KBUILD_ARFLAGS
      
       - remove long-deprecated SUBDIRS
      
       - fix modpost to suppress false-positive warnings for UML builds
      
       - fix namespace.pl to handle relative paths to ${objtree}, ${srctree}
      
       - make setlocalversion work for /bin/sh
      
       - make header archive reproducible
      
       - fix some Makefiles and documents
      
      * tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kheaders: make headers archive reproducible
        kbuild: update compile-test header list for v5.4-rc2
        kbuild: two minor updates for Documentation/kbuild/modules.rst
        scripts/setlocalversion: clear local variable to make it work for sh
        namespace: fix namespace.pl script to support relative paths
        video/logo: do not generate unneeded logo C files
        video/logo: remove unneeded *.o pattern from clean-files
        integrity: remove pointless subdir-$(CONFIG_...)
        integrity: remove unneeded, broken attempt to add -fshort-wchar
        modpost: fix static EXPORT_SYMBOL warnings for UML build
        kbuild: correct formatting of header in kbuild module docs
        kbuild: remove SUBDIRS support
        kbuild: remove ar-option and KBUILD_ARFLAGS
      2d00aee2
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 126195c9
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Twelve patches mostly small but obvious fixes or cosmetic but small
        updates"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla2xxx: Fix Nport ID display value
        scsi: qla2xxx: Fix N2N link up fail
        scsi: qla2xxx: Fix N2N link reset
        scsi: qla2xxx: Optimize NPIV tear down process
        scsi: qla2xxx: Fix stale mem access on driver unload
        scsi: qla2xxx: Fix unbound sleep in fcport delete path.
        scsi: qla2xxx: Silence fwdump template message
        scsi: hisi_sas: Make three functions static
        scsi: megaraid: disable device when probe failed after enabled device
        scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue
        scsi: qedf: Remove always false 'tmp_prio < 0' statement
        scsi: ufs: skip shutdown if hba is not powered
        scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF
      126195c9
    • Linus Torvalds's avatar
      Merge branch 'readdir' (readdir speedup and sanity checking) · 4f11918a
      Linus Torvalds authored
      This makes getdents() and getdents64() do sanity checking on the
      pathname that it gives to user space.  And to mitigate the performance
      impact of that, it first cleans up the way it does the user copying, so
      that the code avoids doing the SMAP/PAN updates between each part of the
      dirent structure write.
      
      I really wanted to do this during the merge window, but didn't have
      time.  The conversion of filldir to unsafe_put_user() is something I've
      had around for years now in a private branch, but the extra pathname
      checking finally made me clean it up to the point where it is mergable.
      
      It's worth noting that the filename validity checking really should be a
      bit smarter: it would be much better to delay the error reporting until
      the end of the readdir, so that non-corrupted filenames are still
      returned.  But that involves bigger changes, so let's see if anybody
      actually hits the corrupt directory entry case before worrying about it
      further.
      
      * branch 'readdir':
        Make filldir[64]() verify the directory entry filename is valid
        Convert filldir[64]() from __put_user() to unsafe_put_user()
      4f11918a
    • Linus Torvalds's avatar
      Make filldir[64]() verify the directory entry filename is valid · 8a23eb80
      Linus Torvalds authored
      This has been discussed several times, and now filesystem people are
      talking about doing it individually at the filesystem layer, so head
      that off at the pass and just do it in getdents{64}().
      
      This is partially based on a patch by Jann Horn, but checks for NUL
      bytes as well, and somewhat simplified.
      
      There's also commentary about how it might be better if invalid names
      due to filesystem corruption don't cause an immediate failure, but only
      an error at the end of the readdir(), so that people can still see the
      filenames that are ok.
      
      There's also been discussion about just how much POSIX strictly speaking
      requires this since it's about filesystem corruption.  It's really more
      "protect user space from bad behavior" as pointed out by Jann.  But
      since Eric Biederman looked up the POSIX wording, here it is for context:
      
       "From readdir:
      
         The readdir() function shall return a pointer to a structure
         representing the directory entry at the current position in the
         directory stream specified by the argument dirp, and position the
         directory stream at the next entry. It shall return a null pointer
         upon reaching the end of the directory stream. The structure dirent
         defined in the <dirent.h> header describes a directory entry.
      
        From definitions:
      
         3.129 Directory Entry (or Link)
      
         An object that associates a filename with a file. Several directory
         entries can associate names with the same file.
      
        ...
      
         3.169 Filename
      
         A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
         characters composing the name may be selected from the set of all
         character values excluding the slash character and the null byte. The
         filenames dot and dot-dot have special meaning. A filename is
         sometimes referred to as a 'pathname component'."
      
      Note that I didn't bother adding the checks to any legacy interfaces
      that nobody uses.
      
      Also note that if this ends up being noticeable as a performance
      regression, we can fix that to do a much more optimized model that
      checks for both NUL and '/' at the same time one word at a time.
      
      We haven't really tended to optimize 'memchr()', and it only checks for
      one pattern at a time anyway, and we really _should_ check for NUL too
      (but see the comment about "soft errors" in the code about why it
      currently only checks for '/')
      
      See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
      lookup code looks for pathname terminating characters in parallel.
      
      Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Jann Horn <jannh@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8a23eb80
    • Linus Torvalds's avatar
      Convert filldir[64]() from __put_user() to unsafe_put_user() · 9f79b78e
      Linus Torvalds authored
      We really should avoid the "__{get,put}_user()" functions entirely,
      because they can easily be mis-used and the original intent of being
      used for simple direct user accesses no longer holds in a post-SMAP/PAN
      world.
      
      Manually optimizing away the user access range check makes no sense any
      more, when the range check is generally much cheaper than the "enable
      user accesses" code that the __{get,put}_user() functions still need.
      
      So instead of __put_user(), use the unsafe_put_user() interface with
      user_access_{begin,end}() that really does generate better code these
      days, and which is generally a nicer interface.  Under some loads, the
      multiple user writes that filldir() does are actually quite noticeable.
      
      This also makes the dirent name copy use unsafe_put_user() with a couple
      of macros.  We do not want to make function calls with SMAP/PAN
      disabled, and the code this generates is quite good when the
      architecture uses "asm goto" for unsafe_put_user() like x86 does.
      
      Note that this doesn't bother with the legacy cases.  Nobody should use
      them anyway, so performance doesn't really matter there.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9f79b78e
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 9819a30c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold.
      
       2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric
          Dumazet.
      
       3) txq null deref in mac80211, from Miaoqing Pan.
      
       4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann.
      
       5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso.
      
       6) Avoid division by zero in taprio scheduler, from Vladimir Oltean.
      
       7) Various xgmac fixes in stmmac driver from Jose Abreu.
      
       8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from
          Michal Kubecek.
      
       9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij.
      
      10) Fix sleep while atomic in sja1105, from Vladimir Oltean.
      
      11) Suspend/resume deadlock in stmmac, from Thierry Reding.
      
      12) Various UDP GSO fixes from Josh Hunt.
      
      13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric
          Dumazet.
      
      14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern.
      
      15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
        selftests/net: add nettest to .gitignore
        net: qlogic: Fix memory leak in ql_alloc_large_buffers
        nfc: fix memory leak in llcp_sock_bind()
        sch_dsmark: fix potential NULL deref in dsmark_init()
        net: phy: at803x: use operating parameters from PHY-specific status
        net: phy: extract pause mode
        net: phy: extract link partner advertisement reading
        net: phy: fix write to mii-ctrl1000 register
        ipv6: Handle missing host route in __ipv6_ifa_notify
        net: phy: allow for reset line to be tied to a sleepy GPIO controller
        net: ipv4: avoid mixed n_redirects and rate_tokens usage
        r8152: Set macpassthru in reset_resume callback
        cxgb4:Fix out-of-bounds MSI-X info array access
        Revert "ipv6: Handle race in addrconf_dad_work"
        net: make sock_prot_memory_pressure() return "const char *"
        rxrpc: Fix rxrpc_recvmsg tracepoint
        qmi_wwan: add support for Cinterion CLS8 devices
        tcp: fix slab-out-of-bounds in tcp_zerocopy_receive()
        lib: textsearch: fix escapes in example code
        udp: only do GSO if # of segs > 1
        ...
      9819a30c
    • Linus Torvalds's avatar
      Merge tag 's390-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 6fe137cb
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - defconfig updates
      
       - Fix build errors with CC_OPTIMIZE_FOR_SIZE due to usage of "i"
         constraint for function arguments. Two kvm changes acked-by Christian
         Borntraeger.
      
       - Fix -Wunused-but-set-variable warnings in mm code.
      
       - Avoid a constant misuse in qdio.
      
       - Handle a case when cpumf is temporarily unavailable.
      
      * tag 's390-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        KVM: s390: mark __insn32_query() as __always_inline
        KVM: s390: fix __insn32_query() inline assembly
        s390: update defconfigs
        s390/pci: mark function(s) __always_inline
        s390/mm: mark function(s) __always_inline
        s390/jump_label: mark function(s) __always_inline
        s390/cpu_mf: mark function(s) __always_inline
        s390/atomic,bitops: mark function(s) __always_inline
        s390/mm: fix -Wunused-but-set-variable warnings
        s390: mark __cpacf_query() as __always_inline
        s390/qdio: clarify size of the QIB parm area
        s390/cpumf: Fix indentation in sampling device driver
        s390/cpumsf: Check for CPU Measurement sampling
        s390/cpumf: Use consistant debug print format
      6fe137cb
    • Heiko Carstens's avatar
      KVM: s390: mark __insn32_query() as __always_inline · d0dea733
      Heiko Carstens authored
      __insn32_query() will not compile if the compiler decides to not
      inline it, since it contains an inline assembly with an "i" constraint
      with variable contents.
      Acked-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      d0dea733
    • Heiko Carstens's avatar
      KVM: s390: fix __insn32_query() inline assembly · b1c41ac3
      Heiko Carstens authored
      The inline assembly constraints of __insn32_query() tell the compiler
      that only the first byte of "query" is being written to. Intended was
      probably that 32 bytes are written to.
      
      Fix and simplify the code and just use a "memory" clobber.
      
      Fixes: d6681397 ("KVM: s390: provide query function for instructions returning 32 byte")
      Cc: stable@vger.kernel.org # v5.2+
      Acked-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      b1c41ac3
    • Andrey Smirnov's avatar
      dma-mapping: fix false positivse warnings in dma_common_free_remap() · 2cf2aa6a
      Andrey Smirnov authored
      Commit 5cf45379 ("dma-mapping: introduce a dma_common_find_pages
      helper") changed invalid input check in dma_common_free_remap() from:
      
          if (!area || !area->flags != VM_DMA_COHERENT)
      
      to
      
          if (!area || !area->flags != VM_DMA_COHERENT || !area->pages)
      
      which seem to produce false positives for memory obtained via
      dma_common_contiguous_remap()
      
      This triggers the following warning message when doing "reboot" on ZII
      VF610 Dev Board Rev B:
      
      WARNING: CPU: 0 PID: 1 at kernel/dma/remap.c:112 dma_common_free_remap+0x88/0x8c
      trying to free invalid coherent area: 9ef82980
      Modules linked in:
      CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.3.0-rc6-next-20190820 #119
      Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
      Backtrace:
      [<8010d1ec>] (dump_backtrace) from [<8010d588>] (show_stack+0x20/0x24)
       r7:8015ed78 r6:00000009 r5:00000000 r4:9f4d9b14
      [<8010d568>] (show_stack) from [<8077e3f0>] (dump_stack+0x24/0x28)
      [<8077e3cc>] (dump_stack) from [<801197a0>] (__warn.part.3+0xcc/0xe4)
      [<801196d4>] (__warn.part.3) from [<80119830>] (warn_slowpath_fmt+0x78/0x94)
       r6:00000070 r5:808e540c r4:81c03048
      [<801197bc>] (warn_slowpath_fmt) from [<8015ed78>] (dma_common_free_remap+0x88/0x8c)
       r3:9ef82980 r2:808e53e0
       r7:00001000 r6:a0b1e000 r5:a0b1e000 r4:00001000
      [<8015ecf0>] (dma_common_free_remap) from [<8010fa9c>] (remap_allocator_free+0x60/0x68)
       r5:81c03048 r4:9f4d9b78
      [<8010fa3c>] (remap_allocator_free) from [<801100d0>] (__arm_dma_free.constprop.3+0xf8/0x148)
       r5:81c03048 r4:9ef82900
      [<8010ffd8>] (__arm_dma_free.constprop.3) from [<80110144>] (arm_dma_free+0x24/0x2c)
       r5:9f563410 r4:80110120
      [<80110120>] (arm_dma_free) from [<8015d80c>] (dma_free_attrs+0xa0/0xdc)
      [<8015d76c>] (dma_free_attrs) from [<8020f3e4>] (dma_pool_destroy+0xc0/0x154)
       r8:9efa8860 r7:808f02f0 r6:808f02d0 r5:9ef82880 r4:9ef82780
      [<8020f324>] (dma_pool_destroy) from [<805525d0>] (ehci_mem_cleanup+0x6c/0x150)
       r7:9f563410 r6:9efa8810 r5:00000000 r4:9efd0148
      [<80552564>] (ehci_mem_cleanup) from [<80558e0c>] (ehci_stop+0xac/0xc0)
       r5:9efd0148 r4:9efd0000
      [<80558d60>] (ehci_stop) from [<8053c4bc>] (usb_remove_hcd+0xf4/0x1b0)
       r7:9f563410 r6:9efd0074 r5:81c03048 r4:9efd0000
      [<8053c3c8>] (usb_remove_hcd) from [<8056361c>] (host_stop+0x48/0xb8)
       r7:9f563410 r6:9efd0000 r5:9f5f4040 r4:9f5f5040
      [<805635d4>] (host_stop) from [<80563d0c>] (ci_hdrc_host_destroy+0x34/0x38)
       r7:9f563410 r6:9f5f5040 r5:9efa8800 r4:9f5f4040
      [<80563cd8>] (ci_hdrc_host_destroy) from [<8055ef18>] (ci_hdrc_remove+0x50/0x10c)
      [<8055eec8>] (ci_hdrc_remove) from [<804a2ed8>] (platform_drv_remove+0x34/0x4c)
       r7:9f563410 r6:81c4f99c r5:9efa8810 r4:9efa8810
      [<804a2ea4>] (platform_drv_remove) from [<804a18a8>] (device_release_driver_internal+0xec/0x19c)
       r5:00000000 r4:9efa8810
      [<804a17bc>] (device_release_driver_internal) from [<804a1978>] (device_release_driver+0x20/0x24)
       r7:9f563410 r6:81c41ed0 r5:9efa8810 r4:9f4a1dac
      [<804a1958>] (device_release_driver) from [<804a01b8>] (bus_remove_device+0xdc/0x108)
      [<804a00dc>] (bus_remove_device) from [<8049c204>] (device_del+0x150/0x36c)
       r7:9f563410 r6:81c03048 r5:9efa8854 r4:9efa8810
      [<8049c0b4>] (device_del) from [<804a3368>] (platform_device_del.part.2+0x20/0x84)
       r10:9f563414 r9:809177e0 r8:81cb07dc r7:81c78320 r6:9f563454 r5:9efa8800
       r4:9efa8800
      [<804a3348>] (platform_device_del.part.2) from [<804a3420>] (platform_device_unregister+0x28/0x34)
       r5:9f563400 r4:9efa8800
      [<804a33f8>] (platform_device_unregister) from [<8055dce0>] (ci_hdrc_remove_device+0x1c/0x30)
       r5:9f563400 r4:00000001
      [<8055dcc4>] (ci_hdrc_remove_device) from [<805652ac>] (ci_hdrc_imx_remove+0x38/0x118)
       r7:81c78320 r6:9f563454 r5:9f563410 r4:9f541010
      [<8056538c>] (ci_hdrc_imx_shutdown) from [<804a2970>] (platform_drv_shutdown+0x2c/0x30)
      [<804a2944>] (platform_drv_shutdown) from [<8049e4fc>] (device_shutdown+0x158/0x1f0)
      [<8049e3a4>] (device_shutdown) from [<8013ac80>] (kernel_restart_prepare+0x44/0x48)
       r10:00000058 r9:9f4d8000 r8:fee1dead r7:379ce700 r6:81c0b280 r5:81c03048
       r4:00000000
      [<8013ac3c>] (kernel_restart_prepare) from [<8013ad14>] (kernel_restart+0x1c/0x60)
      [<8013acf8>] (kernel_restart) from [<8013af84>] (__do_sys_reboot+0xe0/0x1d8)
       r5:81c03048 r4:00000000
      [<8013aea4>] (__do_sys_reboot) from [<8013b0ec>] (sys_reboot+0x18/0x1c)
       r8:80101204 r7:00000058 r6:00000000 r5:00000000 r4:00000000
      [<8013b0d4>] (sys_reboot) from [<80101000>] (ret_fast_syscall+0x0/0x54)
      Exception stack(0x9f4d9fa8 to 0x9f4d9ff0)
      9fa0:                   00000000 00000000 fee1dead 28121969 01234567 379ce700
      9fc0: 00000000 00000000 00000000 00000058 00000000 00000000 00000000 00016d04
      9fe0: 00028e0c 7ec87c64 000135ec 76c1f410
      
      Restore original invalid input check in dma_common_free_remap() to
      avoid this problem.
      
      Fixes: 5cf45379 ("dma-mapping: introduce a dma_common_find_pages helper")
      Signed-off-by: default avatarAndrey Smirnov <andrew.smirnov@gmail.com>
      [hch: just revert the offending hunk instead of creating a new helper]
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      2cf2aa6a
    • Dmitry Goldin's avatar
      kheaders: make headers archive reproducible · 86cdd2fd
      Dmitry Goldin authored
      In commit 43d8ce9d ("Provide in-kernel headers to make
      extending kernel easier") a new mechanism was introduced, for kernels
      >=5.2, which embeds the kernel headers in the kernel image or a module
      and exposes them in procfs for use by userland tools.
      
      The archive containing the header files has nondeterminism caused by
      header files metadata. This patch normalizes the metadata and utilizes
      KBUILD_BUILD_TIMESTAMP if provided and otherwise falls back to the
      default behaviour.
      
      In commit f7b101d3 ("kheaders: Move from proc to sysfs") it was
      modified to use sysfs and the script for generation of the archive was
      renamed to what is being patched.
      Signed-off-by: default avatarDmitry Goldin <dgoldin+lkml@protonmail.ch>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: default avatarJoel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      86cdd2fd
    • Masahiro Yamada's avatar
      kbuild: update compile-test header list for v5.4-rc2 · d188b8c9
      Masahiro Yamada authored
      Commit 6dc280eb ("coda: remove uapi/linux/coda_psdev.h") removed
      a header in question. Some more build errors were fixed. Add more
      headers into the test coverage.
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      d188b8c9
    • Masahiro Yamada's avatar
      kbuild: two minor updates for Documentation/kbuild/modules.rst · 43496709
      Masahiro Yamada authored
      Capitalize the first word in the sentence.
      
      Use obj-m instead of obj-y. obj-y still works, but we have no built-in
      objects in external module builds. So, obj-m is better IMHO.
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      43496709
    • Masahiro Yamada's avatar
      scripts/setlocalversion: clear local variable to make it work for sh · 7a82e3fa
      Masahiro Yamada authored
      Geert Uytterhoeven reports a strange side-effect of commit 858805b3
      ("kbuild: add $(BASH) to run scripts with bash-extension"), which
      inserts the contents of a localversion file in the build directory twice.
      
      [Steps to Reproduce]
        $ echo bar > localversion
        $ mkdir build
        $ cd build/
        $ echo foo > localversion
        $ make -s -f ../Makefile defconfig include/config/kernel.release
        $ cat include/config/kernel.release
        5.4.0-rc1foofoobar
      
      This comes down to the behavior change of local variables.
      
      The 'man sh' on my Ubuntu machine, where sh is an alias to dash,
      explains as follows:
        When a variable is made local, it inherits the initial value and
        exported and readonly flags from the variable with the same name
        in the surrounding scope, if there is one. Otherwise, the variable
        is initially unset.
      
      [Test Code]
      
        foo ()
        {
                local res
                echo "res: $res"
        }
      
        res=1
        foo
      
      [Result]
      
        $ sh test.sh
        res: 1
        $ bash test.sh
        res:
      
      So, scripts/setlocalversion correctly works only for bash in spite of
      its hashbang being #!/bin/sh. Nobody had noticed it before because
      CONFIG_SHELL was previously set to bash almost all the time.
      
      Now that CONFIG_SHELL is set to sh, we must write portable and correct
      code. I gave the Fixes tag to the commit that uncovered the issue.
      
      Clear the variable 'res' in collect_files() to make it work for sh
      (and it also works on distributions where sh is an alias to bash).
      
      Fixes: 858805b3 ("kbuild: add $(BASH) to run scripts with bash-extension")
      Reported-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Tested-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      7a82e3fa
    • Jacob Keller's avatar
      namespace: fix namespace.pl script to support relative paths · 82fdd12b
      Jacob Keller authored
      The namespace.pl script does not work properly if objtree is not set to
      an absolute path. The do_nm function is run from within the find
      function, which changes directories.
      
      Because of this, appending objtree, $File::Find::dir, and $source, will
      return a path which is not valid from the current directory.
      
      This used to work when objtree was set to an absolute path when using
      "make namespacecheck". It appears to have not worked when calling
      ./scripts/namespace.pl directly.
      
      This behavior was changed in 7e1c0477 ("kbuild: Use relative path
      for $(objtree)", 2014-05-14)
      
      Rather than fixing the Makefile to set objtree to an absolute path, just
      fix namespace.pl to work when srctree and objtree are relative. Also fix
      the script to use an absolute path for these by default.
      
      Use the File::Spec module for this purpose. It's been part of perl
      5 since 5.005.
      
      The curdir() function is used to get the current directory when the
      objtree and srctree aren't set in the environment.
      
      rel2abs() is used to convert possibly relative objtree and srctree
      environment variables to absolute paths.
      
      Finally, the catfile() function is used instead of string appending
      paths together, since this is more robust when joining paths together.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Acked-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Tested-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      82fdd12b