1. 07 Apr, 2018 9 commits
    • Merge tag 'leaks-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tobin/leaks · 299f89d5
      Linus Torvalds authored
      Pull leaking-addresses updates from Tobin Harding:
       "This set represents improvements to the scripts/leaking_addresses.pl
        script.
      
        The major improvement is that with this set applied the script
        actually runs in a reasonable amount of time (less than a minute on a
        standard stock Ubuntu user desktop). Also, we have a second maintainer
        now and a tree hosted on kernel.org.
      
        We do a few code clean ups. We fix the command help output. Handling
        of the vsyscall address range is fixed to check the whole range
        instead of just the start/end addresses. We add support for 5 page
        table levels (suggested on LKML). We use a system command to get the
        machine architecture instead of using Perl. Calling this command for
        every regex comparison is what previously choked the script; caching
        the result of this call gave the major speed improvement. We add
        support for scanning 32-bit kernels using the user/kernel memory
        split. The path skipping code is refactored and simplified (meaning
        easier script configuration). We remove version numbering. We add a
        variable name to improve readability of a regex and finally we check
        filenames for leaking addresses.
      
        Currently the script scans /proc/PID for all PID. With this set applied we
        only scan for PID==1. It was observed that on an idle system files
        under /proc/PID are predominantly the same for all processes. Also it
        was noted that the script does not scan _all_ the kernel since it only
        scans active processes. Scanning only for PID==1 makes explicit the
        inherent flaw in the script that the scan is only partial and also
        speeds things up"
      
      * tag 'leaks-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tobin/leaks:
        MAINTAINERS: Update LEAKING_ADDRESSES
        leaking_addresses: check if file name contains address
        leaking_addresses: explicitly name variable used in regex
        leaking_addresses: remove version number
        leaking_addresses: skip '/proc/1/syscall'
        leaking_addresses: skip all /proc/PID except /proc/1
        leaking_addresses: cache architecture name
        leaking_addresses: simplify path skipping
        leaking_addresses: do not parse binary files
        leaking_addresses: add 32-bit support
        leaking_addresses: add is_arch() wrapper subroutine
        leaking_addresses: use system command to get arch
        leaking_addresses: add support for 5 page table levels
        leaking_addresses: add support for kernel config file
        leaking_addresses: add range check for vsyscall memory
        leaking_addresses: indent dependant options
        leaking_addresses: remove command examples
        leaking_addresses: remove mention of kptr_restrict
        leaking_addresses: fix typo function not called
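
      The whole-range vsyscall check mentioned in the pull message can be
      sketched as follows. This is a minimal illustration in Python rather
      than the script's Perl; the function name and the range bounds (a 4 KiB
      x86_64 vsyscall page assumed to start at 0xffffffffff600000) are
      assumptions and are not taken from the patch.

```python
# Assumed bounds of the x86_64 vsyscall page; illustrative values only.
VSYSCALL_START = 0xFFFFFFFFFF600000
VSYSCALL_END = 0xFFFFFFFFFF601000


def in_vsyscall_range(token: str) -> bool:
    """Return True if a hex token lies anywhere inside the assumed range.

    Checking the whole range (rather than only the start and end values)
    also catches addresses that point into the middle of the page.
    """
    try:
        value = int(token, 16)  # accepts an optional "0x" prefix
    except ValueError:
        return False
    return VSYSCALL_START <= value <= VSYSCALL_END
```

      A match inside this range would be treated as a false positive rather
      than reported as a leak.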
    • Merge tag 'linux-kselftest-4.17-rc1' of... · fc22e19a
      Linus Torvalds authored
      Merge tag 'linux-kselftest-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest update from Shuah Khan:
       "This Kselftest update for 4.17-rc1 consists of:
      
         - Test build error fixes
      
         - Fixes to prevent intel_pstate from building on non-x86 systems.
      
         - New test for ion with vgem driver.
      
         - Change to print the test name to /dev/kmsg to add context to kernel
           failures if any uncovered from running the test.
      
         - Kselftest framework enhancements to add KSFT_TAP_LEVEL environment
           variable to prevent nested TAP headers being printed in the
           Kselftest output.
      
           Nested TAP13 headers could cause problems for some parsers. This
           change suppresses the nested headers from test programs and test
           shell scripts with changes to framework and Makefiles without
           changing the tests"
      
      * tag 'linux-kselftest-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/intel_pstate: Fix build rule for x86
        selftests: Print the test we're running to /dev/kmsg
        selftests/seccomp: Allow get_metadata to XFAIL
        selftests/android/ion: Makefile: fix build error
        selftests: futex Makefile add top level TAP header echo to RUN_TESTS
        selftests: Makefile set KSFT_TAP_LEVEL to prevent nested TAP headers
        selftests: lib.mk set KSFT_TAP_LEVEL to prevent nested TAP headers
        selftests: kselftest framework: add handling for TAP header level
        selftests: ion: Add simple test with the vgem driver
        selftests: ion: Remove some prints
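
      The nested-header suppression can be pictured with a small sketch. It is
      written in Python for illustration only; the helper names are
      hypothetical, and the value "1" for KSFT_TAP_LEVEL is an assumption. The
      only behaviour taken from the pull message is that the environment
      variable prevents nested TAP headers.

```python
import os


def tap_header(environ=None) -> str:
    """Return the TAP header, or nothing if an outer runner already set
    KSFT_TAP_LEVEL (meaning the header was printed at the top level)."""
    env = os.environ if environ is None else environ
    if env.get("KSFT_TAP_LEVEL"):
        return ""
    return "TAP version 13\n"


def child_environment(environ=None) -> dict:
    """Environment a top-level runner would hand to nested test programs."""
    env = dict(os.environ if environ is None else environ)
    env["KSFT_TAP_LEVEL"] = "1"  # assumed value; any non-empty string works here
    return env
```

      A top-level runner would print tap_header() once and start each test
      with child_environment(), so the tests themselves need no changes.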
    • Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 3612605a
      Linus Torvalds authored
      Pull general security layer updates from James Morris:
      
       - Convert security hooks from list to hlist, a nice cleanup, saving
         about 50% of space, from Sargun Dhillon.
      
       - Only pass the cred, not the secid, to kill_pid_info_as_cred and
         security_task_kill (as the secid can be determined from the cred),
         from Stephen Smalley.
      
       - Close a potential race in kernel_read_file(), by making the file
         unwritable before calling the LSM check (vs after), from Kees Cook.
      
      * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        security: convert security hooks to use hlist
        exec: Set file unwritable before LSM check
        usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill
    • Merge tag 'fscache-next-20180406' of... · 62f8e6c5
      Linus Torvalds authored
      Merge tag 'fscache-next-20180406' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
      
      Pull fscache updates from David Howells:
       "Three patches that fix some of AFS's usage of fscache:
      
         (1) Need to invalidate the cache if a foreign data change is detected
             on the server.
      
         (2) Move the vnode ID uniquifier (equivalent to i_generation) from
             the auxiliary data to the index key to prevent a race between
             file delete and a subsequent file create seeing the same index
             key.
      
         (3) Need to retire cookies that correspond to files that we think got
             deleted on the server.
      
        Four patches to fix some things in fscache and cachefiles:
      
         (4) Fix a couple of checker warnings.
      
         (5) Correctly indicate to the end-of-operation callback whether an
             operation completed or was cancelled.
      
         (6) Add a check for multiple cookie relinquishment.
      
         (7) Fix a path through the asynchronous write that doesn't wake up a
             waiter for a page if the cache decides not to write that page,
             but discards it instead.
      
        A couple of patches to add tracepoints to fscache and cachefiles:
      
         (8) Add tracepoints for cookie operators, object state machine
             execution, cachefiles object management and cachefiles VFS
             operations.
      
         (9) Add tracepoints for fscache operation management and page
             wrangling.
      
        And then three development patches:
      
        (10) Attach the index key and auxiliary data to the cookie, pass this
             information through various fscache-netfs API functions and get
             rid of the callbacks to the netfs to get it.
      
             This means that the cache can get at this information, even if
             the netfs goes away. It also means that the cache can be lazy in
             updating the coherency data.
      
        (11) Pass the object data size through various fscache-netfs API functions
             rather than calling back to the netfs for it, and store the value
             in the object.
      
             This makes it easier to correctly resize the object, as the size
             is updated on writes to the cache, rather than calling back out
             to the netfs.
      
        (12) Maintain a catalogue of allocated cookies. This makes it possible
             to catch cookie collision up front rather than down in the bowels
             of the cache being run from a service thread from the object
             state machine.
      
             This will also make it possible in the future to reconnect to a
             cookie that's not gone dead yet because it's waiting for
             finalisation of the storage and also make it possible to bring
             cookies online if the cache is added after the cookie has been
             obtained"
      
      * tag 'fscache-next-20180406' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        fscache: Maintain a catalogue of allocated cookies
        fscache: Pass object size in rather than calling back for it
        fscache: Attach the index key and aux data to the cookie
        fscache: Add more tracepoints
        fscache: Add tracepoints
        fscache: Fix hanging wait on page discarded by writeback
        fscache: Detect multiple relinquishment of a cookie
        fscache: Pass the correct cancelled indications to fscache_op_complete()
        fscache, cachefiles: Fix checker warnings
        afs: Be more aggressive in retiring cached vnodes
        afs: Use the vnode ID uniquifier in the cache key not the aux data
        afs: Invalidate cache on server data change
    • Merge tag 'vfio-v4.17-rc1' of git://github.com/awilliam/linux-vfio · f605ba97
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Adopt iommu_unmap_fast() interface to type1 backend
         (Suravee Suthikulpanit)
      
       - mdev sample driver fixup (Shunyong Yang)
      
       - More efficient PFN mapping handling in type1 backend
         (Jason Cai)
      
       - VFIO device ioeventfd interface (Alex Williamson)
      
       - Tag new vfio-platform sub-maintainer (Alex Williamson)
      
      * tag 'vfio-v4.17-rc1' of git://github.com/awilliam/linux-vfio:
        MAINTAINERS: vfio/platform: Update sub-maintainer
        vfio/pci: Add ioeventfd support
        vfio/pci: Use endian neutral helpers
        vfio/pci: Pull BAR mapping setup from read-write path
        vfio/type1: Improve memory pinning process for raw PFN mapping
        vfio-mdev/samples: change RDI interrupt condition
        vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 016c6f25
      Linus Torvalds authored
      Pull fw_cfg, vhost updates from Michael Tsirkin:
       "This cleans up the qemu fw cfg device driver.
      
        On top of this, vmcore is dumped there on crash to help debugging
        with kASLR enabled.
      
        Also included are some fixes in vhost"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vhost: add vsock compat ioctl
        vhost: fix vhost ioctl signature to build with clang
        fw_cfg: write vmcoreinfo details
        crash: export paddr_vmcoreinfo_note()
        fw_cfg: add DMA register
        fw_cfg: add a public uapi header
        fw_cfg: handle fw_cfg_read_blob() error
        fw_cfg: remove inline from fw_cfg_read_blob()
        fw_cfg: fix sparse warnings around FW_CFG_FILE_DIR read
        fw_cfg: fix sparse warning reading FW_CFG_ID
        fw_cfg: fix sparse warnings with fw_cfg_file
        fw_cfg: fix sparse warnings in fw_cfg_sel_endianness()
        ptr_ring: fix build
    • Merge tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 3c0d551e
      Linus Torvalds authored
      Pull PCI updates from Bjorn Helgaas:
      
       - move pci_uevent_ers() out of pci.h (Michael Ellerman)
      
       - skip ASPM common clock warning if BIOS already configured it (Sinan
         Kaya)
      
       - fix ASPM Coverity warning about threshold_ns (Gustavo A. R. Silva)
      
       - remove last user of pci_get_bus_and_slot() and the function itself
         (Sinan Kaya)
      
       - add decoding for 16 GT/s link speed (Jay Fang)
      
       - add interfaces to get max link speed and width (Tal Gilboa)
      
       - add pcie_bandwidth_capable() to compute max supported link bandwidth
         (Tal Gilboa)
      
       - add pcie_bandwidth_available() to compute bandwidth available to
         device (Tal Gilboa)
      
       - add pcie_print_link_status() to log link speed and whether it's
         limited (Tal Gilboa)
      
       - use PCI core interfaces to report when device performance may be
         limited by its slot instead of doing it in each driver (Tal Gilboa)
      
       - fix possible cpqphp NULL pointer dereference (Shawn Lin)
      
       - rescan more of the hierarchy on ACPI hotplug to fix Thunderbolt/xHCI
         hotplug (Mika Westerberg)
      
       - add support for PCI I/O port space that's neither directly accessible
         via CPU in/out instructions nor directly mapped into CPU physical
         memory space. This is fairly intrusive and includes minor changes to
         interfaces used for I/O space on most platforms (Zhichang Yuan, John
         Garry)
      
       - add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan,
         John Garry)
      
       - use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)
      
       - remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
         (Shawn Lin)
      
       - report quirk timings with dev_info (Bjorn Helgaas)
      
       - report quirks that take longer than 10ms (Bjorn Helgaas)
      
       - add and use Altera Vendor ID (Johannes Thumshirn)
      
       - tidy Makefiles and comments (Bjorn Helgaas)
      
       - don't set up INTx if MSI or MSI-X is enabled to align cris, frv,
         ia64, and mn10300 with x86 (Bjorn Helgaas)
      
       - move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
         Lawler)
      
       - merge pcieport_if.h into portdrv.h (Bjorn Helgaas)
      
       - move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
         Helgaas)
      
       - completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)
      
       - remove portdrv link order dependency (Bjorn Helgaas)
      
       - remove support for unused VC portdrv service (Bjorn Helgaas)
      
       - simplify portdrv feature permission checking (Bjorn Helgaas)
      
       - remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
         Helgaas)
      
       - remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)
      
       - use cached AER capability offset (Frederick Lawler)
      
       - don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)
      
       - rename pcie-dpc.c to dpc.c (Bjorn Helgaas)
      
       - use generic pci_mmap_resource_range() instead of powerpc and xtensa
         arch-specific versions (David Woodhouse)
      
       - support arbitrary PCI host bridge offsets on sparc (Yinghai Lu)
      
       - remove System and Video ROM reservations on sparc (Bjorn Helgaas)
      
       - probe for device reset support during enumeration instead of runtime
         (Bjorn Helgaas)
      
       - add ACS quirk for Ampere (née APM) root ports (Feng Kan)
      
       - add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas
         Vincent-Cross)
      
       - protect device restore with device lock (Sinan Kaya)
      
       - handle failure of FLR gracefully (Sinan Kaya)
      
       - handle CRS (config retry status) after device resets (Sinan Kaya)
      
       - skip various config reads for SR-IOV VFs as an optimization
         (KarimAllah Ahmed)
      
       - consolidate VPD code in vpd.c (Bjorn Helgaas)
      
       - add Tegra dependency on PCI_MSI_IRQ_DOMAIN (Arnd Bergmann)
      
       - add DT support for R-Car r8a7743 (Biju Das)
      
       - fix a PCI_EJECT vs PCI_BUS_RELATIONS race condition in Hyper-V host
         bridge driver that causes a general protection fault (Dexuan Cui)
      
       - fix Hyper-V host bridge hang in MSI setup on 1-vCPU VMs with SR-IOV
         (Dexuan Cui)
      
       - fix Hyper-V host bridge hang when ejecting a VF before setting up MSI
         (Dexuan Cui)
      
       - make several structures static (Fengguang Wu)
      
       - increase number of MSI IRQs supported by Synopsys DesignWare bridges
         from 32 to 256 (Gustavo Pimentel)
      
       - implement multiplexed IRQ domain API and remove obsolete MSI IRQ
         API from DesignWare drivers (Gustavo Pimentel)
      
       - add Tegra power management support (Manikanta Maddireddy)
      
       - add Tegra loadable module support (Manikanta Maddireddy)
      
       - handle 64-bit BARs correctly in endpoint support (Niklas Cassel)
      
       - support optional regulator for HiSilicon STB (Shawn Guo)
      
       - use regulator bulk API for Qualcomm apq8064 (Srinivas Kandagatla)
      
       - support power supplies for Qualcomm msm8996 (Srinivas Kandagatla)
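
      The link bandwidth items in the list above come down to simple
      arithmetic, sketched here in Python. This is not the kernel
      implementation: the function names are hypothetical, and the sketch
      assumes that usable bandwidth is the raw per-lane rate scaled by the
      line encoding (8b/10b at 2.5 and 5 GT/s, 128b/130b at 8 and 16 GT/s)
      times the lane count, and that the bandwidth available to a device is
      the minimum over the links on its path.

```python
# Raw per-lane rate in GT/s -> (payload bits, total bits) of the line code.
ENCODING = {2.5: (8, 10), 5.0: (8, 10), 8.0: (128, 130), 16.0: (128, 130)}


def link_bandwidth_mbps(speed_gts: float, width: int) -> float:
    """Usable bandwidth of one link in Mb/s for a given speed and width."""
    payload, total = ENCODING[speed_gts]
    return speed_gts * 1000 * payload / total * width


def available_bandwidth_mbps(path) -> float:
    """Bandwidth available through a path of (speed_gts, width) links,
    assumed here to be limited by the slowest link."""
    return min(link_bandwidth_mbps(speed, width) for speed, width in path)
```

      For example, a device capable of 8 GT/s x16 that sits behind a 5 GT/s x4
      link would be limited by the slower link under these assumptions.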
      
      * tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (123 commits)
        MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver
        HISI LPC: Add ACPI support
        ACPI / scan: Do not enumerate Indirect IO host children
        ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use
        HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings
        of: Add missing I/O range exception for indirect-IO devices
        PCI: Apply the new generic I/O management on PCI IO hosts
        PCI: Add fwnode handler as input param of pci_register_io_range()
        PCI: Remove __weak tag from pci_register_io_range()
        MAINTAINERS: Add missing /drivers/pci/cadence directory entry
        fm10k: Report PCIe link properties with pcie_print_link_status()
        net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth
        net/mlx5: Report PCIe link properties with pcie_print_link_status()
        net/mlx4_core: Report PCIe link properties with pcie_print_link_status()
        PCI: Add pcie_print_link_status() to log link speed and whether it's limited
        PCI: Add pcie_bandwidth_available() to compute bandwidth available to device
        misc: pci_endpoint_test: Handle 64-bit BARs properly
        PCI: designware-ep: Make dw_pcie_ep_reset_bar() handle 64-bit BARs properly
        PCI: endpoint: Make sure that BAR_5 does not have 64-bit flag set when clearing
        PCI: endpoint: Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct *epf_bar
        ...
    • Merge tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 19fd08b8
      Linus Torvalds authored
      Pull rdma updates from Jason Gunthorpe:
       "Doug and I are at a conference next week so if another PR is sent I
        expect it to only be bug fixes. Parav noted yesterday that there are
        some fringe case behavior changes in his work that he would like to
        fix, and I see that Intel has a number of rc looking patches for HFI1
        they posted yesterday.
      
        Parav is again the biggest contributor by patch count with his ongoing
        work to enable container support in the RDMA stack, followed by Leon
        doing syzkaller inspired cleanups, though most of the actual fixing
        went to RC.
      
        There is one uncomfortable series here fixing the user ABI to actually
        work as intended in 32 bit mode. There are lots of notes in the commit
        messages, but the basic summary is we don't think there is an actual
        32 bit kernel user of drivers/infiniband for several good reasons.
      
        However we are seeing people want to use a 32 bit user space with 64
        bit kernel, which didn't completely work today. So in fixing it we
        required a 32 bit rxe user to upgrade their userspace. rxe users are
        already quite rare and we think a 32 bit one is non-existent.
      
         - Fix RDMA uapi headers to actually compile in userspace and be more
           complete
      
         - Three shared with netdev pull requests from Mellanox:
      
            * 7 patches, mostly to net with 1 IB related one at the back.
              This series addresses an IRQ performance issue (patch 1),
              cleanups related to the fix for the IRQ performance problem
              (patches 2-6), and then extends the fragmented completion queue
              support that already exists in the net side of the driver to the
              ib side of the driver (patch 7).
      
            * Mostly IB, with 5 patches to net that are needed to support the
              remaining 10 patches to the IB subsystem. This series extends
              the current 'representor' framework when the mlx5 driver is in
              switchdev mode from being a netdev only construct to being a
              netdev/IB dev construct. The IB dev is limited to raw Eth queue
              pairs only, but by having an IB dev of this type attached to the
              representor for a switchdev port, it enables DPDK to work on the
              switchdev device.
      
            * All net related, but needed as infrastructure for the rdma
              driver
      
         - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4 drivers
      
         - SRP performance updates
      
         - IB uverbs write path cleanup patch series from Leon
      
         - Add RDMA_CM support to ib_srpt. This is disabled by default. Users
           need to set the port for ib_srpt to listen on in configfs in order
           for it to be enabled
           (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port)
      
         - TSO and Scatter FCS support in mlx4
      
         - Refactor of modify_qp routine to resolve problems seen while
           working on new code that is forthcoming
      
         - More refactoring and updates of RDMA CM for containers support from
           Parav
      
         - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device
           memory' user API features
      
         - Infrastructure updates for the new IOCTL interface, based on
           increased usage
      
         - ABI compatibility bug fixes to fully support 32 bit userspace on 64
           bit kernel as was originally intended. See the commit messages for
           extensive details
      
         - Syzkaller bugs and code cleanups motivated by them"
      
      * tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits)
        IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
        IB/mlx5: Device memory mr registration support
        net/mlx5: Mkey creation command adjustments
        IB/mlx5: Device memory support in mlx5_ib
        net/mlx5: Query device memory capabilities
        IB/uverbs: Add device memory registration ioctl support
        IB/uverbs: Add alloc/free dm uverbs ioctl support
        IB/uverbs: Add device memory capabilities reporting
        IB/uverbs: Expose device memory capabilities to user
        RDMA/qedr: Fix wmb usage in qedr
        IB/rxe: Removed GID add/del dummy routines
        RDMA/qedr: Zero stack memory before copying to user space
        IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR
        IB/mlx5: Add information for querying IPsec capabilities
        IB/mlx5: Add IPsec support for egress and ingress
        {net,IB}/mlx5: Add ipsec helper
        IB/mlx5: Add modify_flow_action_esp verb
        IB/mlx5: Add implementation for create and destroy action_xfrm
        IB/uverbs: Introduce ESP steering match filter
        IB/uverbs: Add modify ESP flow_action
        ...
    • Merge tag 'mailbox-v4.17' of git://git.linaro.org/landing-teams/working/fujitsu/integration · 28da7be5
      Linus Torvalds authored
      Pull mailbox updates from Jassi Brar:
      
       - New Hi3660 mailbox driver
      
       - Fix TEGRA Kconfig warning
      
       - Broadcom: use dma_pool_zalloc instead of dma_pool_alloc+memset
      
      * tag 'mailbox-v4.17' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
        mailbox: Add support for Hi3660 mailbox
        dt-bindings: mailbox: Introduce Hi3660 controller binding
        mailbox: tegra: relax TEGRA_HSP_MBOX Kconfig dependencies
        maillbox: bcm-flexrm-mailbox: Use dma_pool_zalloc()
  2. 06 Apr, 2018 31 commits
    • MAINTAINERS: Update LEAKING_ADDRESSES · e875d33d
      Tobin C. Harding authored
      MAINTAINERS is out of date for leaking_addresses.pl. There is now a tree on
      kernel.org for development of this script.  We have a second maintainer now,
      thanks Tycho.  Development of this script was started on the kernel-hardening
      mailing list so let's keep it there.
      
      Update maintainer details; Add mailing list, kernel.org hosted tree, and second
      maintainer.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
    • leaking_addresses: check if file name contains address · c73dff59
      Tobin C. Harding authored
      Sometimes files may be created by using output from printk.  As the scan
      traverses the directory tree we should parse each path name and check if
      it is leaking an address.
      
      Add check for leaking address on each path name.
      Suggested-by: Tycho Andersen <tycho@tycho.ws>
      Acked-by: Tycho Andersen <tycho@tycho.ws>
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
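
      A path-name check of this kind might look like the sketch below. It is
      Python rather than the script's Perl, the function name is hypothetical,
      and the regular expression is an assumed pattern for 64-bit kernel
      addresses (16 hex digits starting with ffff), not the one used by
      leaking_addresses.pl.

```python
import os
import re

# Assumed pattern for a 64-bit kernel address; illustrative only.
KERNEL_ADDR_RE = re.compile(r"\b(0x)?ffff[0-9a-f]{12}\b", re.IGNORECASE)


def leaking_paths(root: str) -> list:
    """Walk a directory tree and return every path whose name itself
    contains something that looks like a kernel address."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if KERNEL_ADDR_RE.search(path):
                hits.append(path)
    return hits
```

      The file contents are not opened here; only the names are examined,
      which covers files created from printk output.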
    • leaking_addresses: explicitly name variable used in regex · 2306a677
      Tobin C. Harding authored
      Currently the subroutine may_leak_address() checks its regex against the
      Perl special variable $_, which is _fortunately_ being set correctly in a
      loop before this subroutine is called.  We have already declared a
      variable, '$line', to hold this value, so we should use it.
      
      Use $line in regex match instead of implicit $_
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
    • leaking_addresses: remove version number · 34827374
      Tobin C. Harding authored
      We have git now, we don't need a version number.  This was originally
      added because leaking_addresses.pl shamelessly (and mindlessly) copied
      checkpatch.pl
      
      Remove version number from script.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
    • leaking_addresses: skip '/proc/1/syscall' · 2ad74293
      Tobin C. Harding authored
      The pointers listed in /proc/1/syscall are user pointers, and negative
      syscall args will show up like kernel addresses.
      
      For example
      
      /proc/31808/syscall: 0 0x3 0x55b107a38180 0x2000 0xffffffffffffffb0 \
      0x55b107a302d0 0x55b107a38180 0x7fffa313b8e8 0x7ff098560d11
      
      Skip parsing /proc/1/syscall
      Suggested-by: Tycho Andersen <tycho@tycho.ws>
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
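
      The false positive in the example line can be checked with a few lines
      of arithmetic. This Python sketch (the helper name is hypothetical)
      reinterprets a 64-bit value as a signed integer, which shows that the
      fifth field in the example is a small negative syscall argument and not
      a kernel pointer.

```python
def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's complement integer."""
    if value & (1 << 63):
        return value - (1 << 64)
    return value


# The argument from the example above that looks like a kernel address.
suspicious = 0xFFFFFFFFFFFFFFB0
decoded = as_signed64(suspicious)
```

      Here decoded is -80: the value only has its high bits set because of
      sign extension, yet a pattern match on the hex text would flag it.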
    • leaking_addresses: skip all /proc/PID except /proc/1 · 472c9e10
      Tobin C. Harding authored
      When the system is idle it is likely that most files under /proc/PID
      will be identical for various processes.  Scanning _all_ the PIDs under
      /proc is unnecessary and implies that we are thoroughly scanning /proc.
      This is _not_ the case because there may be ways userspace can trigger
      creation of /proc files that leak addresses but were not present during
      a scan.  For these two reasons we should exclude all PID directories
      under /proc except '1/'
      
      Exclude all /proc/PID except /proc/1.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
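
      The exclusion rule is easy to state as a predicate. The following is a
      minimal Python sketch with a hypothetical function name; it is not the
      Perl code from the patch.

```python
import re

# Matches /proc/<digits> either as the whole path or as a leading component.
PROC_PID_RE = re.compile(r"^/proc/(\d+)(/|$)")


def skip_proc_pid(path: str) -> bool:
    """Return True for any /proc/PID path whose PID is not 1."""
    match = PROC_PID_RE.match(path)
    return bool(match) and match.group(1) != "1"
```

      Non-PID entries such as /proc/kallsyms do not match the pattern and are
      therefore still scanned.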
    • leaking_addresses: cache architecture name · 5e4bac34
      Tobin C. Harding authored
      Currently we are repeatedly calling `uname -m`.  This is causing the
      script to take a long time to run (more than 10 seconds to parse
      /proc/kallsyms).  We can use Perl state variables to cache the result of
      the first call to `uname -m`.  With this change in place the script
      scans the whole kernel in under a minute.
      
      Cache machine architecture in state variable.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
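
      The caching idea carries over to other languages. As a rough analogue of
      a Perl state variable, this Python sketch memoises the result of
      `uname -m` so the external command runs once per process; the function
      names are hypothetical and the sketch assumes `uname` is on the PATH.

```python
import functools
import re
import subprocess


@functools.lru_cache(maxsize=None)
def machine_arch() -> str:
    """Run `uname -m` on the first call and reuse the result afterwards."""
    result = subprocess.run(
        ["uname", "-m"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def is_arch(pattern: str) -> bool:
    """Check the cached architecture name against a regular expression."""
    return re.search(pattern, machine_arch()) is not None
```

      Every call to is_arch() after the first is a cache lookup plus a regex
      match, so calling it once per scanned line no longer spawns a process.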
    • leaking_addresses: simplify path skipping · b401f56f
      Tobin C. Harding authored
      Currently the script has multiple configuration arrays.  This is confusing,
      as is evident from the fact that a bunch of the entries are in the wrong
      place.  We can simplify the code by just having a single array for absolute
      paths to skip and a single array for file names to skip wherever they
      appear in the scanned directory tree.  There are also currently multiple
      subroutines to handle the different arrays; we can reduce these to a
      single subroutine also.
      
      Simplify the path skipping code.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
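
      The two-collection design can be sketched as below. This is Python for
      illustration; the names and the sample entries are hypothetical
      placeholders, not the contents of the script's actual configuration.

```python
import os

# Absolute paths to skip (placeholder entries).
SKIP_ABSOLUTE = {"/proc/kmsg", "/proc/1/syscall"}

# File names to skip wherever they appear in the tree (placeholder entries).
SKIP_ANYWHERE = {"pagemap", "events"}


def skip(path: str) -> bool:
    """Single predicate covering both kinds of skip configuration."""
    return path in SKIP_ABSOLUTE or os.path.basename(path) in SKIP_ANYWHERE
```

      With one predicate there is a single place to look when deciding where a
      new entry belongs.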
    • leaking_addresses: do not parse binary files · e2858cad
      Tobin C. Harding authored
      Currently the script parses binary files.  Since we are scanning for
      readable kernel addresses there is no need to parse binary files.  We
      can use Perl to check if a file is binary and skip parsing it if so.
      
      Do not parse binary files.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
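
      For readers unfamiliar with this kind of check, here is a simplified
      stand-in written in Python. It is a swapped-in heuristic (look for a NUL
      byte in the first block of the file) and is not how Perl decides whether
      a file is binary; the function name and sample size are assumptions.

```python
def looks_binary(path: str, sample_size: int = 1024) -> bool:
    """Heuristic: treat a file as binary if its first block contains NUL."""
    with open(path, "rb") as handle:
        chunk = handle.read(sample_size)
    return b"\0" in chunk
```

      A scanner would call this before reading a file line by line and move on
      to the next file when it returns True.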
    • leaking_addresses: add 32-bit support · 1410fe4e
      Tobin C. Harding authored
      Currently the script only supports x86_64 and ppc64.  It would be nice to
      be able to scan 32-bit machines also.  We can add support for 32-bit
      architectures by modifying how we check for false positives, taking
      advantage of the page offset used by the kernel, and using the correct
      regular expression.
      
      Support for 32-bit machines is enabled by the observation that the kernel
      addresses on 32-bit machines are larger [in value] than the page offset.
      We can use this to filter false positives when scanning the kernel for
      leaking addresses.
      
      Programmatic determination of the running architecture is not
      immediately obvious (current 32-bit machines return various strings from
      `uname -m`).  We therefore provide a flag to enable scanning of 32-bit
      kernels.  Also we can check the kernel config file for the offset and if
      not found default to 0xc0000000.  A command line option to pass in the
      page offset is also provided.  We do automatically detect architecture
      if running on ix86.
      
      Add support for 32-bit kernels.  Add a command line option for page
      offset.
      Suggested-by: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      1410fe4e
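      The page-offset filter described in the commit above can be sketched
      as follows.  This is a minimal Python illustration, not the script's
      Perl code.  The 8-hex-digit regular expression and the function names
      are assumptions for this sketch; the 0xc0000000 fallback and the use
      of the kernel config's page offset come from the commit message.

```python
import re

DEFAULT_PAGE_OFFSET = 0xC0000000  # fallback named in the commit message

# Assumed pattern for a 32-bit address: optional 0x prefix, 8 hex digits.
ADDR_32_RE = re.compile(r"\b(?:0x)?([0-9a-fA-F]{8})\b")


def page_offset_from_config(config_text: str, default: int = DEFAULT_PAGE_OFFSET) -> int:
    """Read CONFIG_PAGE_OFFSET from kernel config text, else use the default."""
    found = re.search(r"^CONFIG_PAGE_OFFSET=(0x[0-9a-fA-F]+)\s*$", config_text, re.MULTILINE)
    return int(found.group(1), 16) if found else default


def kernel_addresses_32(line: str, page_offset: int = DEFAULT_PAGE_OFFSET) -> list:
    """Return address-like tokens whose value is at or above the page offset."""
    return [
        m.group(0)
        for m in ADDR_32_RE.finditer(line)
        if int(m.group(1), 16) >= page_offset  # lower values are treated as false positives
    ]
```

      Any 8-digit hex token at or above the offset is still reported, so
      the filter reduces false positives rather than eliminating them.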
    • Tobin C. Harding's avatar
      leaking_addresses: add is_arch() wrapper subroutine · 5eb0da05
      Tobin C. Harding authored
      Currently there is duplicate code when checking the architecture type.
      We can remove the duplication by implementing a wrapper function
      is_arch().
      
      Implement and use wrapper function is_arch().
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      5eb0da05
    • Tobin C. Harding's avatar
      leaking_addresses: use system command to get arch · 6efb7458
      Tobin C. Harding authored
      Currently script uses Perl to get the machine architecture. This can be
      erroneous since Perl uses the architecture of the machine that Perl was
      compiled on, not the architecture of the running machine. We should use
      the system's `uname` command instead.
      
      Use `uname -m` instead of Perl to get the machine architecture.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      6efb7458
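      Asking the system for the architecture, and caching the answer so the
      command runs only once, can be sketched as follows.  This is a Python
      illustration rather than the script's Perl.  `machine_arch` and this
      form of `is_arch` are hypothetical names for the sketch, and the
      caching reflects the speed-up mentioned in the pull request summary.

```python
import functools
import re
import subprocess


@functools.lru_cache(maxsize=None)
def machine_arch() -> str:
    """Return the output of `uname -m`, running the command once per process."""
    result = subprocess.run(
        ["uname", "-m"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def is_arch(pattern: str) -> bool:
    """Check the cached architecture string against a regular expression."""
    return re.search(pattern, machine_arch()) is not None
```

      With the cache in place, calling `is_arch()` for every candidate
      match costs a dictionary lookup instead of a new process.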
    • Tobin C. Harding's avatar
      leaking_addresses: add support for 5 page table levels · 2f042c93
      Tobin C. Harding authored
      Currently script only supports 4 page table levels because of the way
      the kernel address regular expression is crafted. We can do better than
      this. Using previously added support for kernel configuration options we
      can get the number of page table levels defined by
      CONFIG_PGTABLE_LEVELS. Using this value a correct regular expression can
      be crafted. This only supports 5 page table levels on x86_64.
      
      Add support for 5 page table levels on x86_64.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      2f042c93
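      Choosing the address pattern from the number of page table levels can
      be sketched as follows.  The two patterns are assumptions made for
      illustration, not copied from the script.  They rest on the fact that
      x86_64 kernel addresses begin with ffff under 4-level paging (48-bit
      virtual addresses) and with ff under 5-level paging (57-bit).

```python
import re


def x86_64_kernel_address_re(pgtable_levels: int = 4):
    """Build a regex for x86_64 kernel addresses (illustrative patterns)."""
    if pgtable_levels == 5:
        # 57-bit virtual addresses: kernel space begins at 0xff00000000000000.
        pattern = r"\b(?:0x)?ff[0-9a-fA-F]{14}\b"
    else:
        # 48-bit virtual addresses: kernel space begins at 0xffff800000000000.
        pattern = r"\b(?:0x)?ffff[0-9a-fA-F]{12}\b"
    return re.compile(pattern)
```

      The level count would come from CONFIG_PGTABLE_LEVELS in the kernel
      config, for example
      `x86_64_kernel_address_re(int(options["CONFIG_PGTABLE_LEVELS"]))`,
      where `options` is a hypothetical dictionary of parsed config values.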
    • Tobin C. Harding's avatar
      leaking_addresses: add support for kernel config file · f9d2a42d
      Tobin C. Harding authored
      Features that rely on the ability to get kernel configuration options
      are ready to be implemented in script. In preparation for this we can
      add support for kernel config options as a separate patch to ease
      review.
      
      Add support for locating and parsing kernel configuration file.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      f9d2a42d
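      Parsing a kernel configuration file into name/value pairs can be
      sketched as follows.  This Python sketch assumes the usual
      `CONFIG_NAME=value` text format with `#` comment lines.  The function
      name is hypothetical, and locating the file (commonly something like
      /boot/config-<release>) is left out.

```python
def parse_kernel_config(text: str) -> dict:
    """Return a dict of CONFIG_* options from kernel config text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments such as "# CONFIG_FOO is not set"
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_"):
            options[name] = value.strip('"')  # drop quotes around string values
    return options
```

      Options that are absent from the returned dictionary are either unset
      or not present in the file, so callers need a default for each lookup.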
    • Tobin C. Harding's avatar
      leaking_addresses: add range check for vsyscall memory · 87e37588
      Tobin C. Harding authored
      Currently script checks only first and last address in the vsyscall
      memory range. We can do better than this. When checking for false
      positives against $match, we can convert $match to a hexadecimal value
      then check if it lies within the range of vsyscall addresses.
      
      Check whole range of vsyscall addresses when checking for false
      positive.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      87e37588
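      The range check described in the commit above can be sketched as
      follows.  This is a Python illustration, not the script's Perl.  The
      bounds are an assumption for this sketch (the x86_64 legacy vsyscall
      page taken as one 4 KiB page at 0xffffffffff600000), and
      `in_vsyscall_range` is a hypothetical name.

```python
# Assumed bounds: one 4 KiB page, with the end address exclusive.
VSYSCALL_START = 0xFFFFFFFFFF600000
VSYSCALL_END = 0xFFFFFFFFFF601000


def in_vsyscall_range(match: str) -> bool:
    """Convert a matched hex string to an integer and test it against the range."""
    value = int(match, 16)  # base 16 accepts the string with or without "0x"
    return VSYSCALL_START <= value < VSYSCALL_END
```

      Comparing integers covers every address inside the page, which
      matching the first and last address as strings cannot do.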
    • Tobin C. Harding's avatar
      leaking_addresses: indent dependant options · 15d60a35
      Tobin C. Harding authored
      A number of the command line options to script are dependent on the
      option --input-raw being set. If we indent these options it makes
      explicit this dependency.
      
      Indent options dependent on --input-raw.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      15d60a35
    • Tobin C. Harding's avatar
      leaking_addresses: remove command examples · 6145de83
      Tobin C. Harding authored
      Currently help output includes command examples. These were cute when we
      first started development of this script but are unnecessary.
      
      Remove command examples.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      6145de83
    • Tobin C. Harding's avatar
      leaking_addresses: remove mention of kptr_restrict · 20cdfb5f
      Tobin C. Harding authored
      leaking_addresses.pl can be run with kptr_restrict==0 now, so we don't
      need the comment about setting kptr_restrict any more.
      
      Remove comment suggesting setting kptr_restrict.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      20cdfb5f
    • Tobin C. Harding's avatar
      leaking_addresses: fix typo function not called · 6d23dd9b
      Tobin C. Harding authored
      Currently code uses a check against an undefined variable because the
      variable is a subroutine name and is not evaluated.
      
      Evaluate subroutine; add parentheses to subroutine name.
      Signed-off-by: Tobin C. Harding <me@tobin.cc>
      6d23dd9b
    • Linus Torvalds's avatar
      Merge tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 9eda2d2d
      Linus Torvalds authored
      Pull SELinux updates from Paul Moore:
       "A bigger than usual pull request for SELinux, 13 patches (lucky!)
        along with a scary looking diffstat.
      
        Although if you look a bit closer, excluding the usual minor
        tweaks/fixes, there are really only two significant changes in this
        pull request: the addition of proper SELinux access controls for SCTP
        and the encapsulation of a lot of internal SELinux state.
      
        The SCTP changes are the result of a multi-month effort (maybe even a
        year or longer?) between the SELinux folks and the SCTP folks to add
        proper SELinux controls. Special thanks go to Richard for seeing
        this through and keeping the effort moving forward.
      
        The state encapsulation work is a bit of janitorial work that came out
        of some early work on SELinux namespacing. The question of namespacing
        is still an open one, but I believe there is some real value in the
        encapsulation work so we've split that out and are now sending that up
        to you"
      
      * tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: wrap AVC state
        selinux: wrap selinuxfs state
        selinux: fix handling of uninitialized selinux state in get_bools/classes
        selinux: Update SELinux SCTP documentation
        selinux: Fix ltp test connect-syscall failure
        selinux: rename the {is,set}_enforcing() functions
        selinux: wrap global selinux state
        selinux: fix typo in selinux_netlbl_sctp_sk_clone declaration
        selinux: Add SCTP support
        sctp: Add LSM hooks
        sctp: Add ip option support
        security: Add support for SCTP security hooks
        netlabel: If PF_INET6, check sk_buff ip header version
      9eda2d2d
    • Linus Torvalds's avatar
      Merge tag 'audit-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · 6ad11bdd
      Linus Torvalds authored
      Pull audit updates from Paul Moore:
       "We didn't have anything to send for v4.16, but we're back with a
        little more than usual for v4.17.
      
        Eleven patches in total, most fall into the small fix category, but
        there are four non-trivial changes worth calling out:
      
         - the audit entry filter is being removed after deprecating it for
           quite a while (years of no one really using it because it turns out
           to be not very practical)
      
         - created our own version of "__mutex_owner()" because the locking
           folks were upset we were using theirs
      
         - improved our handling of kernel command line parameters to make
           them more forgiving
      
         - we fixed auditing of symlink operations
      
        Everything passes the audit-testsuite and as of a few minutes ago it
        merges well with your tree"
      
      * tag 'audit-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
        audit: add refused symlink to audit_names
        audit: remove path param from link denied function
        audit: link denied should not directly generate PATH record
        audit: make ANOM_LINK obey audit_enabled and audit_dummy_context
        audit: do not panic on invalid boot parameter
        audit: track the owner of the command mutex ourselves
        audit: return on memory error to avoid null pointer dereference
        audit: bail before bug check if audit disabled
        audit: deprecate the AUDIT_FILTER_ENTRY filter
        audit: session ID should not set arch quick field pointer
        audit: update bugtracker and source URIs
      6ad11bdd
    • Linus Torvalds's avatar
      Merge tag 'pstore-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 69824bcc
      Linus Torvalds authored
      Pull pstore updates from Kees Cook:
       "This cycle was almost entirely improvements to the pstore compression
        options, noted below:
      
         - Add lz4hc and 842 to pstore compression options (Geliang Tang)
      
         - Refactor to use crypto compression API (Geliang Tang)
      
         - Fix up Kconfig dependencies for compression (Arnd Bergmann)
      
         - Allow for run-time compression selection
      
         - Remove stack VLA usage"
      
      * tag 'pstore-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        pstore: fix crypto dependencies
        pstore: Use crypto compress API
        pstore/ram: Do not use stack VLA for parity workspace
        pstore: Select compression at runtime
        pstore: Avoid size casts for 842 compression
        pstore: Add lz4hc and 842 compression support
      69824bcc
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 3b54765c
      Linus Torvalds authored
      Merge updates from Andrew Morton:
      
       - a few misc things
      
       - ocfs2 updates
      
       - the v9fs maintainers have been missing for a long time. I've taken
         over v9fs patch slinging.
      
       - most of MM
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (116 commits)
        mm,oom_reaper: check for MMF_OOM_SKIP before complaining
        mm/ksm: fix interaction with THP
        mm/memblock.c: cast constant ULLONG_MAX to phys_addr_t
        headers: untangle kmemleak.h from mm.h
        include/linux/mmdebug.h: make VM_WARN* non-rvals
        mm/page_isolation.c: make start_isolate_page_range() fail if already isolated
        mm: change return type to vm_fault_t
        mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
        mm, page_alloc: wakeup kcompactd even if kswapd cannot free more memory
        kernel/fork.c: detect early free of a live mm
        mm: make counting of list_lru_one::nr_items lockless
        mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead() static
        block_invalidatepage(): only release page if the full page was invalidated
        mm: kernel-doc: add missing parameter descriptions
        mm/swap.c: remove @cold parameter description for release_pages()
        mm/nommu: remove description of alloc_vm_area
        zram: drop max_zpage_size and use zs_huge_class_size()
        zsmalloc: introduce zs_huge_class_size()
        mm: fix races between swapoff and flush dcache
        fs/direct-io.c: minor cleanups in do_blockdev_direct_IO
        ...
      3b54765c
    • Linus Torvalds's avatar
      Merge tag 'mtd/for-4.17' of git://git.infradead.org/linux-mtd · 3fd14cdc
      Linus Torvalds authored
      Pull MTD updates from Boris Brezillon:
       "MTD Core:
         - Remove support for asynchronous erase (not implemented by any of
           the existing drivers anyway)
         - Remove Cyrille from the list of SPI NOR and MTD maintainers
         - Fix kernel doc headers
         - Allow users to define the partitions parsers they want to test
           through a DT property (compatible of the partitions subnode)
         - Remove the bfin-async-flash driver (the only architecture using it
           has been removed)
         - Fix pagetest test
         - Add extra checks in mtd_erase()
         - Simplify the MTD partition creation logic and get rid of
           mtd_add_device_partitions()
      
        MTD Drivers:
         - Add endianness information to the physmap DT binding
         - Add Eon EN29LV400A IDs to JEDEC probe logic
         - Use %*ph where appropriate
      
        SPI NOR Drivers:
         - Make fsl-quadspi assign different names to MTD devices connected to
           the same QSPI controller
         - Remove an unneeded driver.bus assignment in the fsl-quadspi driver
      
        NAND Core:
         - Prepare arrival of the SPI NAND subsystem by implementing a generic
           (interface-agnostic) layer to ease manipulation of NAND devices
         - Move onenand code base to the drivers/mtd/nand/ dir
         - Rework timing mode selection
         - Provide a generic way for NAND chip drivers to flag a specific
           GET/SET FEATURE operation as supported/unsupported
         - Stop embedding ONFI/JEDEC param page in nand_chip
      
        NAND Drivers:
         - Rework/cleanup of the mxc driver
         - Various cleanups in the vf610 driver
         - Migrate the fsmc and vf610 to ->exec_op()
         - Get rid of the pxa driver (replaced by marvell_nand)
         - Support ->setup_data_interface() in the GPMI driver
         - Fix probe error path in several drivers
         - Remove support for unused hw_syndrome mode in sunxi_nand
         - Various minor improvements"
      
      * tag 'mtd/for-4.17' of git://git.infradead.org/linux-mtd: (89 commits)
        dt-bindings: fsl-quadspi: Add the example of two SPI NOR
        mtd: fsl-quadspi: Distinguish the mtd device names
        mtd: nand: Fix some function description mismatches in core.c
        mtd: fsl-quadspi: Remove unneeded driver.bus assignment
        mtd: rawnand: marvell: Rename ->ecc_clk into ->core_clk
        mtd: rawnand: s3c2410: enhance the probe function error path
        mtd: rawnand: tango: fix probe function error path
        mtd: rawnand: sh_flctl: fix the probe function error path
        mtd: rawnand: omap2: fix the probe function error path
        mtd: rawnand: mxc: fix probe function error path
        mtd: rawnand: denali: fix probe function error path
        mtd: rawnand: davinci: fix probe function error path
        mtd: rawnand: cafe: fix probe function error path
        mtd: rawnand: brcmnand: fix probe function error path
        mtd: rawnand: sunxi: Stop supporting ECC_HW_SYNDROME mode
        mtd: rawnand: marvell: Fix clock resource by adding a register clock
        mtd: ftl: Use DIV_ROUND_UP()
        mtd: Fix some function description mismatches in mtdcore.c
        mtd: physmap_of: update struct map_info's swap as per map requirement
        dt-bindings: mtd-physmap: Add endianness supports
        ...
      3fd14cdc
    • Linus Torvalds's avatar
      Merge tag 'for-4.17/dm-changes' of... · 83c7c18b
      Linus Torvalds authored
      Merge tag 'for-4.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper updates from Mike Snitzer:
      
       - DM core passthrough ioctl fix to retain reference to DM table, and
         that table's block devices, while issuing the ioctl to one of those
         block devices.
      
       - DM core passthrough ioctl fix to _not_ override the fmode_t used to
         issue the ioctl. Overriding by using the fmode_t that the block
         device was originally opened with during DM table load is a liability.
      
       - Add DM core support for secure erase forwarding and update the DM
         linear and DM striped targets to support it.
      
       - A DM core 4.16 stable fix to allow abnormal IO (e.g. discard, write
         same, write zeroes) for targets that make use of the non-splitting IO
         variant (as is done for multipath or thinp when layered directly on
         NVMe).
      
       - Allow DM targets to return a payload in response to a DM message that
         they are sent. This is useful for DM targets that would like to
         provide statistics data in response to DM messages.
      
       - Update DM bufio to support non-power-of-2 block sizes. Numerous other
         related changes prepare the DM bufio code for this support.
      
       - Fix DM crypt to use a bounded amount of memory across the entire
         system. This is to avoid OOM that can otherwise occur in response to
         certain pathological IO workloads (e.g. discarding a large DM crypt
         device).
      
       - Add a 'check_at_most_once' feature to the DM verity target to allow
         verity to be used on mobile devices that have very limited resources.
      
       - Fix the DM integrity target to fail early if a keyed algorithm (e.g.
         HMAC) is to be used but the key isn't set.
      
       - Add non-power-of-2 support to the DM unstripe target.
      
       - Eliminate the use of a Variable Length Array in the DM stripe target.
      
       - Update the DM log-writes target to record metadata (REQ_META flag).
      
       - DM raid fixes for its nosync status and some variable range issues.
      
      * tag 'for-4.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (28 commits)
        dm: remove fmode_t argument from .prepare_ioctl hook
        dm: hold DM table for duration of ioctl rather than use blkdev_get
        dm raid: fix parse_raid_params() variable range issue
        dm verity: make verity_for_io_block static
        dm verity: add 'check_at_most_once' option to only validate hashes once
        dm bufio: don't embed a bio in the dm_buffer structure
        dm bufio: support non-power-of-two block sizes
        dm bufio: use slab cache for dm_buffer structure allocations
        dm bufio: reorder fields in dm_buffer structure
        dm bufio: relax alignment constraint on slab cache
        dm bufio: remove code that merges slab caches
        dm bufio: get rid of slab cache name allocations
        dm bufio: move dm-bufio.h to include/linux/
        dm bufio: delete outdated comment
        dm: add support for secure erase forwarding
        dm: backfill abnormal IO support to non-splitting IO submission
        dm raid: fix nosync status
        dm mpath: use DM_MAPIO_SUBMITTED instead of magic number 0 in process_queued_bios()
        dm stripe: get rid of a Variable Length Array (VLA)
        dm log writes: record metadata flag for better flags record
        ...
      83c7c18b
    • Linus Torvalds's avatar
      Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 9022ca6b
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "Assorted stuff, including Christoph's I_DIRTY patches"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: move I_DIRTY_INODE to fs.h
        ubifs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
        ntfs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
        gfs2: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) calls
        fs: fold open_check_o_direct into do_dentry_open
        vfs: Replace stray non-ASCII homoglyph characters with their ASCII equivalents
        vfs: make sure struct filename->iname is word-aligned
        get rid of pointless includes of fs_struct.h
        [poll] annotate SAA6588_CMD_POLL users
      9022ca6b
    • Bjorn Helgaas's avatar
      Merge remote-tracking branch 'lorenzo/pci/cadence' into next · 5f764419
      Bjorn Helgaas authored
      * lorenzo/pci/cadence:
        MAINTAINERS: Add missing /drivers/pci/cadence directory entry
      5f764419
    • David Howells's avatar
      fscache: Maintain a catalogue of allocated cookies · ec0328e4
      David Howells authored
      Maintain a catalogue of allocated cookies so that cookie collisions can be
      handled properly.  For the moment, this just involves printing a warning
      and returning a NULL cookie to the caller of fscache_acquire_cookie(), but
      in future it might make sense to wait for the old cookie to finish being
      cleaned up.
      
      This requires the cookie key to be stored attached to the cookie so that we
      still have the key available if the netfs relinquishes the cookie.  This is
      done by an earlier patch.
      
      The catalogue also renders redundant fscache_netfs_list (used for checking
      for duplicates), so that can be removed.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Anna Schumaker <anna.schumaker@netapp.com>
      Tested-by: Steve Dickson <steved@redhat.com>
      ec0328e4
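      The catalogue idea in the commit above can be sketched as a table
      keyed by cookie key.  This is a toy Python model, not the kernel's
      data structure.  `CookieCatalogue`, `acquire` and `relinquish` are
      hypothetical names; the behaviour on collision (warn and return no
      cookie) follows the commit message.

```python
import logging


class CookieCatalogue:
    """Toy model of a catalogue of allocated cookies, keyed by cookie key."""

    def __init__(self):
        self._by_key = {}

    def acquire(self, key: bytes):
        """Return a new cookie, or None (with a warning) if the key collides."""
        if key in self._by_key:
            logging.warning("duplicate cookie detected for key %r", key)
            return None
        cookie = {"key": key}  # the key stays attached to the cookie
        self._by_key[key] = cookie
        return cookie

    def relinquish(self, cookie) -> None:
        """Drop a cookie from the catalogue using the key stored on it."""
        self._by_key.pop(cookie["key"], None)
```

      Storing the key on the cookie is what lets `relinquish` find the
      catalogue entry without asking the caller for the key again.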
    • David Howells's avatar
      fscache: Pass object size in rather than calling back for it · ee1235a9
      David Howells authored
      Pass the object size in to fscache_acquire_cookie() and
      fscache_write_page() rather than the netfs providing a callback by which it
      can be received.  This makes it easier to update the size of the object
      when a new page is written that extends the object.
      
      The current object size is also passed by fscache to the check_aux
      function, obviating the need to store it in the aux data.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Anna Schumaker <anna.schumaker@netapp.com>
      Tested-by: Steve Dickson <steved@redhat.com>
      ee1235a9
    • Tetsuo Handa's avatar
      mm,oom_reaper: check for MMF_OOM_SKIP before complaining · 97b1255c
      Tetsuo Handa authored
      I got "oom_reaper: unable to reap pid:" messages when the victim thread
      was blocked inside free_pgtables() (which occurred after returning from
      unmap_vmas() and setting MMF_OOM_SKIP).  We don't need to complain when
      exit_mmap() already set MMF_OOM_SKIP.
      
        Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
        oom_reaper: unable to reap pid:7558 (a.out)
        a.out           D13272  7558   6931 0x00100084
        Call Trace:
         schedule+0x2d/0x80
         rwsem_down_write_failed+0x2bb/0x440
         call_rwsem_down_write_failed+0x13/0x20
         down_write+0x49/0x60
         unlink_file_vma+0x28/0x50
         free_pgtables+0x36/0x100
         exit_mmap+0xbb/0x180
         mmput+0x50/0x110
         copy_process.part.41+0xb61/0x1fe0
         _do_fork+0xe6/0x560
         do_syscall_64+0x74/0x230
         entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Link: http://lkml.kernel.org/r/201803221946.DHG65638.VFJHFtOSQLOMOF@I-love.SAKURA.ne.jp
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      97b1255c
    • Claudio Imbrenda's avatar
      mm/ksm: fix interaction with THP · 77da2ba0
      Claudio Imbrenda authored
      This patch fixes a corner case for KSM.  When two pages belong or
      belonged to the same transparent hugepage, and they should be merged,
      KSM fails to split the page, and therefore no merging happens.
      
      This bug can be reproduced by:
      * making sure ksm is running (disabling ksmtuned if necessary)
      * enabling transparent hugepages
      * allocating a THP-aligned 1-THP-sized buffer
        e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21)
      * filling it with the same values
        e.g. memset(p, 42, 1<<21)
      * performing madvise to make it mergeable
        e.g. madvise(p, 1<<21, MADV_MERGEABLE)
      * waiting for KSM to perform a few scans
      
      The expected outcome is that all the pages get merged (1 shared and
      the rest sharing); the actual outcome is that no pages get merged (1
      unshared and the rest volatile).
      
      The reason for this behaviour is that we increase the reference count
      once for both pages we want to merge, but if they belong to the same
      hugepage (or compound page), the reference counter used in both cases is
      the one of the head of the compound page.  This means that
      split_huge_page will find a value of the reference counter too high and
      will fail.
      
      This patch solves this problem by testing if the two pages to merge
      belong to the same hugepage when attempting to merge them.  If so, the
      hugepage is split safely.  This means that the hugepage is not split if
      not necessary.
      
      Link: http://lkml.kernel.org/r/1521548069-24758-1-git-send-email-imbrenda@linux.vnet.ibm.com
      Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
      Co-authored-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      77da2ba0