1. 14 Aug, 2014 27 commits
    • Linus Torvalds's avatar
      Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 3b7b3e6e
      Linus Torvalds authored
      Pull kbuild updates from Michal Marek:
       - make clean also considers $(extra-m) and $(extra-) to be consistent
       - cleanup and fixes in scripts/Makefile.host
       - allow overriding the name of the Python 2 executable with make
         PYTHON=... (only needed for ia64 in practice)
       - option to split debuginfo into *.dwo files to save disk space if the
         compiler supports it (CONFIG_DEBUG_INFO_SPLIT)
       - option to use dwarf4 debuginfo if the compiler supports it
         (CONFIG_DEBUG_INFO_DWARF4)
       - fix for disabling certain warnings with clang
       - fix for unneeded rebuild with dash when a command contains
         backslashes
      
      * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        kbuild: Fix handling of backslashes in *.cmd files
        kbuild, LLVMLinux: Supress warnings unless W=1-3
        Kbuild: Add a option to enable dwarf4 v2
        kbuild: Support split debug info v4
        kbuild: allow to override Python command name
        kbuild: clean-up and bug fix of scripts/Makefile.host
        kbuild: clean up scripts/Makefile.host
        kbuild: drop shared library support from Makefile.host
        kbuild: fix a bug of C++ host program handling
        kbuild: fix a typo in scripts/Makefile.host
        scripts/Makefile.clean: clean also $(extra-m) and $(extra-)
      3b7b3e6e
    • Linus Torvalds's avatar
      Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband · e3b1fd56
      Linus Torvalds authored
      Pull infiniband/rdma updates from Roland Dreier:
       "Main set of InfiniBand/RDMA updates for 3.17 merge window:
      
         - MR reregistration support
         - MAD support for RMPP in userspace
         - iSER and SRP initiator updates
         - ocrdma hardware driver updates
         - other fixes..."
      
      * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (52 commits)
        IB/srp: Fix return value check in srp_init_module()
        RDMA/ocrdma: report asic-id in query device
        RDMA/ocrdma: Update sli data structure for endianness
        RDMA/ocrdma: Obtain SL from device structure
        RDMA/uapi: Include socket.h in rdma_user_cm.h
        IB/srpt: Handle GID change events
        IB/mlx5: Use ARRAY_SIZE instead of sizeof/sizeof[0]
        IB/mlx4: Use ARRAY_SIZE instead of sizeof/sizeof[0]
        RDMA/amso1100: Check for integer overflow in c2_alloc_cq_buf()
        IPoIB: Remove unnecessary test for NULL before debugfs_remove()
        IB/mad: Add user space RMPP support
        IB/mad: add new ioctl to ABI to support new registration options
        IB/mad: Add dev_notice messages for various umad/mad registration failures
        IB/mad: Update module to [pr|dev]_* style print messages
        IB/ipoib: Avoid multicast join attempts with invalid P_key
        IB/umad: Update module to [pr|dev]_* style print messages
        IB/ipoib: Avoid flushing the workqueue from worker context
        IB/ipoib: Use P_Key change event instead of P_Key polling mechanism
        IB/ipath: Add P_Key change event support
        mlx4_core: Add support for secure-host and SMP firewall
        ...
      e3b1fd56
    • John Stultz's avatar
      timekeeping: Another fix to the VSYSCALL_OLD update_vsyscall · 0680eb1f
      John Stultz authored
      Benjamin Herrenschmidt pointed out that I further missed modifying
      update_vsyscall after the wall_to_mono value was changed to a
      timespec64.  This causes issues on powerpc32, which expects a 32bit
      timespec.
      
      This patch fixes the problem by properly converting from a timespec64 to
      a timespec before passing the value on to the arch-specific vsyscall
      logic.
      
      [ Thomas is currently on vacation, but reviewed it and wanted me to send
        this fix on to you directly. ]
      
      Cc: LKML <linux-kernel@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0680eb1f
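The conversion described in the commit above can be sketched in user-space C. This is only an illustration under stated assumptions: `ts64_sketch`, `ts32_sketch` and `ts64_to_ts32_sketch()` are hypothetical stand-ins, not the kernel's types or helpers, and the 32-bit layout is assumed to mirror what a 32-bit architecture expects.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a 64-bit and a 32-bit timespec layout. */
struct ts64_sketch { int64_t tv_sec; long tv_nsec; };
struct ts32_sketch { int32_t tv_sec; int32_t tv_nsec; };

/*
 * Narrow a 64-bit timespec to the 32-bit layout before handing it to
 * code that only understands the smaller structure. Seconds values
 * outside the int32_t range do not survive the narrowing cast.
 */
static struct ts32_sketch ts64_to_ts32_sketch(struct ts64_sketch in)
{
    struct ts32_sketch out;

    out.tv_sec = (int32_t)in.tv_sec;
    out.tv_nsec = (int32_t)in.tv_nsec;  /* always below 1e9, so it fits */
    return out;
}
```

Converting at the call site keeps the arch-specific consumer unchanged, which appears to be the approach the commit message describes.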
    • Linus Torvalds's avatar
      Merge branch 'akpm' (fixes from Andrew Morton) · f2937e45
      Linus Torvalds authored
      Merge leftovers from Andrew Morton:
       "A few leftovers.
      
        I have a bunch of OCFS2 patches which are still out for review and
        which I might sneak along after -rc1.  Partly my fault - I should send
        my review pokes out earlier"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm: fix CROSS_MEMORY_ATTACH help text grammar
        drivers/mfd/rtsx_usb.c: export device table
        mm, hugetlb_cgroup: align hugetlb cgroup limit to hugepage size
      f2937e45
    • Jeff Mahoney's avatar
      drivers/mfd/rtsx_usb.c: export device table · 18139089
      Jeff Mahoney authored
      The rtsx_usb driver contains the table for the devices it supports but
      doesn't export it.  As a result, no alias is generated and it doesn't
      get loaded automatically.
      
      Via https://bugzilla.novell.com/show_bug.cgi?id=890096

      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reported-by: Marcel Witte <wittemar@googlemail.com>
      Cc: Roger Tseng <rogerable@realtek.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      18139089
    • David Rientjes's avatar
      mm, hugetlb_cgroup: align hugetlb cgroup limit to hugepage size · 24d7cd20
      David Rientjes authored
      Memcg aligns memory.limit_in_bytes to PAGE_SIZE as part of the resource
      counter since it makes no sense to allow a partial page to be charged.
      
      As a result of the hugetlb cgroup using the resource counter, it is also
      aligned to PAGE_SIZE but makes no sense unless aligned to the size of
      the hugepage being limited.
      
      Align hugetlb cgroup limit to hugepage size.

      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      24d7cd20
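The alignment in the commit above can be illustrated with a small stand-alone C sketch. The helper names are hypothetical, and rounding up (rather than down) is an assumption here, since the commit message only says the limit is aligned to the hugepage size.

```c
#include <stdint.h>

/* Round val up to the next multiple of align; align must be a power of two. */
static uint64_t align_up_sketch(uint64_t val, uint64_t align)
{
    return (val + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: align a byte limit to a hugepage of
 * (1 << shift) bytes, e.g. shift = 21 for 2 MiB pages.
 * Overflow for limits close to UINT64_MAX is not handled here.
 */
static uint64_t hugetlb_limit_sketch(uint64_t limit, unsigned int shift)
{
    return align_up_sketch(limit, (uint64_t)1 << shift);
}
```

With 2 MiB hugepages, a 3 MiB limit would become 4 MiB under this sketch, so the limit never describes a partial hugepage.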
    • Linus Torvalds's avatar
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 1d508f8a
      Linus Torvalds authored
      Pull more powerpc updates from Ben Herrenschmidt:
       "Here are some more powerpc bits for 3.17, essentially fixes.
      
        The biggest series, also aimed at -stable, is from Aneesh and is the
        result of weeks and weeks of debugging to find out why the heck our THP
        implementation was occasionally triggering multi-hit errors in our
        level 1 TLB.  It ended up being a combination of issues including
        subtleties as to how we should invalidate those special 'MPSS' pages
        we use to allow the use of 16M pages inside 4K/64K "base page size"
        segments (you really have to love our MMU !)
      
        Another interesting one in the "OMG" category is the series from
        Michael adding memory barriers to spin_is_locked().  That's also the
        result of many days of debugging to figure out why the semaphore code
        would occasionally crash in ways that made no sense.  It ended up
        being some creative lock stacking that was defeated by the fact that
        our locks allow a load inside the locked section to be re-ordered with
        the load of the lock value itself (I'm still of two minds about whether
        to kill that once and for all by putting a heavier barrier back into
        our lock implementation...).  The fixes come with a long explanation
        in the cset comments, feel free to read it if you feel like having a
        headache today"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
        powerpc/thp: Add tracepoints to track hugepage invalidate
        powerpc/mm: Use read barrier when creating real_pte
        powerpc/thp: Use ACCESS_ONCE when loading pmdp
        powerpc/thp: Invalidate with vpn in loop
        powerpc/thp: Handle combo pages in invalidate
        powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte
        powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
        powerpc/thp: Add write barrier after updating the valid bit
        powerpc: reorder per-cpu NUMA information's initialization
        powerpc/perf/hv-24x7: Use kmem_cache_free
        powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info
        powerpc: Hard disable interrupts in xmon
        powerpc: remove duplicate definition of TEXASR_FS
        powerpc/pseries: Avoid deadlock on removing ddw
        powerpc/pseries: Failure on removing device node
        powerpc/boot: Use correct zlib types for comparison
        powerpc/powernv: Interface to register/unregister opal dump region
        printk: Add function to return log buffer address and size
        powerpc: Add POWER8 features to CPU_FTRS_POSSIBLE/ALWAYS
        powerpc/ppc476: Disable BTAC
        ...
      1d508f8a
    • Linus Torvalds's avatar
      Merge tag 'hwspinlock-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock · 2d0c05e1
      Linus Torvalds authored
      Pull hwspinlock updates from Ohad Ben-Cohen:
       "Two small hwspinlock changes for better OMAP support, coming from
        Suman Anna"
      
      * tag 'hwspinlock-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock:
        hwspinlock: enable OMAP build for AM33xx, AM43xx & DRA7xx
        hwspinlock/omap: enable module before reading SYSSTATUS register
      2d0c05e1
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 311bf6d1
      Linus Torvalds authored
      Pull seccomp fix from James Morris.
      
      BUG(!spin_is_locked()) really doesn't work very well in UP
      configurations without any actual spinlock state.  Which is very much
      why we have that "assert_spin_lock()" function for this.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock
      311bf6d1
    • Roland Dreier's avatar
      Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'iwcm', 'mad', 'misc',... · d087f6ad
      Roland Dreier authored
      Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'iwcm', 'mad', 'misc', 'mlx4', 'mlx5', 'ocrdma' and 'srp' into for-next
      d087f6ad
    • Wei Yongjun's avatar
      IB/srp: Fix return value check in srp_init_module() · da05be29
      Wei Yongjun authored
      In case of error, the function create_workqueue() returns a NULL
      pointer, not ERR_PTR().  The IS_ERR() test in the return value check
      should be replaced with a NULL test.

      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Acked-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      da05be29
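Why an IS_ERR()-style check misses a NULL return can be shown with a user-space C sketch. Everything here is a hypothetical stand-in: `is_err_sketch()` imitates the convention of encoding error codes in the top 4095 pointer values, and `alloc_or_null_sketch()` plays the role of an allocator that reports failure with NULL.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* True only for pointers in the top MAX_ERRNO_SKETCH values of the address space. */
static int is_err_sketch(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical allocator that signals failure with NULL, never an error pointer. */
static void *alloc_or_null_sketch(int fail)
{
    static int object;

    return fail ? NULL : &object;
}

/* Buggy pattern: NULL is not an error pointer, so the failure goes unnoticed. */
static int init_buggy_sketch(int fail)
{
    void *wq = alloc_or_null_sketch(fail);

    if (is_err_sketch(wq))
        return -1;
    return 0;
}

/* Fixed pattern: test for NULL, matching what the allocator actually returns. */
static int init_fixed_sketch(int fail)
{
    void *wq = alloc_or_null_sketch(fail);

    if (!wq)
        return -1;
    return 0;
}
```

In this sketch the buggy variant reports success even when allocation fails, which is the class of error the commit above corrects.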
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 82f05a08
      Linus Torvalds authored
      Pull hwmon fixes from Guenter Roeck:
       "Several bug fixes in various drivers, plus a minor cleanup in the
        tmp103 driver"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (tmp103) Remove duplicate test for I2C_FUNC_SMBUS_BYTE_DATA functionality
        hwmon: (w83793) Fix vrm write operation
        hwmon: (w83791d) Fix vrm write operation
        hwmon: (w83627hf) Fix vrm write operation
        hwmon: (vt1211) Fix vrm write operation
        hwmon: (pc87360) Fix vrm write operation
        hwmon: (lm87) Fix vrm write operation
        hwmon: (asb100) Fix vrm write operation
        hwmon: (adm1026) Fix vrm write operation
        hwmon: (adm1025) Fix vrm write operation
        hwmon: (hih6130) Fix missing hih6130->write_length setting
        hwmon: (dme1737) Prevent overflow problem when writing large limits
        hwmon: (emc6w201) Fix temperature limit range
        hwmon: (ads1015) Fix out-of-bounds array access
        hwmon: (lm92) Prevent overflow problem when writing large limits
      82f05a08
    • Linus Torvalds's avatar
      Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux · ae36e95c
      Linus Torvalds authored
      Pull device tree updates from Grant Likely:
       "The branch contains the following device tree changes for the v3.17
        merge window:
      
        Group changes to the device tree.  In preparation for adding device
        tree overlay support, OF_DYNAMIC is reworked so that a set of device
        tree changes can be prepared and applied to the tree all at once.
        OF_RECONFIG notifiers see the most significant change here so that
        users always get a consistent view of the tree.  Notifier generation
        is moved from before a change to after it, and notifiers for a group
        of changes are emitted after the entire block of changes has been
        applied.
      
        Automatic console selection from DT.  Console drivers can now use
        of_console_check() to see if the device node is specified as a console
        device.  If so then it gets added as a preferred console.  UART
        devices get this support automatically when uart_add_one_port() is
        called.
      
        DT unit tests no longer depend on pre-loaded data in the device tree.
        Data is loaded dynamically at the start of unit tests, and then
        unloaded again when the tests have completed.
      
        Also contains a few bugfixes for reserved regions and early memory
        setup"
      
      * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux: (21 commits)
        of: Fixing OF Selftest build error
        drivers: of: add automated assignment of reserved regions to client devices
        of: Use proper types for checking memory overflow
        of: typo fix in __of_prop_dup()
        Adding selftest testdata dynamically into live tree
        of: Add todo tasklist for Devicetree
        of: Transactional DT support.
        of: Reorder device tree changes and notifiers
        of: Move dynamic node fixups out of powerpc and into common code
        of: Make sure attached nodes don't carry along extra children
        of: Make devicetree sysfs update functions consistent.
        of: Create unlocked versions of node and property add/remove functions
        OF: Utility helper functions for dynamic nodes
        of: Move CONFIG_OF_DYNAMIC code into a separate file
        of: rename of_aliases_mutex to just of_mutex
        of/platform: Fix of_platform_device_destroy iteration of devices
        of: Migrate of_find_node_by_name() users to for_each_node_by_name()
        tty: Update hypervisor tty drivers to use core stdout parsing code.
        arm/versatile: Add the uart as the stdout device.
        of: Enable console on serial ports specified by /chosen/stdout-path
        ...
      ae36e95c
    • Linus Torvalds's avatar
      Merge tag 'vfio-v3.17-rc1' of git://github.com/awilliam/linux-vfio · cc8a44c6
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
       - enable support for bus reset on device release
       - fixes for EEH support
      
      * tag 'vfio-v3.17-rc1' of git://github.com/awilliam/linux-vfio:
        drivers/vfio: Enable VFIO if EEH is not supported
        drivers/vfio: Allow EEH to be built as module
        drivers/vfio: Fix EEH build error
        vfio-pci: Attempt bus/slot reset on release
        vfio-pci: Use mutex around open, release, and remove
        vfio-pci: Release devices with BusMaster disabled
      cc8a44c6
    • Linus Torvalds's avatar
      Merge tag 'stable/for-linus-3.17-b-rc0-tag' of... · 4dacb91c
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.17-b-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
      
      Pull Xen bugfixes from David Vrabel:
       - fix ARM build
       - fix boot crash with PVH guests
       - improve reliability of resume/migration
      
      * tag 'stable/for-linus-3.17-b-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        x86/xen: use vmap() to map grant table pages in PVH guests
        x86/xen: resume timer irqs early
        arm/xen: remove duplicate arch_gnttab_init() function
      4dacb91c
    • Linus Torvalds's avatar
      Merge tag 'dm-3.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · ba368991
      Linus Torvalds authored
      Pull device mapper changes from Mike Snitzer:
      
       - Allow the thin target to be paired with any size external origin; also
         allow thin snapshots to be larger than the external origin.
      
       - Add support for quickly loading a repetitive pattern into the
         dm-switch target.
      
       - Use per-bio data in the dm-crypt target instead of always using a
         mempool for each allocation.  Required switching to kmalloc alignment
         for the bio slab.
      
       - Fix DM core to properly stack the QUEUE_FLAG_NO_SG_MERGE flag
      
       - Fix the dm-cache and dm-thin targets' export of the minimum_io_size
         to match the data block size -- this fixes an issue where mkfs.xfs
         would improperly infer raid striping was in place on the underlying
         storage.
      
       - Small cleanups in dm-io, dm-mpath and dm-cache
      
      * tag 'dm-3.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm table: propagate QUEUE_FLAG_NO_SG_MERGE
        dm switch: efficiently support repetitive patterns
        dm switch: factor out switch_region_table_read
        dm cache: set minimum_io_size to cache's data block size
        dm thin: set minimum_io_size to pool's data block size
        dm crypt: use per-bio data
        block: use kmalloc alignment for bio slab
        dm table: make dm_table_supports_discards static
        dm cache metadata: use dm-space-map-metadata.h defined size limits
        dm cache: fail migrations in the do_worker error path
        dm cache: simplify deferred set reference count increments
        dm thin: relax external origin size constraints
        dm thin: switch to an atomic_t for tracking pending new block preparations
        dm mpath: eliminate pg_ready() wrapper
        dm io: simplify dec_count and sync_io
      ba368991
    • Linus Torvalds's avatar
      Merge tag 'mmc-v3.17-1' of git://git.linaro.org/people/ulf.hansson/mmc · a8e4def6
      Linus Torvalds authored
      Pull MMC updates from Ulf Hansson:
       "Me and Chris Ball decided to try out using my MMC tree as the primary
        one, to simplify handling of patches.
      
        This pull thus contains all the MMC patches for 3.17 rc1, no pull
        from Chris this time.
      
        Details:
      
        MMC core:
         - forward compatibility for eMMC
         - fix some blacklisted cards with broken secure discard
      
        MMC host:
         - mmci: Add support for Qualcomm variant
         - mmci: Fix regression for arm_variant
         - sdhci: Various fixes and cleanups
         - sdhci: Improve external VDD regulator support
         - sdhci: Support for DDR50 1.8V mode for BayTrail
         - sdhci-st: Add driver for ST SDHCI controller
         - sh-mmcif: DMA fixes
         - omap_hsmmc: Add support for SDIO interrupts
         - sdhci-pci: Add support for Intel Quark X1000
         - dw_mmc: Update the reset sequence
         - s3cmci: port DMA code to dmaengine API"
      
      * tag 'mmc-v3.17-1' of git://git.linaro.org/people/ulf.hansson/mmc: (67 commits)
        mmc: dw_mmc: modify the dt-binding for removing slot-node and supports-highspeed
        mmc: dw_mmc: Slot quirk "disable-wp" is deprecated.
        mmc: mmci: Reverse IRQ handling for the arm_variant
        mmc: mmci: Move all CMD irq handling to mmci_cmd_irq()
        mmc: mmci: Remove redundant check of status for DATA irq
        mmc: dw_mmc: change to use recommended reset procedure
        mmc: sdhci-pxav3: Use devm_* managed helpers
        mmc: tmio: Configure DMA slave bus width
        mmc: sh_mmcif: Configure DMA slave bus width
        mmc: sh_mmcif: Fix DMA slave address configuration
        mmc: sh_mmcif: Document DT bindings
        mmc: sdhci-pci: remove PCI PM functions in suspend/resume callback
        mmc: Do not advertise secure discard if it is blacklisted
        mmc: sdhci-msm: Get COMPILE_TEST support
        mmc: sdhci-msm: Remove unnecessary header file inclusion
        mmc: sdhci-msm: Fix the binding example
        mmc: sdhci: add DDR50 1.8V mode support for BayTrail eMMC Controller
        mmc: sdhci: Preset value not supported in Baytrail eMMC
        mmc: MMC_USDHI6ROL0 should depend on HAS_DMA
        mmc: MMC_SH_MMCIF should depend on HAS_DMA
        ...
      a8e4def6
    • Linus Torvalds's avatar
      Merge branch 'for-3.17/drivers' of git://git.kernel.dk/linux-block · d429a363
      Linus Torvalds authored
      Pull block driver changes from Jens Axboe:
       "Nothing out of the ordinary here, this pull request contains:
      
         - A big round of fixes for bcache from Kent Overstreet, Slava Pestov,
           and Surbhi Palande.  No new features, just a lot of fixes.
      
         - The usual round of drbd updates from Andreas Gruenbacher, Lars
           Ellenberg, and Philipp Reisner.
      
         - virtio_blk was converted to blk-mq back in 3.13, but now Ming Lei
           has taken it one step further and added support for actually using
           more than one queue.
      
         - Addition of an explicit SG_FLAG_Q_AT_HEAD for block/bsg, to
           complement the default behavior of adding to the tail of the
           queue.  From Douglas Gilbert"
      
      * 'for-3.17/drivers' of git://git.kernel.dk/linux-block: (86 commits)
        bcache: Drop unneeded blk_sync_queue() calls
        bcache: add mutex lock for bch_is_open
        bcache: Correct printing of btree_gc_max_duration_ms
        bcache: try to set b->parent properly
        bcache: fix memory corruption in init error path
        bcache: fix crash with incomplete cache set
        bcache: Fix more early shutdown bugs
        bcache: fix use-after-free in btree_gc_coalesce()
        bcache: Fix an infinite loop in journal replay
        bcache: fix crash in bcache_btree_node_alloc_fail tracepoint
        bcache: bcache_write tracepoint was crashing
        bcache: fix typo in bch_bkey_equal_header
        bcache: Allocate bounce buffers with GFP_NOWAIT
        bcache: Make sure to pass GFP_WAIT to mempool_alloc()
        bcache: fix uninterruptible sleep in writeback thread
        bcache: wait for buckets when allocating new btree root
        bcache: fix crash on shutdown in passthrough mode
        bcache: fix lockdep warnings on shutdown
        bcache allocator: send discards with correct size
        bcache: Fix to remove the rcu_sched stalls.
        ...
      d429a363
    • Linus Torvalds's avatar
      Merge branch 'for-3.17/core' of git://git.kernel.dk/linux-block · 4a319a49
      Linus Torvalds authored
      Pull block core bits from Jens Axboe:
       "Small round this time, after the massive blk-mq dump for 3.16.  This
        pull request contains:
      
         - Fixes for max_sectors overflow in ioctls from Akinobu Mita.
      
         - Partition off-by-one bug fix in aix partitions from Dan Carpenter.
      
         - Various small partition cleanups from Fabian Frederick.
      
         - Fix for the block integrity code sometimes returning the wrong
           vector count from Gu Zheng.
      
         - Cleanup and re-org of the blk-mq queue enter/exit percpu counters
           from Tejun.  Dependent on the percpu pull for 3.17 (which was in
           the block tree too), that you have already pulled in.
      
         - A blkcg oops fix, also from Tejun"
      
      * 'for-3.17/core' of git://git.kernel.dk/linux-block:
        partitions: aix.c: off by one bug
        blkcg: don't call into policy draining if root_blkg is already gone
        Revert "bio: modify __bio_add_page() to accept pages that don't start a new segment"
        bio: modify __bio_add_page() to accept pages that don't start a new segment
        block: fix SG_[GS]ET_RESERVED_SIZE ioctl when max_sectors is huge
        block: fix BLKSECTGET ioctl when max_sectors is greater than USHRT_MAX
        block/partitions/efi.c: kerneldoc fixing
        block/partitions/msdos.c: code clean-up
        block/partitions/amiga.c: replace nolevel printk by pr_err
        block/partitions/aix.c: replace count*size kzalloc by kcalloc
        bio-integrity: add "bip_max_vcnt" into struct bio_integrity_payload
        blk-mq: use percpu_ref for mq usage count
        blk-mq: collapse __blk_mq_drain_queue() into blk_mq_freeze_queue()
        blk-mq: decouble blk-mq freezing from generic bypassing
        block, blk-mq: draining can't be skipped even if bypass_depth was non-zero
        blk-mq: fix a memory ordering bug in blk_mq_queue_enter()
      4a319a49
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f0094b28
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Several networking final fixes and tidies for the merge window:
      
         1) Changes during the merge window unintentionally took away the
            ability to build bluetooth modular, fix from Geert Uytterhoeven.
      
         2) Several phy_node reference count bug fixes from Uwe Kleine-König.
      
         3) Fix ucc_geth build failures, also from Uwe Kleine-König.
      
         4) Fix klog false positives when netlink messages go to network
            taps, by properly resetting the network header.  Fix from Daniel
            Borkmann.
      
         5) Sizing estimate of VF netlink messages is too small, from Jiri
            Benc.
      
         6) New APM X-Gene SoC ethernet driver, from Iyappan Subramanian.
      
         7) VLAN untagging is erroneously dependent upon whether the VLAN
            module is loaded or not, but there are generic dependencies that
            matter wrt what can be expected as the SKB enters the stack.
            Make the basic untagging generic code, and do it unconditionally.
            From Vlad Yasevich.
      
         8) xen-netfront only has so many slots in its transmit queue so
            linearize packets that have too many frags.  From Zoltan Kiss.
      
         9) Fix suspend/resume PHY handling in bcmgenet driver, from Florian
            Fainelli"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (55 commits)
        net: bcmgenet: correctly resume adapter from Wake-on-LAN
        net: bcmgenet: update UMAC_CMD only when link is detected
        net: bcmgenet: correctly suspend and resume PHY device
        net: bcmgenet: request and enable main clock earlier
        net: ethernet: myricom: myri10ge: myri10ge.c: Cleaning up missing null-terminate after strncpy call
        xen-netfront: Fix handling packets on compound pages with skb_linearize
        net: fec: Support phys probed from devicetree and fixed-link
        smsc: replace WARN_ON() with WARN_ON_SMP()
        xen-netback: Don't deschedule NAPI when carrier off
        net: ethernet: qlogic: qlcnic: Remove duplicate object file from Makefile
        wan: wanxl: Remove typedefs from struct names
        m68k/atari: EtherNEC - ethernet support (ne)
        net: ethernet: ti: cpmac.c: Cleaning up missing null-terminate after strncpy call
        hdlc: Remove typedefs from struct names
        airo_cs: Remove typedef local_info_t
        atmel: Remove typedef atmel_priv_ioctl
        com20020_cs: Remove typedef com20020_dev_t
        ethernet: amd: Remove typedef local_info_t
        net: Always untag vlan-tagged traffic on input.
        drivers: net: Add APM X-Gene SoC ethernet driver support.
        ...
      f0094b28
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 13b102bf
      Linus Torvalds authored
      Pull Sparc fixes from David Miller:
       "Sparc bug fixes, one of which was preventing successful SMP boots with
        mainline"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Fix pcr_ops initialization and usage bugs.
        sparc64: Do not disable interrupts in nmi_cpu_busy()
        sparc: Hook up seccomp and getrandom system calls.
        sparc: fix decimal printf format specifiers prefixed with 0x
      13b102bf
    • Linus Torvalds's avatar
      Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 81c02a21
      Linus Torvalds authored
      Pull x86/apic updates from Thomas Gleixner:
       "This is a major overhaul to the x86 apic subsystem consisting of the
        following parts:
      
         - Remove obsolete APIC driver abstractions (David Rientjes)
      
         - Use the irqdomain facilities to dynamically allocate IRQs for
           IOAPICs.  This is a prerequisite to enable IOAPIC hotplug support,
           and it also frees up wasted vectors (Jiang Liu)
      
         - Misc fixlets.
      
        Despite the hiccup in Ingo's previous pull request - caused by the
        missing fixup for the suspend/resume issue reported by Borislav - I
        strongly recommend that this update finds its way into 3.17.  Some
        history for you:
      
        This is preparatory work for physical IOAPIC hotplug.  The first
        attempt to support this was done by Yinghai and I shot it down because
        it just added another layer of obscurity and complexity to the already
        existing mess without tackling the underlying shortcomings of the
        current implementation.
      
        After quite some on- and offlist discussions, I requested that the
        design of this functionality must use generic infrastructure, i.e.
        irq domains, which provide all the mechanisms to dynamically map linux
        interrupt numbers to physical interrupts.
      
        Jiang picked up the idea and did a great job of consolidating the
        existing interfaces to manage the x86 (IOAPIC) interrupt system by
        utilizing irq domains.
      
        The testing in tip, Linux-next and inside of Intel on various machines
        did not unearth any oddities until Borislav exposed it to one of his
        oddball machines.  The issue was resolved quickly, but unfortunately
        the fix fell through the cracks and did not hit the tip tree before
        Ingo sent the pull request.  Not entirely Ingo's fault, I also assumed
        that the fix was already merged when Ingo asked me whether he could
        send it.
      
        Nevertheless this work has a proper design, has undergone several
        rounds of review and the final fallout after applying it to tip and
        integrating it into Linux-next has been more than moderate.  It's the
        ground work not only for IOAPIC hotplug, it will also allow us to move
        the lowlevel vector allocation into the irqdomain hierarchy, which
        will benefit other architectures as well.  Patches are posted already,
        but they are on hold for two weeks, see below.
      
        I really appreciate the competence and responsiveness Jiang has shown
        in the course of this endeavour.  So I'm sure that any fallout of this will
        be addressed in a timely manner.
      
        FYI, I'm vanishing for 2 weeks into my annual kids summer camp kitchen
        duty^Wvacation, while you folks are drooling at KS/LinuxCon :) But HPA
        will have a look at the hopefully zero fallout until I'm back"
      
      * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
        x86, irq, PCI: Keep IRQ assignment for PCI devices during suspend/hibernation
        x86/apic/vsmp: Make is_vsmp_box() static
        x86, apic: Remove enable_apic_mode callback
        x86, apic: Remove setup_portio_remap callback
        x86, apic: Remove multi_timer_check callback
        x86, apic: Replace noop_check_apicid_used
        x86, apic: Remove check_apicid_present callback
        x86, apic: Remove mps_oem_check callback
        x86, apic: Remove smp_callin_clear_local_apic callback
        x86, apic: Replace trampoline physical addresses with defaults
        x86, apic: Remove x86_32_numa_cpu_node callback
        x86: intel-mid: Use the new io_apic interfaces
        x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
        x86, irq: Clean up irqdomain transition code
        x86, irq, devicetree: Release IOAPIC pin when PCI device is disabled
        x86, irq, SFI: Release IOAPIC pin when PCI device is disabled
        x86, irq, mpparse: Release IOAPIC pin when PCI device is disabled
        x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled
        x86, irq: Introduce helper functions to release IOAPIC pin
        x86, irq: Simplify the way to handle ISA IRQ
        ...
      81c02a21
    • Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d27c0d90
      Linus Torvalds authored
      Pull x86/efi fixes from Peter Anvin:
       "Two EFI-related Kconfig changes, which happen to touch immediately
        adjacent lines in Kconfig and thus collapse to a single patch"
      
      * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/efi: Enforce CONFIG_RELOCATABLE for EFI boot stub
        x86/efi: Fix 3DNow optimization build failure in EFI stub
      d27c0d90
    • Merge branch 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7453f33b
      Linus Torvalds authored
      Pull x86/xsave changes from Peter Anvin:
       "This is a patchset to support the XSAVES instruction required to
        support context switch of supervisor-only features in upcoming
        silicon.
      
        This patchset missed the 3.16 merge window, which is why it is based
        on 3.15-rc7"
      
      * 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, xsave: Add forgotten inline annotation
        x86/xsaves: Clean up code in xstate offsets computation in xsave area
        x86/xsave: Make it clear that the XSAVE macros use (%edi)/(%rdi)
        Define kernel API to get address of each state in xsave area
        x86/xsaves: Enable xsaves/xrstors
        x86/xsaves: Call booting time xsaves and xrstors in setup_init_fpu_buf
        x86/xsaves: Save xstate to task's xsave area in __save_fpu during booting time
        x86/xsaves: Add xsaves and xrstors support for booting time
        x86/xsaves: Clear reserved bits in xsave header
        x86/xsaves: Use xsave/xrstor for saving and restoring user space context
        x86/xsaves: Use xsaves/xrstors for context switch
        x86/xsaves: Use xsaves/xrstors to save and restore xsave area
        x86/xsaves: Define a macro for handling xsave/xrstor instruction fault
        x86/xsaves: Define macros for xsave instructions
        x86/xsaves: Change compacted format xsave area header
        x86/alternative: Add alternative_input_2 to support alternative with two features and input
        x86/xsaves: Add a kernel parameter noxsaves to disable xsaves/xrstors
      7453f33b
    • Merge tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag · fd1cf905
      Linus Torvalds authored
      Pull metag architecture updates from James Hogan:
       "Just a couple of minor static analysis fixes, removal of a NULL check
        that should never happen, and fix an error check where an unsigned
        value was being checked to see if it was negative"
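
The unsigned-versus-negative pitfall mentioned above can be sketched in plain C. This is only an illustration of the general bug class, not the metag code itself; get_partition_size() and both checker functions are hypothetical names invented for the example.

```c
#include <errno.h>

/* Hypothetical stand-in for a call that returns a size or a negative errno. */
static int get_partition_size(int simulate_failure)
{
    return simulate_failure ? -EINVAL : 4096;
}

/* Buggy pattern: the result is stored in an unsigned variable first. */
static int size_ok_buggy(int simulate_failure)
{
    unsigned int size = get_partition_size(simulate_failure);

    /*
     * A "size < 0" test on this variable could never fire: the negative
     * error code has already been converted to a very large positive value.
     */
    return size > 0;    /* reports success even when the call failed */
}

/* Fixed pattern: keep the signed return value until it has been checked. */
static int size_ok_fixed(int simulate_failure)
{
    int ret = get_partition_size(simulate_failure);

    if (ret < 0)
        return 0;       /* the failure is detected */
    return ret > 0;
}
```

With a simulated failure, the buggy variant still reports success while the fixed variant reports the error.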
      
      * tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
        metag: cachepart: Fix failure check
        metag: hugetlbpage: Remove null pointer checks that could never happen
      fd1cf905
    • Merge tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 06b8ab55
      Linus Torvalds authored
      Pull NFS client updates from Trond Myklebust:
       "Highlights include:
      
         - stable fix for a bug in nfs3_list_one_acl()
         - speed up NFS path walks by supporting LOOKUP_RCU
         - more read/write code cleanups
         - pNFS fixes for layout return on close
         - fixes for the RCU handling in the rpcsec_gss code
         - more NFS/RDMA fixes"
      
      * tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
        nfs: reject changes to resvport and sharecache during remount
        NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error
        SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred
        NFS: fix two problems in lookup_revalidate in RCU-walk
        NFS: allow lockless access to access_cache
        NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU
        NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU
        NFS: support RCU_WALK in nfs_permission()
        sunrpc/auth: allow lockless (rcu) lookup of credential cache.
        NFS: prepare for RCU-walk support but pushing tests later in code.
        NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used.
        NFS: add checks for returned value of try_module_get()
        nfs: clear_request_commit while holding i_lock
        pnfs: add pnfs_put_lseg_async
        pnfs: find swapped pages on pnfs commit lists too
        nfs: fix comment and add warn_on for PG_INODE_REF
        nfs: check wait_on_bit_lock err in page_group_lock
        sunrpc: remove "ec" argument from encrypt_v2 operation
        sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c
        sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c
        ...
      06b8ab55
  2. 13 Aug, 2014 13 commits
    • Merge tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs · dc1cc851
      Linus Torvalds authored
      Pull xfs update from Dave Chinner:
       "This update contains:
         - conversion of the XFS core to pass negative error numbers
         - restructuring of core XFS code that is shared with userspace to
           fs/xfs/libxfs
         - introduction of sysfs interface for XFS
         - bulkstat refactoring
         - demand driven speculative preallocation removal
         - XFS now always requires 64 bit sectors to be configured
         - metadata verifier changes to ensure CRCs are calculated during log
           recovery
         - various minor code cleanups
         - miscellaneous bug fixes
      
        The diffstat is kind of noisy because of the restructuring of the code
        to make kernel/userspace code sharing simpler, along with the XFS wide
        change to use the standard negative error return convention (at last!)"
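
As a rough illustration of what a switch to the negative error return convention means, the sketch below contrasts a core routine that returns positive errno values (negated at the boundary) with one that returns negative values directly. All function names are hypothetical and none of this is taken from the XFS sources.

```c
#include <errno.h>
#include <stddef.h>

/* Old style (assumed for illustration): core code returns a positive errno. */
static int core_read_positive(const void *buf)
{
    return buf ? 0 : EIO;
}

/* Old boundary wrapper: every caller has to remember to negate. */
static int boundary_read_old(const void *buf)
{
    return -core_read_positive(buf);
}

/* New style: core code returns the negative errno directly. */
static int core_read_negative(const void *buf)
{
    return buf ? 0 : -EIO;
}

/* New boundary wrapper: the value passes through unchanged. */
static int boundary_read_new(const void *buf)
{
    return core_read_negative(buf);
}
```

Both wrappers hand the same value to their callers; the difference is that the second needs no sign flip, which removes one class of easy-to-miss mistakes.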
      
      * tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs: (45 commits)
        xfs: fix coccinelle warnings
        xfs: flush both inodes in xfs_swap_extents
        xfs: fix swapext ilock deadlock
        xfs: kill xfs_vnode.h
        xfs: kill VN_MAPPED
        xfs: kill VN_CACHED
        xfs: kill VN_DIRTY()
        xfs: dquot recovery needs verifiers
        xfs: quotacheck leaves dquot buffers without verifiers
        xfs: ensure verifiers are attached to recovered buffers
        xfs: catch buffers written without verifiers attached
        xfs: avoid false quotacheck after unclean shutdown
        xfs: fix rounding error of fiemap length parameter
        xfs: introduce xfs_bulkstat_ag_ichunk
        xfs: require 64-bit sector_t
        xfs: fix uflags detection at xfs_fs_rm_xquota
        xfs: remove XFS_IS_OQUOTA_ON macros
        xfs: tidy up xfs_set_inode32
        xfs: allow inode allocations in post-growfs disk space
        xfs: mark xfs_qm_quotacheck as static
        ...
      dc1cc851
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · cec99709
      Linus Torvalds authored
      Pull quota, reiserfs, UDF updates from Jan Kara:
       "Scalability improvements for quota, a few reiserfs fixes, and a couple
        of misc cleanups (udf, ext2)"
      
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        reiserfs: Fix use after free in journal teardown
        reiserfs: fix corruption introduced by balance_leaf refactor
        udf: avoid redundant memcpy when writing data in ICB
        fs/udf: re-use hex_asc_upper_{hi,lo} macros
        fs/quota: kernel-doc warning fixes
        udf: use linux/uaccess.h
        fs/ext2/super.c: Drop memory allocation cast
        quota: remove dqptr_sem
        quota: simplify remove_inode_dquot_ref()
        quota: avoid unnecessary dqget()/dqput() calls
        quota: protect Q_GETFMT by dqonoff_mutex
      cec99709
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 8d2d441a
      Linus Torvalds authored
      Pull Ceph updates from Sage Weil:
       "There is a lot of refactoring and hardening of the libceph and rbd
        code here from Ilya that fix various smaller bugs, and a few more
        important fixes with clone overlap.  The main fix is a critical change
        to the request_fn handling to not sleep that was exposed by the recent
        mutex changes (which will also go to the 3.16 stable series).
      
        Yan Zheng has several fixes in here for CephFS fixing ACL handling,
        time stamps, and request resends when the MDS restarts.
      
        Finally, there are a few cleanups from Himangi Saraogi based on
        Coccinelle"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (39 commits)
        libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly
        rbd: remove extra newlines from rbd_warn() messages
        rbd: allocate img_request with GFP_NOIO instead GFP_ATOMIC
        rbd: rework rbd_request_fn()
        ceph: fix kick_requests()
        ceph: fix append mode write
        ceph: fix sizeof(struct tYpO *) typo
        ceph: remove redundant memset(0)
        rbd: take snap_id into account when reading in parent info
        rbd: do not read in parent info before snap context
        rbd: update mapping size only on refresh
        rbd: harden rbd_dev_refresh() and callers a bit
        rbd: split rbd_dev_spec_update() into two functions
        rbd: remove unnecessary asserts in rbd_dev_image_probe()
        rbd: introduce rbd_dev_header_info()
        rbd: show the entire chain of parent images
        ceph: replace comma with a semicolon
        rbd: use rbd_segment_name_free() instead of kfree()
        ceph: check zero length in ceph_sync_read()
        ceph: reset r_resend_mds after receiving -ESTALE
        ...
      8d2d441a
    • Merge tag 'upstream-3.17-rc1' of git://git.infradead.org/linux-ubifs · 89838b80
      Linus Torvalds authored
      Pull UBI/UBIFS changes from Artem Bityutskiy:
       "No significant changes, mostly small fixes here and there.  The more
        important fixes are:
      
         - UBI deleted list items while iterating the list with
           'list_for_each_entry'
         - The UBI block driver did not work properly with very large UBI
           volumes"
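
Deleting entries while walking a list is unsafe when the iterator reads the successor from the node that was just freed. The userspace sketch below shows one safe pattern on a simple singly linked list: the node is unlinked through a pointer to the previous link before it is freed. The kernel's list_for_each_entry_safe() helper addresses the same problem by caching the next entry; whether the UBI fix used that helper or restructured the loop is not stated here, and every name in the sketch is made up for illustration.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepend a value; on allocation failure the list is returned unchanged. */
static struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));

    if (!n)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Remove every odd-valued node without touching freed memory. */
static struct node *remove_odd(struct node *head)
{
    struct node **link = &head;

    while (*link) {
        struct node *cur = *link;

        if (cur->value % 2 != 0) {
            *link = cur->next;  /* unlink before freeing */
            free(cur);
        } else {
            link = &cur->next;  /* advance only past nodes we keep */
        }
    }
    return head;
}

/* Build 1..5, drop the odd values, count what is left, then clean up. */
static int demo_remaining(void)
{
    struct node *head = NULL;
    int n = 0;

    for (int i = 1; i <= 5; i++)
        head = push(head, i);
    head = remove_odd(head);
    for (struct node *p = head; p; p = p->next)
        n++;
    while (head) {
        struct node *next = head->next;

        free(head);
        head = next;
    }
    return n;
}
```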
      
      * tag 'upstream-3.17-rc1' of git://git.infradead.org/linux-ubifs: (21 commits)
        UBIFS: Add log overlap assertions
        Revert "UBIFS: add a log overlap assertion"
        UBI: bugfix in ubi_wl_flush()
        UBI: block: Avoid disk size integer overflow
        UBI: block: Set disk_capacity out of the mutex
        UBI: block: Make ubiblock_resize return something
        UBIFS: add a log overlap assertion
        UBIFS: remove unnecessary check
        UBIFS: remove mst_mutex
        UBIFS: kernel-doc warning fix
        UBI: init_volumes: Ignore volumes with no LEBs
        UBIFS: replace seq_printf by seq_puts
        UBIFS: replace count*size kzalloc by kcalloc
        UBIFS: kernel-doc warning fix
        UBIFS: fix error path in create_default_filesystem()
        UBIFS: fix spelling of "scanned"
        UBIFS: fix some comments
        UBIFS: remove useless @ecc in struct ubifs_scan_leb
        UBIFS: remove useless statements
        UBIFS: Add missing break statements in dbg_chk_pnode()
        ...
      89838b80
    • powerpc/thp: Add tracepoints to track hugepage invalidate · 9e813308
      Aneesh Kumar K.V authored
      Add a tracepoint to track hugepage invalidation. This helps us
      debug hard-to-track bugs.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      9e813308
    • powerpc/mm: Use read barrier when creating real_pte · 85c1fafd
      Aneesh Kumar K.V authored
      On ppc64 we support 4K hash ptes with a 64K page size. That requires
      us to track the hash pte slot information on a per-4K basis. We do that
      by storing the slot details in the second half of the pte page. The pte bit
      _PAGE_COMBO indicates whether the second half needs to be consulted
      when building real_pte. We need a read memory barrier there so that
      the load of hidx is not reordered w.r.t. the _PAGE_COMBO check. On the
      store side we already do a lwsync in __hash_page_4K.
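
The ordering requirement described above can be sketched with C11 atomics in userspace. This is an analogy only: the kernel code uses its own barrier primitives, and struct demo_pte, DEMO_PAGE_COMBO and the demo_* functions are hypothetical names that do not match the real pte layout or bit values.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_PAGE_COMBO 0x1u    /* hypothetical flag bit, not the kernel value */

struct demo_pte {
    _Atomic uint32_t flags;     /* stands in for the pte bits */
    _Atomic uint32_t hidx;      /* stands in for the "second half" slot details */
};

/* Writer: store the slot details first, then set the flag with release
 * ordering (the role lwsync plays on the store side). */
static void demo_publish(struct demo_pte *p, uint32_t hidx)
{
    atomic_store_explicit(&p->hidx, hidx, memory_order_relaxed);
    atomic_fetch_or_explicit(&p->flags, DEMO_PAGE_COMBO, memory_order_release);
}

/* Reader: check the flag, then fence so the hidx load cannot be
 * reordered before the flag check. Returns 0 if the flag is not set. */
static uint32_t demo_read_hidx(struct demo_pte *p)
{
    uint32_t flags = atomic_load_explicit(&p->flags, memory_order_relaxed);

    if (!(flags & DEMO_PAGE_COMBO))
        return 0;
    atomic_thread_fence(memory_order_acquire);  /* the read barrier */
    return atomic_load_explicit(&p->hidx, memory_order_relaxed);
}

/* Single-threaded round trip, just to exercise both sides. */
static uint32_t demo_roundtrip(uint32_t hidx)
{
    struct demo_pte p;

    atomic_init(&p.flags, 0);
    atomic_init(&p.hidx, 0);
    if (demo_read_hidx(&p) != 0)
        return UINT32_MAX;      /* flag not yet set, so nothing to read */
    demo_publish(&p, hidx);
    return demo_read_hidx(&p);
}
```

A single-threaded run cannot demonstrate the reordering itself; the sketch only shows where the barrier sits relative to the flag check and the dependent load.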
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      85c1fafd
    • powerpc/thp: Use ACCESS_ONCE when loading pmdp · 7e467245
      Aneesh Kumar K.V authored
      We would get wrong results if the compiler recomputed old_pmd. Avoid
      that by using ACCESS_ONCE.
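
A minimal userspace sketch of the idea follows. The macro has the same shape as the kernel's ACCESS_ONCE() of that era (a volatile cast, relying on the GNU typeof extension); demo_pmd and demo_set_bits() are hypothetical and are not the powerpc code being patched.

```c
/* Force exactly one access to x through a volatile-qualified lvalue. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static unsigned long demo_pmd = 0x10;   /* hypothetical shared entry */

/*
 * Take a single snapshot of *pmdp and derive everything from it.
 * Without the volatile access the compiler would be free to reload
 * *pmdp when computing the new value, so "old" and the value actually
 * used could differ if another CPU changed the entry in between.
 */
static unsigned long demo_set_bits(unsigned long *pmdp, unsigned long bits)
{
    unsigned long old_pmd = ACCESS_ONCE(*pmdp);

    ACCESS_ONCE(*pmdp) = old_pmd | bits;
    return old_pmd;
}
```

Note that this only pins down the number of loads; it does not make the read-modify-write atomic.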
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      7e467245
    • powerpc/thp: Invalidate with vpn in loop · 969b7b20
      Aneesh Kumar K.V authored
      As per ISA, for 4k base page size we compare 14..65 bits of VA specified
      with the entry_VA in tlb. That implies we need to make sure we do a
      tlbie with all the possible 4k va we used to access the 16MB hugepage.
      With 64k base page size we compare 14..57 bits of VA. Hence we cannot
      ignore the lower 24 bits of va while doing tlbie. We also cannot tlb
      invalidate a 16MB entry with just one tlbie instruction because
      we don't track which va was used to instantiate the tlb entry.
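
The consequence is a loop over every base-page-sized address inside the 16MB hugepage. The sketch below only models the arithmetic of that loop, with a callback standing in for the per-address invalidate; the demo_* names are hypothetical and the real code issues tlbie through the hash MMU helpers.

```c
#include <stdint.h>

#define DEMO_HUGEPAGE_SIZE (UINT64_C(16) << 20)    /* 16MB hugepage */

/* Hypothetical per-address invalidate hook. */
typedef void (*demo_flush_fn)(uint64_t va, void *ctx);

/* Call flush() once for every base page inside the hugepage at start. */
static void demo_invalidate_hugepage(uint64_t start, unsigned int base_shift,
                                     demo_flush_fn flush, void *ctx)
{
    uint64_t step = UINT64_C(1) << base_shift;

    for (uint64_t off = 0; off < DEMO_HUGEPAGE_SIZE; off += step)
        flush(start + off, ctx);
}

/* Counting hook used to show how many invalidates one hugepage needs. */
static void demo_count_flush(uint64_t va, void *ctx)
{
    (void)va;
    ++*(unsigned long *)ctx;
}

static unsigned long demo_flush_count(unsigned int base_shift)
{
    unsigned long count = 0;

    demo_invalidate_hugepage(0, base_shift, demo_count_flush, &count);
    return count;
}
```

With a 4K base page (shift 12) this comes to 16MB / 4K = 4096 invalidates per hugepage; with a 64K base page (shift 16) it is 256.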
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      969b7b20
    • powerpc/thp: Handle combo pages in invalidate · fc047955
      Aneesh Kumar K.V authored
      If we changed base page size of the segment, either via sub_page_protect
      or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
      table entries. We do a lazy hash page table flush for all mapped pages
      in the demoted segment. This happens when we handle hash page fault for
      these pages.
      
      We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
      pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
      that implies that we could possibly have older 64K hash pte entries in
      the hash page table and we need to invalidate those entries.
      
      Use _PAGE_COMBO to determine the page size with which we should
      invalidate the hash table entries on unmap.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      fc047955
    • powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte · 629149fa
      Aneesh Kumar K.V authored
      If we changed base page size of the segment, either via sub_page_protect
      or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
      table entries. We do a lazy hash page table flush for all mapped pages
      in the demoted segment. This happens when we handle hash page fault
      for these pages.
      
      We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
      pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
      that implies that we could possibly have older 64K hash pte entries in
      the hash page table and we need to invalidate those entries.
      
      Handle this correctly for 16M pages
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      629149fa
    • powerpc/thp: Don't recompute vsid and ssize in loop on invalidate · fa1f8ae8
      Aneesh Kumar K.V authored
      The segment identifier and segment size will remain the same in
      the loop, so we can compute them outside it. We also change the
      hugepage_invalidate interface so that we can use it in a later patch.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      fa1f8ae8
    • powerpc/thp: Add write barrier after updating the valid bit · b0aa44a3
      Aneesh Kumar K.V authored
      With hugepages, we store the hpte valid information in the pte page
      whose address is stored in the second half of the PMD. Use a
      write barrier to make sure clearing pmd busy bit and updating
      hpte valid info are ordered properly.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      b0aa44a3
    • powerpc: reorder per-cpu NUMA information's initialization · 2fabf084
      Nishanth Aravamudan authored
      There is an issue currently where NUMA information is used on powerpc
      (and possibly ia64) before it has been read from the device-tree, which
      leads to large slab consumption with CONFIG_SLUB and memoryless nodes.
      
      NUMA powerpc non-boot CPU's cpu_to_node/cpu_to_mem is only accurate
      after start_secondary(), similar to ia64, which is invoked via
      smp_init().
      
      Commit 6ee0578b ("workqueue: mark init_workqueues() as
      early_initcall()") made init_workqueues() be invoked via
      do_pre_smp_initcalls(), which is obviously before the secondary
      processors are online.
      
      Additionally, the following commits changed init_workqueues() to use
      cpu_to_node to determine the node to use for kthread_create_on_node:
      
      bce90380 ("workqueue: add wq_numa_tbl_len and
      wq_numa_possible_cpumask[]")
      f3f90ad4 ("workqueue: determine NUMA node of workers accourding to
      the allowed cpumask")
      
      Therefore, when init_workqueues() runs, it sees all CPUs as being on
      Node 0. On LPARs or KVM guests where Node 0 is memoryless, this leads to
      a high number of slab deactivations
      (http://www.spinics.net/lists/linux-mm/msg67489.html).
      
      Fix this by initializing the powerpc-specific CPU<->node/local memory
      node mapping as early as possible, which on powerpc is
      do_init_bootmem(). Currently that function initializes the mapping for
      the boot CPU, but we extend it to set up the mapping for all possible
      CPUs. Then, in smp_prepare_cpus(), we can correspondingly set the
      per-cpu values for all possible CPUs. That ensures that before the
      early_initcalls run (and really as early as possible), the per-cpu NUMA
      mapping is accurate.
      
      While testing memoryless nodes on PowerKVM guests with a fix to the
      workqueue logic to use cpu_to_mem() instead of cpu_to_node(), with a
      guest topology of:
      
      available: 2 nodes (0-1)
      node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
      node 0 size: 0 MB
      node 0 free: 0 MB
      node 1 cpus: 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
      node 1 size: 16336 MB
      node 1 free: 15329 MB
      node distances:
      node   0   1
        0:  10  40
        1:  40  10
      
      the slab consumption decreases from
      
      Slab:             932416 kB
      SUnreclaim:       902336 kB
      
      to
      
      Slab:             395264 kB
      SUnreclaim:       359424 kB
      
      And we see a corresponding increase in the slab efficiency from
      
      slab                                   mem     objs    slabs
                                            used   active   active
      ------------------------------------------------------------
      kmalloc-16384                       337 MB   11.28%  100.00%
      task_struct                         288 MB    9.93%  100.00%
      
      to
      
      slab                                   mem     objs    slabs
                                            used   active   active
      ------------------------------------------------------------
      kmalloc-16384                        37 MB  100.00%  100.00%
      task_struct                          31 MB  100.00%  100.00%
      
      Powerpc didn't support memoryless nodes until recently (64bb80d8
      "powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES" and 8c272261
      "powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"). Those commits also
      helped improve memory consumption in these kinds of environments.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2fabf084