1. 08 Feb, 2014 6 commits
    • slab: Make allocations with GFP_ZERO slightly more efficient · 5087c822
      Joe Perches authored
      Use the likely() mechanism already around the valid pointer
      tests to better choose when to memset allocations made with
      __GFP_ZERO to 0.
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
    • slab: make more slab management structure off the slab · 8fc9cf42
      Joonsoo Kim authored
      Now that the freelist used for slab management has shrunk, an
      on-slab management structure can waste a lot of space when the
      objects of the slab are large.
      
      Consider a slab of 128-byte objects. If on-slab is used, 31 objects
      fit in the slab. The freelist for this case is 31 bytes, so 97 bytes,
      that is, more than 75% of the object size, are wasted.
      
      With 64-byte objects, no space is wasted if we use on-slab.
      So set the constraint for choosing off-slab to 128 bytes.
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
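The on-slab arithmetic in the commit message above can be reproduced with a small sketch. It assumes a 4096-byte slab, a one-byte freelist index per object, and no alignment padding or slab colouring; the function name `on_slab_layout` is hypothetical and is not taken from the kernel source.

```python
def on_slab_layout(obj_size, slab_size=4096, idx_size=1):
    """Return (objects, freelist_bytes, leftover_bytes) for an on-slab freelist.

    Simplified model (an assumption): every object costs obj_size bytes plus
    one freelist index of idx_size bytes, with no alignment or colouring.
    """
    nr_objs = slab_size // (obj_size + idx_size)
    freelist_bytes = nr_objs * idx_size
    leftover = slab_size - nr_objs * obj_size - freelist_bytes
    return nr_objs, freelist_bytes, leftover


for size in (64, 128):
    nr, freelist, leftover = on_slab_layout(size)
    print(f"{size}-byte objects: {nr} per slab, "
          f"freelist {freelist} B, leftover {leftover} B")
```

Under this model the 128-byte case leaves 97 bytes unused, while the 64-byte case fills the page almost exactly, which is consistent with the 128-byte threshold chosen in the message.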
    • slab: introduce byte sized index for the freelist of a slab · a41adfaa
      Joonsoo Kim authored
      Currently, the freelist of a slab consists of unsigned int sized indexes.
      Since most slabs hold fewer than 256 objects, such large indexes are
      unnecessary. For example, consider the minimum kmalloc slab. Its
      object size is 32 bytes and it consists of one page, so the 256 values
      of a byte sized index are enough to cover all possible indexes.
      
      Some slabs have an object size of 8 bytes. We cannot handle
      this case with a byte sized index, so we need to restrict the minimum
      object size. Since these slabs are not major, the memory wasted by
      them should be negligible.
      
      On some architectures the page size is larger than 4096 bytes
      (one example is the 64KB page size on PPC or IA64), so a
      byte sized index doesn't fit them. In this case, we will use
      a two byte sized index.
      
      Below are some numbers for this patch.
      
      * Before *
      kmalloc-512          525    640    512    8    1 : tunables   54   27    0 : slabdata     80     80      0
      kmalloc-256          210    210    256   15    1 : tunables  120   60    0 : slabdata     14     14      0
      kmalloc-192         1016   1040    192   20    1 : tunables  120   60    0 : slabdata     52     52      0
      kmalloc-96           560    620    128   31    1 : tunables  120   60    0 : slabdata     20     20      0
      kmalloc-64          2148   2280     64   60    1 : tunables  120   60    0 : slabdata     38     38      0
      kmalloc-128          647    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
      kmalloc-32         11360  11413     32  113    1 : tunables  120   60    0 : slabdata    101    101      0
      kmem_cache           197    200    192   20    1 : tunables  120   60    0 : slabdata     10     10      0
      
      * After *
      kmalloc-512          521    648    512    8    1 : tunables   54   27    0 : slabdata     81     81      0
      kmalloc-256          208    208    256   16    1 : tunables  120   60    0 : slabdata     13     13      0
      kmalloc-192         1029   1029    192   21    1 : tunables  120   60    0 : slabdata     49     49      0
      kmalloc-96           529    589    128   31    1 : tunables  120   60    0 : slabdata     19     19      0
      kmalloc-64          2142   2142     64   63    1 : tunables  120   60    0 : slabdata     34     34      0
      kmalloc-128          660    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
      kmalloc-32         11716  11780     32  124    1 : tunables  120   60    0 : slabdata     95     95      0
      kmem_cache           197    210    192   21    1 : tunables  120   60    0 : slabdata     10     10      0
      
      kmem_caches whose objects are 256 bytes or smaller hold one or more
      objects more than before. In the case of kmalloc-32, we have 11 more
      objects per slab, so 352 bytes (11 * 32) are saved, which is roughly a
      9% saving of memory. Of course, this percentage decreases as the number
      of objects in a slab decreases.
      
      Here are the performance results on my 4-CPU machine.
      
      * Before *
      
       Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
      
             229,945,138 cache-misses                                                  ( +-  0.23% )
      
            11.627897174 seconds time elapsed                                          ( +-  0.14% )
      
      * After *
      
       Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
      
             218,640,472 cache-misses                                                  ( +-  0.42% )
      
            11.504999837 seconds time elapsed                                          ( +-  0.21% )
      
      Cache misses are reduced by roughly 5% with this patchset,
      and elapsed time improves by about 1%.
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
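The objects-per-slab gains quoted in the message above follow from the index width. A minimal sketch, assuming a one-page (4096-byte) slab with an on-slab freelist and no alignment padding, compares a 4-byte (unsigned int) index with a 1-byte index. The helper names are hypothetical, and the real allocator applies further rules (alignment, off-slab management), so not every row of the slabinfo output is expected to match.

```python
PAGE_SIZE = 4096  # assumed one-page slab


def objs_per_slab(obj_size, idx_size, slab_size=PAGE_SIZE):
    """Objects that fit when each object also needs one freelist index."""
    return slab_size // (obj_size + idx_size)


def compare(obj_size):
    """Return (objects with 4-byte index, objects with 1-byte index, bytes gained)."""
    before = objs_per_slab(obj_size, 4)
    after = objs_per_slab(obj_size, 1)
    return before, after, (after - before) * obj_size


for size in (32, 64, 192):
    before, after, gained = compare(size)
    print(f"kmalloc-{size}: {before} -> {after} objects, {gained} bytes gained per slab")
```

For kmalloc-32 this gives 11 additional objects, or 352 bytes per 4096-byte slab, in line with the roughly 9% figure in the message.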
    • slab: restrict the number of objects in a slab · f315e3fa
      Joonsoo Kim authored
      To prepare for implementing a byte sized index for managing the freelist
      of a slab, we should restrict the number of objects in a slab to 256 or
      fewer, since a byte can only represent 256 different values.
      Setting the object size to a value equal to or greater than the newly
      introduced SLAB_OBJ_MIN_SIZE ensures that the number of objects in a
      slab is at most 256 for a slab of 1 page.
      
      If the page size is larger than 4096, the above assumption would be wrong.
      In this case, we fall back on a 2 byte sized index.
      
      If the minimum size of kmalloc is less than 16, we use it as the minimum
      object size and give up this optimization.
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
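The rule described in the message above can be sketched as a simple check: a one-byte index suffices only when a one-page slab can never hold more than 256 objects. The function name is hypothetical, and the 16-byte minimum object size is inferred from the message (4096 / 16 = 256) rather than read from the kernel source.

```python
def freelist_index_bytes(page_size, min_obj_size):
    """Pick the freelist index width for a one-page slab.

    Assumption: the worst case is a page filled with objects of the
    minimum size, and a byte can number at most 256 of them.
    """
    max_objs = page_size // min_obj_size
    return 1 if max_objs <= 256 else 2


print(freelist_index_bytes(4096, 16))       # 4 KB pages, 16-byte minimum
print(freelist_index_bytes(64 * 1024, 16))  # 64 KB pages
print(freelist_index_bytes(4096, 8))        # kmalloc minimum below 16
```

The last two calls correspond to the two fallback cases the message names: larger pages, and a kmalloc minimum below 16 bytes.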
    • slab: introduce helper functions to get/set free object · e5c58dfd
      Joonsoo Kim authored
      In the following patches, the way free objects are read from and
      written to the freelist changes, so simple casting no longer works.
      Therefore, introduce helper functions.
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
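Why accessor helpers help can be illustrated with a sketch in which callers never touch the freelist storage directly, so the index width can change without touching them. The class and method names below are hypothetical and only mirror the idea of get/set helpers; they are not the kernel's implementation.

```python
class Freelist:
    """Hypothetical freelist whose entries are idx_bytes wide."""

    def __init__(self, nr_objs, idx_bytes=1):
        self.idx_bytes = idx_bytes
        self.buf = bytearray(nr_objs * idx_bytes)

    def get_free_obj(self, pos):
        """Read the object index stored at freelist position pos."""
        start = pos * self.idx_bytes
        return int.from_bytes(self.buf[start:start + self.idx_bytes], "little")

    def set_free_obj(self, pos, obj_index):
        """Store obj_index at freelist position pos (raises if it does not fit)."""
        start = pos * self.idx_bytes
        self.buf[start:start + self.idx_bytes] = obj_index.to_bytes(
            self.idx_bytes, "little")


narrow = Freelist(4, idx_bytes=1)
narrow.set_free_obj(2, 200)
wide = Freelist(4, idx_bytes=2)
wide.set_free_obj(3, 4000)
print(narrow.get_free_obj(2), wide.get_free_obj(3))
```

Callers use the same two methods in both cases; only the constructor argument decides how many bytes each entry occupies.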
    • slab: factor out calculate nr objects in cache_estimate · 9cef2e2b
      Joonsoo Kim authored
      This logic is not simple to understand, so move it into a separate
      function to help readability. Additionally, the following patch, which
      lets the freelist use a differently sized index according to nr objects,
      can build on this change.
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
  2. 03 Feb, 2014 4 commits
    • Linus 3.14-rc1 · 38dbfb59
      Linus Torvalds authored
    • Merge branch 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 69048e01
      Linus Torvalds authored
      Pull parisc updates from Helge Deller:
       "The three major changes in this patchset are an implementation for
        flexible userspace memory maps, cache-flushing fixes (again), and a
        long-discussed ABI change to make EWOULDBLOCK the same value as
        EAGAIN.
      
        parisc has been the only platform where we had EWOULDBLOCK != EAGAIN
        to keep HP-UX compatibility.  Since we will probably never implement
        full HP-UX support, we prefer to drop this compatibility to make it
        easier for us with Linux userspace programs which mostly never checked
        for both values.  We don't expect major fall-outs because of this
        change, and if we face some, we will simply rebuild the necessary
        applications in the debian archives"
      
      * 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: add flexible mmap memory layout support
        parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc
        parisc: convert uapi/asm/stat.h to use native types only
        parisc: wire up sched_setattr and sched_getattr
        parisc: fix cache-flushing
        parisc/sti_console: prefer Linux fonts over built-in ROM fonts
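The EWOULDBLOCK/EAGAIN point in the pull message above concerns userspace code that tests only one of the two values. A minimal sketch of the defensive pattern, assuming a Unix-like system; the function name `read_nonblocking` is hypothetical:

```python
import errno
import os


def read_nonblocking(fd, nbytes):
    """Read from a non-blocking fd; return None if no data is available yet."""
    try:
        return os.read(fd, nbytes)
    except OSError as exc:
        # Test both constants: on a platform where they differ, checking
        # only one of them would miss the other.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise


r, w = os.pipe()
os.set_blocking(r, False)
print(read_nonblocking(r, 16))   # nothing written yet
os.write(w, b"data")
print(read_nonblocking(r, 16))
os.close(r)
os.close(w)
```

Where the two constants share one value, the membership test collapses to a single comparison, so the pattern costs nothing on other platforms.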
    • hpfs: optimize quad buffer loading · 1c0b8a7a
      Mikulas Patocka authored
      HPFS needs to load 4 consecutive 512-byte sectors when accessing the
      directory nodes or bitmaps.  We can't switch to 2048-byte block size
      because files are allocated in the units of 512-byte sectors.
      
      Previously, the driver would allocate a 2048-byte area using kmalloc,
      copy the data from four buffers to this area and eventually copy them
      back if they were modified.
      
      In the current implementation of the buffer cache, buffers are allocated
      in the pagecache.  That means that 4 consecutive 512-byte buffers are
      stored in consecutive areas in the kernel address space.  So, we don't
      need to allocate extra memory and copy the content of the buffers there.
      
      This patch optimizes the code to avoid copying the buffers.  It checks
      if the four buffers are stored in contiguous memory - if they are not,
      it falls back to allocating a 2048-byte area and copying data there.
      Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
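The contiguity check with a copying fallback described above can be modelled in a few lines. This is an illustrative sketch only: each buffer is represented as a hypothetical `(backing, offset)` pair over a `bytearray`, which stands in for buffer heads and kernel addresses, and `map_4sectors` is an invented name.

```python
SECTOR = 512


def map_4sectors(buffers):
    """Return (data, copied) for four 512-byte buffers.

    buffers is a list of four (backing, offset) pairs. If all four sit
    back to back in the same backing store, return a zero-copy view;
    otherwise fall back to copying them into a fresh 2048-byte area.
    """
    backing0, off0 = buffers[0]
    contiguous = all(
        backing is backing0 and off == off0 + i * SECTOR
        for i, (backing, off) in enumerate(buffers)
    )
    if contiguous:
        return memoryview(backing0)[off0:off0 + 4 * SECTOR], False
    area = bytearray(4 * SECTOR)
    for i, (backing, off) in enumerate(buffers):
        area[i * SECTOR:(i + 1) * SECTOR] = backing[off:off + SECTOR]
    return area, True


page = bytearray(range(256)) * 16          # one 4096-byte "page"
view, copied = map_4sectors([(page, i * SECTOR) for i in range(4)])
print(len(view), copied)
```

In the contiguous case a write through the returned view lands directly in the backing page, which is the copy the patch aims to avoid.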
    • hpfs: remember free space · 2cbe5c76
      Mikulas Patocka authored
      Previously, hpfs scanned all bitmaps each time the user asked for free
      space using statfs.  This patch changes it so that hpfs scans the
      bitmaps only once, remembers the free space and on the next invocation
      of statfs returns the value instantly.
      
      New versions of wine are hammering on the statfs syscall very heavily,
      making some games unplayable when they're stored on hpfs, with load
      times in minutes.
      
      This should be backported to the stable kernels because it fixes a
      user-visible problem (excessive level load times in wine).
      Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
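The scan-once behaviour described above amounts to caching the result of an expensive count. A minimal sketch, with the class name, the bitmap representation (integers whose set bits mark free sectors) and the scan counter all being assumptions made for illustration; how the cached value is kept accurate after allocations is outside this sketch.

```python
class FreeSpaceCache:
    """Hypothetical model: count free sectors once, then reuse the result."""

    def __init__(self, bitmaps):
        self.bitmaps = bitmaps   # assumed: set bit = free sector
        self.scans = 0           # how many full scans were performed
        self._free = None        # cached free-sector count

    def _scan(self):
        self.scans += 1
        return sum(bin(bitmap).count("1") for bitmap in self.bitmaps)

    def statfs_free(self):
        """Return the free-sector count, scanning only on the first call."""
        if self._free is None:
            self._free = self._scan()
        return self._free


fs = FreeSpaceCache([0b1011, 0b0001])
for _ in range(3):
    fs.statfs_free()
print(fs.statfs_free(), fs.scans)
```

Repeated calls return the remembered value without another scan, which is the behaviour a statfs-heavy workload benefits from.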
  3. 02 Feb, 2014 12 commits
  4. 01 Feb, 2014 12 commits
  5. 31 Jan, 2014 6 commits
    • Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 8a1f006a
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Highlights:
      
         - Fix several races in nfs_revalidate_mapping
         - NFSv4.1 slot leakage in the pNFS files driver
         - Stable fix for a slot leak in nfs40_sequence_done
         - Don't reject NFSv4 servers that support ACLs with only ALLOW aces"
      
      * tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        nfs: initialize the ACL support bits to zero.
        NFSv4.1: Cleanup
        NFSv4.1: Clean up nfs41_sequence_done
        NFSv4: Fix a slot leak in nfs40_sequence_done
        NFSv4.1 free slot before resending I/O to MDS
        nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING
        NFS: Fix races in nfs_revalidate_mapping
        sunrpc: turn warn_gssd() log message into a dprintk()
        NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping
        nfs: handle servers that support only ALLOW ACE type.
    • Merge tag 'sound-fix-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 14864a52
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "The big chunks here are the updates for oxygen driver for Xonar DG
        devices, which were slipped from the previous pull request.  They are
        device-specific and thus not too dangerous.
      
        Other than that, all patches are small bug fixes, mainly for Samsung
        build fixes, a few HD-audio enhancements, and other misc ASoC fixes.
        (And this time ASoC merge is less than Octopus, lucky seven :)"
      
      * tag 'sound-fix-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (42 commits)
        ALSA: hda/hdmi - allow PIN_OUT to be dynamically enabled
        ALSA: hda - add headset mic detect quirks for another Dell laptop
        ALSA: oxygen: Xonar DG(X): cleanup and minor changes
        ALSA: oxygen: Xonar DG(X): modify high-pass filter control
        ALSA: oxygen: Xonar DG(X): modify input select functions
        ALSA: oxygen: Xonar DG(X): modify capture volume functions
        ALSA: oxygen: Xonar DG(X): use headphone volume control
        ALSA: oxygen: Xonar DG(X): modify playback output select
        ALSA: oxygen: Xonar DG(X): capture from I2S channel 1, not 2
        ALSA: oxygen: Xonar DG(X): move the mixer code into another file
        ALSA: oxygen: modify CS4245 register dumping function
        ALSA: oxygen: modify adjust_dg_dac_routing function
        ALSA: oxygen: Xonar DG(X): modify DAC/ADC parameters function
        ALSA: oxygen: Xonar DG(X): modify initialization functions
        ALSA: oxygen: Xonar DG(X): add new CS4245 SPI functions
        ALSA: oxygen: additional definitions for the Xonar DG/DGX card
        ALSA: oxygen: change description of the xonar_dg.c file
        ALSA: oxygen: export oxygen_update_dac_routing symbol
        ALSA: oxygen: add mute mask for the OXYGEN_PLAY_ROUTING register
        ALSA: oxygen: modify the SPI writing function
        ...
    • Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 4e13c5d0
      Linus Torvalds authored
      Pull SCSI target updates from Nicholas Bellinger:
       "The highlights this round include:
      
        - add support for SCSI Referrals (Hannes)
        - add support for T10 DIF into target core (nab + mkp)
        - add support for T10 DIF emulation in FILEIO + RAMDISK backends (Sagi + nab)
        - add support for T10 DIF -> bio_integrity passthrough in IBLOCK backend (nab)
        - prep changes to iser-target for >= v3.15 T10 DIF support (Sagi)
        - add support for qla2xxx N_Port ID Virtualization - NPIV (Saurav + Quinn)
        - allow percpu_ida_alloc() to receive task state bitmask (Kent)
        - fix >= v3.12 iscsi-target session reset hung task regression (nab)
        - fix >= v3.13 percpu_ref se_lun->lun_ref_active race (nab)
        - fix a long-standing network portal creation race (Andy)"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits)
        target: Fix percpu_ref_put race in transport_lun_remove_cmd
        target/iscsi: Fix network portal creation race
        target: Report bad sector in sense data for DIF errors
        iscsi-target: Convert gfp_t parameter to task state bitmask
        iscsi-target: Fix connection reset hang with percpu_ida_alloc
        percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask
        iscsi-target: Pre-allocate more tags to avoid ack starvation
        qla2xxx: Configure NPIV fc_vport via tcm_qla2xxx_npiv_make_lport
        qla2xxx: Enhancements to enable NPIV support for QLOGIC ISPs with TCM/LIO.
        qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure
        IB/isert: pass scatterlist instead of cmd to fast_reg_mr routine
        IB/isert: Move fastreg descriptor creation to a function
        IB/isert: Avoid frwr notation, user fastreg
        IB/isert: seperate connection protection domains and dma MRs
        tcm_loop: Enable DIF/DIX modes in SCSI host LLD
        target/rd: Add DIF protection into rd_execute_rw
        target/rd: Add support for protection SGL setup + release
        target/rd: Refactor rd_build_device_space + rd_release_device_space
        target/file: Add DIF protection support to fd_execute_rw
        target/file: Add DIF protection init/format support
        ...
    • drivers: bus: fix CCI driver kcalloc call parameters swap · 7c762036
      Lorenzo Pieralisi authored
      This patch fixes a bug/typo in the CCI driver kcalloc usage
      that inadvertently swapped the parameters order in the
      kcalloc call and went unnoticed.
      Reported-by: Xia Feng <xiafeng@allwinnertech.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • ARM: dts: bcm28155-ap: Fix Card Detection GPIO · 94db37ad
      Tim Kryger authored
      The board schematic states that the "SD_CARD_DET_N gets pulled to GND
      when card is inserted" so the polarity has been updated to active low.
      
      Polarity is now specified with a GPIO define instead of a magic number.
      Signed-off-by: Tim Kryger <tim.kryger@linaro.org>
      Reviewed-by: Matt Porter <matt.porter@linaro.org>
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • Merge tag 'renesas-dt-fixes2-for-v3.14' of... · a00928f5
      Olof Johansson authored
      Merge tag 'renesas-dt-fixes2-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
      
      Second Round of Renesas ARM Based SoC DT Fixes for v3.14
      
      Correct i2c clock references for r8a7790 (R-Car H2) SoC
      
      The error was introduced in 72197ca7 ("ARM: shmobile: r8a7790:
      Reference clocks") which is queued up for v3.14.
      
      * tag 'renesas-dt-fixes2-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
        ARM: shmobile: r8a7790.dtsi: ficx i2c[0-3] clock reference
      Signed-off-by: Olof Johansson <olof@lixom.net>