1. 14 Feb, 2015 19 commits
    • Tejun Heo's avatar
      cpumask: always use nr_cpu_ids in formatting and parsing functions · 513e3d2d
      Tejun Heo authored
      bitmap implements two variants of scnprintf functions to format a bitmap
      into a string and cpumask and nodemask wrap them to provide equivalent
      interfaces.  The scnprintf family of functions require a string buffer as
      an output target which complicates code paths which just want to print out
      the mask through printk for informational or debug purposes as they have
      to worry about how large the buffer should be and whether it's too large
      to allocate on stack.
      
      Neither cpumask or nodemask provides a guildeline on how large the target
      buffer should be forcing users come up with their own solutions - some
      allocate an arbitrarily sized buffer which is small enough to allocate on
      stack but may be too short in corner cases, other come up with a custom
      upper limit calculation considering the output format, some allocate the
      buffer dynamically while one resorted to using lock to synchronize access
      to a static buffer.
      
      This is an artificial problem which is being solved repeatedly for no
      benefit.  In a lot of cases, the output area already exists and can be
      targeted directly making the intermediate buffer unnecessary.  This
      patchset teaches printf family of functions how to format bitmaps and
      replace the dedicated formatting functions with it.
      
      Pointer formatting is extended to cover bitmap formatting.  It uses the
      field width for the number of bits instead of precision.  The format used
      is '%*pb[l]', with the optional trailing 'l' specifying list format
      instead of hex masks.  For more details, please see 0002.
      
      This patch (of 31):
      
      Currently, the formatting and parsing functions in cpumask.h use
      nr_cpumask_bits like other cpumask functions; however, nr_cpumask_bits
      is either NR_CPUS or nr_cpu_ids depending on CONFIG_CPUMASK_OFFSTACK.
      This leads to inconsistent behaviors.
      
      With CONFIG_NR_CPUS=512 and !CONFIG_CPUMASK_OFFSTACK
      
        # cat /sys/devices/virtual/net/lo/queues/rx-0/rps_cpus
        00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000
        # cat /proc/self/status | grep Cpus_allowed:
        Cpus_allowed:   f
      
      With CONFIG_NR_CPUS=1024 and CONFIG_CPUMASK_OFFSTACK (fedora default)
      
        # cat /sys/devices/virtual/net/lo/queues/rx-0/rps_cpus
        0
        # cat /proc/self/status | grep Cpus_allowed:
        Cpus_allowed:   f
      
      Note that /proc/self/status is always using nr_cpu_ids regardless of
      config.  This is because seq cpumask formattings functions always use
      nr_cpu_ids.
      
      Given that the same output fields may switch between the two forms,
      converging on nr_cpu_ids always isn't too likely to surprise userland.
      This patch updates the formatting and parsing functions in cpumask.h
      to always use nr_cpu_ids.  There's no point in dealing with CPUs which
      aren't even possible on the machine.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: "John W. Linville" <linville@tuxdriver.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Acked-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      513e3d2d
    • Jan Kara's avatar
      lib/genalloc.c: check result of devres_alloc() · 310ee9e8
      Jan Kara authored
      devm_gen_pool_create() calls devres_alloc() and dereferences its result
      without checking whether devres_alloc() succeeded.  Check for error and
      bail out if it happened.
      
      Coverity-id 1016493.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      310ee9e8
    • Rasmus Villemoes's avatar
      lib/string.c: improve strrchr() · 8da53d45
      Rasmus Villemoes authored
      Instead of potentially passing over the string twice in case c is not
      found, just keep track of the last occurrence.  According to
      bloat-o-meter, this also cuts the generated code by a third (54 vs 36
      bytes).  Oh, and we get rid of those 7-space indented lines.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8da53d45
    • Andrzej Hajda's avatar
      fs/namespace: convert devname allocation to kstrdup_const · fcc139ae
      Andrzej Hajda authored
      VFS frequently performs duplication of strings located in read-only memory
      section.  Replacing kstrdup by kstrdup_const allows to avoid such
      operations.
      Signed-off-by: default avatarAndrzej Hajda <a.hajda@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Mike Turquette <mturquette@linaro.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fcc139ae
    • Andrzej Hajda's avatar
      mm/slab: convert cache name allocations to kstrdup_const · 3dec16ea
      Andrzej Hajda authored
      slab frequently performs duplication of strings located in read-only
      memory section.  Replacing kstrdup by kstrdup_const allows to avoid such
      operations.
      
      [akpm@linux-foundation.org: make the handling of kmem_cache.name const-correct]
      Signed-off-by: default avatarAndrzej Hajda <a.hajda@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Mike Turquette <mturquette@linaro.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3dec16ea
    • Andrzej Hajda's avatar
      clk: convert clock name allocations to kstrdup_const · 612936f2
      Andrzej Hajda authored
      Clock subsystem frequently performs duplication of strings located in
      read-only memory section.  Replacing kstrdup by kstrdup_const allows to
      avoid such operations.
      Signed-off-by: default avatarAndrzej Hajda <a.hajda@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Mike Turquette <mturquette@linaro.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      612936f2
    • Tejun Heo's avatar
      kernfs: remove KERNFS_STATIC_NAME · dfeb0750
      Tejun Heo authored
      When a new kernfs node is created, KERNFS_STATIC_NAME is used to avoid
      making a separate copy of its name.  It's currently only used for sysfs
      attributes whose filenames are required to stay accessible and unchanged.
      There are rare exceptions where these names are allocated and formatted
      dynamically but for the vast majority of cases they're consts in the
      rodata section.
      
      Now that kernfs is converted to use kstrdup_const() and kfree_const(),
      there's little point in keeping KERNFS_STATIC_NAME around.  Remove it.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Andrzej Hajda <a.hajda@samsung.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dfeb0750
    • Andrzej Hajda's avatar
      kernfs: convert node name allocation to kstrdup_const · 75287a67
      Andrzej Hajda authored
      sysfs frequently performs duplication of strings located in read-only
      memory section.  Replacing kstrdup by kstrdup_const allows to avoid such
      operations.
      Signed-off-by: default avatarAndrzej Hajda <a.hajda@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Mike Turquette <mturquette@linaro.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      75287a67
    • Andrzej Hajda's avatar
      mm/util: add kstrdup_const · a4bb1e43
      Andrzej Hajda authored
      kstrdup() is often used to duplicate strings where neither source neither
      destination will be ever modified.  In such case we can just reuse the
      source instead of duplicating it.  The problem is that we must be sure
      that the source is non-modifiable and its life-time is long enough.
      
      I suspect the good candidates for such strings are strings located in
      kernel .rodata section, they cannot be modifed because the section is
      read-only and their life-time is equal to kernel life-time.
      
      This small patchset proposes alternative version of kstrdup -
      kstrdup_const, which returns source string if it is located in .rodata
      otherwise it fallbacks to kstrdup.  To verify if the source is in
      .rodata function checks if the address is between sentinels
      __start_rodata, __end_rodata.  I guess it should work with all
      architectures.
      
      The main patch is accompanied by four patches constifying kstrdup for
      cases where situtation described above happens frequently.
      
      I have tested the patchset on mobile platform (exynos4210-trats) and it
      saves 3272 string allocations.  Since minimal allocation is 32 or 64
      bytes depending on Kconfig options the patchset saves respectively about
      100KB or 200KB of memory.
      
      Stats from tested platform show that the main offender is sysfs:
      
      By caller:
        2260 __kernfs_new_node
          631 clk_register+0xc8/0x1b8
          318 clk_register+0x34/0x1b8
            51 kmem_cache_create
            12 alloc_vfsmnt
      
      By string (with count >= 5):
          883 power
          876 subsystem
          135 parameters
          132 device
           61 iommu_group
          ...
      
      This patch (of 5):
      
      Add an alternative version of kstrdup which returns pointer to constant
      char array.  The function checks if input string is in persistent and
      read-only memory section, if yes it returns the input string, otherwise it
      fallbacks to kstrdup.
      
      kstrdup_const is accompanied by kfree_const performing conditional memory
      deallocation of the string.
      Signed-off-by: default avatarAndrzej Hajda <a.hajda@samsung.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Mike Turquette <mturquette@linaro.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a4bb1e43
    • Daniel Borkmann's avatar
      lib: crc32: constify crc32 lookup table · f5e38b92
      Daniel Borkmann authored
      Commit 8f243af4 ("sections: fix const sections for crc32 table")
      removed the compile-time generated crc32 tables from the RO sections,
      because it conflicts with the definition of __cacheline_aligned which
      puts all such aligned data into .data..cacheline_aligned section
      optimized for wasting less space, and can cause alignment issues when
      used in combination with const with some gcc versions like 4.7.0 due to
      a gcc bug [1].
      
      Given that most gcc versions should have the fix by now, we can just use
      ____cacheline_aligned, which only aligns the data but doesn't move it
      into specific sections as opposed to __cacheline_aligned.  In case of
      gcc versions having the mentioned bug, the alignment attribute will have
      no effect, but the data will still be made RO.
      
      After patch tables are in RO:
      
        $ nm -v lib/crc32.o | grep -1 -E "crc32c?table"
        0000000000000000 t arch_local_irq_enable
        0000000000000000 r crc32ctable_le
        0000000000000000 t crc32_exit
        --
        0000000000000960 t test_buf
        0000000000002000 r crc32table_be
        0000000000004000 r crc32table_le
        000000001d1056e5 A __crc_crc32_be
      
        [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52181Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Joe Mario <jmario@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f5e38b92
    • Rasmus Villemoes's avatar
      lib: bitmap: remove redundant code from __bitmap_shift_left · 7f590657
      Rasmus Villemoes authored
      The first of these conditionals is completely redundant: If k == lim-1, we
      must have off==0, so the second conditional will also trigger and then it
      wouldn't matter if upper had some high bits set.  But the second
      conditional is in fact also redundant, since it only serves to clear out
      some high-order "don't care" bits of dst, about which no guarantee is
      made.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7f590657
    • Rasmus Villemoes's avatar
      lib: bitmap: eliminate branch in __bitmap_shift_left · 6d874eca
      Rasmus Villemoes authored
      We can shift the bits from lower and upper into place before assembling
      dst[k + off]; moving the shift of lower into the branch where we already
      know that rem is non-zero allows us to remove a conditional.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6d874eca
    • Rasmus Villemoes's avatar
      lib: bitmap: change bitmap_shift_left to take unsigned parameters · dba94c25
      Rasmus Villemoes authored
      gcc can generate slightly better code for stuff like "nbits %
      BITS_PER_LONG" when it knows nbits is not negative.  Since negative size
      bitmaps or shift amounts don't make sense, change these parameters of
      bitmap_shift_right to unsigned.
      
      If off >= lim (which requires shift >= nbits), k is initialized with a
      large positive value, but since I've let k continue to be signed, the loop
      will never run and dst will be zeroed as expected.  Inside the loop, k is
      guaranteed to be non-negative, so the fact that it is promoted to unsigned
      in the various expressions it appears in is harmless.
      
      Also use "shift" and "nbits" consistently for the parameter names.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dba94c25
    • Rasmus Villemoes's avatar
      lib: bitmap: yet another simplification in __bitmap_shift_right · cfac1d08
      Rasmus Villemoes authored
      If left is 0, we can just let mask be ~0UL, so that anding with it is a
      no-op.  Conveniently, BITMAP_LAST_WORD_MASK provides precisely what we
      need, and we can eliminate left.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cfac1d08
    • Rasmus Villemoes's avatar
      lib: bitmap: remove redundant code from __bitmap_shift_right · 97fb8e94
      Rasmus Villemoes authored
      If the condition k==lim-1 is true, we must have off == 0 (otherwise, k
      could never become that big).  But in that case we have upper == 0 and
      hence dst[k] == (src[k] & mask) >> rem.  Since mask consists of a
      consecutive range of bits starting from the LSB, anding dst[k] with mask
      is a no-op.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      97fb8e94
    • Rasmus Villemoes's avatar
      lib: bitmap: eliminate branch in __bitmap_shift_right · 9d8a6b2a
      Rasmus Villemoes authored
      We can shift the bits from lower and upper into place before assembling
      dst[k]; moving the shift of upper into the branch where we already know
      that rem is non-zero allows us to remove a conditional.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9d8a6b2a
    • Rasmus Villemoes's avatar
      lib: bitmap: change bitmap_shift_right to take unsigned parameters · 2fbad299
      Rasmus Villemoes authored
      I've previously changed the nbits parameter of most bitmap_* functions to
      unsigned; now it is bitmap_shift_{left,right}'s turn.  This alone saves
      some .text, but while at it I found that there were a few other things one
      could do.  The end result of these seven patches is
      
        $ scripts/bloat-o-meter /tmp/bitmap.o.{old,new}
        add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-328 (-328)
        function                                     old     new   delta
        __bitmap_shift_right                         384     226    -158
        __bitmap_shift_left                          306     136    -170
      
      and less importantly also a smaller stack footprint
      
        $ stack-o-meter.pl master bitmap
        file                 function                       old  new  delta
        lib/bitmap.o         __bitmap_shift_right             24    8  -16
        lib/bitmap.o         __bitmap_shift_left              24    0  -24
      
      For each pair of 0 <= shift <= nbits <= 256 I've tested the end result
      with a few randomly filled src buffers (including garbage beyond nbits),
      in each case verifying that the shift {left,right}-most bits of dst are
      zero and the remaining nbits-shift bits correspond to src, so I'm fairly
      confident I didn't screw up.  That hasn't stopped me from being wrong
      before, though.
      
      This patch (of 7):
      
      gcc can generate slightly better code for stuff like "nbits %
      BITS_PER_LONG" when it knows nbits is not negative.  Since negative size
      bitmaps or shift amounts don't make sense, change these parameters of
      bitmap_shift_right to unsigned.
      
      The expressions involving "lim - 1" are still ok, since if lim is 0 the
      loop is never executed.
      
      Also use "shift" and "nbits" consistently for the parameter names.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2fbad299
    • Rasmus Villemoes's avatar
      lib/bitmap.c: elide bitmap_copy_le on little-endian · e8f24278
      Rasmus Villemoes authored
      On little-endian, there's no reason to have an extra, presumably less
      efficient, way of copying a bitmap.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e8f24278
    • Rasmus Villemoes's avatar
      lib/bitmap.c: change prototype of bitmap_copy_le · 9b6c2d2e
      Rasmus Villemoes authored
      Make the prototype of bitmap_copy_le the same as bitmap_copy's.  All other
      bitmap_* functions take unsigned long* parameters; there's no reason this
      should be special.
      
      The only current user is the static inline uwb_mas_bm_copy_le, which
      already does the void* laundering, so the end users can pass their u8 or
      __le32 buffers without a cast.
      
      Furthermore, this allows us to simply let bitmap_copy_le be an alias for
      bitmap_copy on little-endian; see next patch.
      Signed-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9b6c2d2e
  2. 13 Feb, 2015 21 commits