1. 03 Feb, 2009 1 commit
    • modules: Use a better scheme for refcounting · 720eba31
      Eric Dumazet authored
      The current refcounting for modules (done when CONFIG_MODULE_UNLOAD=y)
      uses a lot of memory.
      
      Each 'struct module' contains an [NR_CPUS] array of full cache lines.
      
      This patch uses existing infrastructure (percpu_modalloc() &
      percpu_modfree()) to allocate percpu space for the refcount storage.
      
      Instead of wasting NR_CPUS*128 bytes (on i386), we now use
      nr_cpu_ids*sizeof(local_t) bytes.
      
      On a typical distro where NR_CPUS=8, shipping 2000 modules, we reduce
      the size of the module files by about 2 MB (roughly 1 KB per module).
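
      To make the arithmetic concrete, a small userspace sketch (the
      constants are the illustrative i386 values from the text above;
      nr_cpu_ids = 2 is an assumed boot-time value):

          #include <stdio.h>

          #define NR_CPUS          8    /* compile-time maximum           */
          #define CACHE_LINE_SIZE  128  /* old scheme pads each counter   */

          int main(void)
          {
                  unsigned int nr_cpu_ids = 2;  /* CPUs present at boot   */

                  size_t before = NR_CPUS * CACHE_LINE_SIZE; /* [NR_CPUS] */
                  size_t after  = nr_cpu_ids * sizeof(long); /* local_t   */

                  printf("per module: %zu bytes before, %zu bytes after\n",
                         before, after);
                  printf("2000 modules save about %zu KB\n",
                         2000 * (before - after) / 1024);
                  return 0;
          }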
      
      Instead of keeping all refcounters on the same memory node - with TLB
      misses because of vmalloc() - the new implementation has better NUMA
      properties: thanks to percpu storage, each CPU uses storage on its
      preferred node.
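
      As a rough illustration of the addressing scheme that gives each CPU
      node-local storage (a sketch based on the description above; the
      struct and helper names are assumptions, not quotes from the patch):

          #include <linux/percpu.h>
          #include <asm/local.h>

          struct module_sketch {
                  /* percpu space from percpu_modalloc(); total cost is
                   * nr_cpu_ids * sizeof(local_t) bytes */
                  char *refptr;
          };

          /* Each CPU's counter lives inside that CPU's own percpu region,
           * so the storage naturally sits on the CPU's preferred node. */
          static inline local_t *sketch_ref_addr(struct module_sketch *mod,
                                                 int cpu)
          {
                  return (local_t *)(mod->refptr + per_cpu_offset(cpu));
          }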
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 01 Feb, 2009 1 commit
    • Manually revert "mlock: downgrade mmap sem while populating mlocked regions" · 27421e21
      Linus Torvalds authored
      This essentially reverts commit 8edb08ca.
      
      It downgraded our mmap semaphore to a read-lock while mlocking pages, in
      order to allow other threads (and external accesses like "ps" et al) to
      walk the vma lists and take page faults etc.  Which is a nice idea, but
      the implementation does not work.
      
      Because we cannot upgrade the lock back to a write lock without
      releasing the mmap semaphore, the code had to release the lock entirely
      and then re-take it as a write lock.  However, that meant that the
      caller could lose the vma chain it was following, since another thread
      could now come in and mmap/munmap the range.
      
      The code tried to work around that by just looking up the vma again and
      erroring out if that happened, but quite frankly, that was just a buggy
      hack that doesn't actually protect against anything (the other thread
      could just have replaced the vma with another one instead of totally
      unmapping it).
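
      Schematically, the reverted pattern looked something like this (a
      sketch of the shape of the problem, not the actual diff; the error
      path is an assumption):

          downgrade_write(&mm->mmap_sem);  /* write -> read, let others in  */
          /* ... fault in the mlocked pages under the read lock ... */
          up_read(&mm->mmap_sem);          /* no read->write upgrade exists */
          down_write(&mm->mmap_sem);       /* so the lock is fully dropped  */
          vma = find_vma(mm, start);       /* old vma chain may be gone now */
          if (!vma || vma->vm_start != start)
                  return -EAGAIN;          /* cannot tell a replaced mapping
                                              from the original one */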
      
      The only way to downgrade to a read map _reliably_ is to do it at the
      end, which is likely the right thing to do: do all the 'vma' operations
      with the write-lock held, then downgrade to a read after completing them
      all, and then do the "populate the newly mlocked regions" while holding
      just the read lock.  And then just drop the read-lock and return to user
      space.
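
      In that scheme the locking discipline would look roughly like this
      (a sketch; mlock_fixup_all() and populate_mlocked() are placeholder
      names, not kernel functions):

          down_write(&mm->mmap_sem);       /* exclusive: vma edits are safe */
          mlock_fixup_all(mm);             /* all vma modifications first   */
          downgrade_write(&mm->mmap_sem);  /* atomic write -> read: the
                                              semaphore is never dropped    */
          populate_mlocked(mm);            /* fault in pages, read lock only */
          up_read(&mm->mmap_sem);          /* done: return to user space    */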
      
      The (perhaps somewhat simpler) alternative is to make all the callers
      of mlock_vma_pages_range() aware that the mmap lock was dropped, and
      have them re-grab the mmap semaphore if they need to mlock more than
      one vma region.
      
      So we can do this "downgrade mmap sem while populating mlocked regions"
      thing right, but the way it was done here was absolutely not correct.
      Thus the revert, in the expectation that we will do it all correctly
      some day.
      
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>