1. 30 Dec, 2011 4 commits
    • Hillf Danton's avatar
      mm: hugetlb: fix non-atomic enqueue of huge page · b0365c8d
      Hillf Danton authored
      If a huge page is enqueued under the protection of hugetlb_lock, then the
      operation is atomic and safe.
      Signed-off-by: default avatarHillf Danton <dhillf@gmail.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Acked-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>		[2.6.37+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0365c8d
    • Andreas Schwab's avatar
      procfs: do not confuse jiffies with cputime64_t · 34845636
      Andreas Schwab authored
      Commit 2a95ea6c ("procfs: do not overflow get_{idle,iowait}_time
      for nohz") did not take into account that one some architectures jiffies
      and cputime use different units.
      
      This causes get_idle_time() to return numbers in the wrong units, making
      the idle time fields in /proc/stat wrong.
      
      Instead of converting the usec value returned by
      get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
      usecs_to_cputime64 to convert it to the correct unit of cputime64_t.
      Signed-off-by: default avatarAndreas Schwab <schwab@linux-m68k.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "Artem S. Tashkinov" <t.artem@mailcity.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      34845636
    • KOSAKI Motohiro's avatar
      mm/mempolicy.c: refix mbind_range() vma issue · e26a5114
      KOSAKI Motohiro authored
      commit 8aacc9f5 ("mm/mempolicy.c: fix pgoff in mbind vma merge") is the
      slightly incorrect fix.
      
      Why? Think following case.
      
      1. map 4 pages of a file at offset 0
      
         [0123]
      
      2. map 2 pages just after the first mapping of the same file but with
         page offset 2
      
         [0123][23]
      
      3. mbind() 2 pages from the first mapping at offset 2.
         mbind_range() should treat new vma is,
      
         [0123][23]
           |23|
           mbind vma
      
         but it does
      
         [0123][23]
           |01|
           mbind vma
      
         Oops. then, it makes wrong vma merge and splitting ([01][0123] or similar).
      
      This patch fixes it.
      
      [testcase]
        test result - before the patch
      
      	case4: 126: test failed. expect '2,4', actual '2,2,2'
             	case5: passed
      	case6: passed
      	case7: passed
      	case8: passed
      	case_n: 246: test failed. expect '4,2', actual '1,4'
      
      	------------[ cut here ]------------
      	kernel BUG at mm/filemap.c:135!
      	invalid opcode: 0000 [#4] SMP DEBUG_PAGEALLOC
      
      	(snip long bug on messages)
      
        test result - after the patch
      
      	case4: passed
             	case5: passed
      	case6: passed
      	case7: passed
      	case8: passed
      	case_n: passed
      
        source:  mbind_vma_test.c
      ============================================================
       #include <numaif.h>
       #include <numa.h>
       #include <sys/mman.h>
       #include <stdio.h>
       #include <unistd.h>
       #include <stdlib.h>
       #include <string.h>
      
      static unsigned long pagesize;
      void* mmap_addr;
      struct bitmask *nmask;
      char buf[1024];
      FILE *file;
      char retbuf[10240] = "";
      int mapped_fd;
      
      char *rubysrc = "ruby -e '\
        pid = %d; \
        vstart = 0x%llx; \
        vend = 0x%llx; \
        s = `pmap -q #{pid}`; \
        rary = []; \
        s.each_line {|line|; \
          ary=line.split(\" \"); \
          addr = ary[0].to_i(16); \
          if(vstart <= addr && addr < vend) then \
            rary.push(ary[1].to_i()/4); \
          end; \
        }; \
        print rary.join(\",\"); \
      '";
      
      void init(void)
      {
      	void* addr;
      	char buf[128];
      
      	nmask = numa_allocate_nodemask();
      	numa_bitmask_setbit(nmask, 0);
      
      	pagesize = getpagesize();
      
      	sprintf(buf, "%s", "mbind_vma_XXXXXX");
      	mapped_fd = mkstemp(buf);
      	if (mapped_fd == -1)
      		perror("mkstemp "), exit(1);
      	unlink(buf);
      
      	if (lseek(mapped_fd, pagesize*8, SEEK_SET) < 0)
      		perror("lseek "), exit(1);
      	if (write(mapped_fd, "\0", 1) < 0)
      		perror("write "), exit(1);
      
      	addr = mmap(NULL, pagesize*8, PROT_NONE,
      		    MAP_SHARED, mapped_fd, 0);
      	if (addr == MAP_FAILED)
      		perror("mmap "), exit(1);
      
      	if (mprotect(addr+pagesize, pagesize*6, PROT_READ|PROT_WRITE) < 0)
      		perror("mprotect "), exit(1);
      
      	mmap_addr = addr + pagesize;
      
      	/* make page populate */
      	memset(mmap_addr, 0, pagesize*6);
      }
      
      void fin(void)
      {
      	void* addr = mmap_addr - pagesize;
      	munmap(addr, pagesize*8);
      
      	memset(buf, 0, sizeof(buf));
      	memset(retbuf, 0, sizeof(retbuf));
      }
      
      void mem_bind(int index, int len)
      {
      	int err;
      
      	err = mbind(mmap_addr+pagesize*index, pagesize*len,
      		    MPOL_BIND, nmask->maskp, nmask->size, 0);
      	if (err)
      		perror("mbind "), exit(err);
      }
      
      void mem_interleave(int index, int len)
      {
      	int err;
      
      	err = mbind(mmap_addr+pagesize*index, pagesize*len,
      		    MPOL_INTERLEAVE, nmask->maskp, nmask->size, 0);
      	if (err)
      		perror("mbind "), exit(err);
      }
      
      void mem_unbind(int index, int len)
      {
      	int err;
      
      	err = mbind(mmap_addr+pagesize*index, pagesize*len,
      		    MPOL_DEFAULT, NULL, 0, 0);
      	if (err)
      		perror("mbind "), exit(err);
      }
      
      void Assert(char *expected, char *value, char *name, int line)
      {
      	if (strcmp(expected, value) == 0) {
      		fprintf(stderr, "%s: passed\n", name);
      		return;
      	}
      	else {
      		fprintf(stderr, "%s: %d: test failed. expect '%s', actual '%s'\n",
      			name, line,
      			expected, value);
      //		exit(1);
      	}
      }
      
      /*
            AAAA
          PPPPPPNNNNNN
          might become
          PPNNNNNNNNNN
          case 4 below
      */
      void case4(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 4);
      	mem_unbind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("2,4", retbuf, "case4", __LINE__);
      
      	fin();
      }
      
      /*
             AAAA
       PPPPPPNNNNNN
       might become
       PPPPPPPPPPNN
       case 5 below
      */
      void case5(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_bind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("4,2", retbuf, "case5", __LINE__);
      
      	fin();
      }
      
      /*
      	    AAAA
      	PPPPNNNNXXXX
      	might become
      	PPPPPPPPPPPP 6
      */
      void case6(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_bind(4, 2);
      	mem_bind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("6", retbuf, "case6", __LINE__);
      
      	fin();
      }
      
      /*
          AAAA
      PPPPNNNNXXXX
      might become
      PPPPPPPPXXXX 7
      */
      void case7(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_interleave(4, 2);
      	mem_bind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("4,2", retbuf, "case7", __LINE__);
      
      	fin();
      }
      
      /*
          AAAA
      PPPPNNNNXXXX
      might become
      PPPPNNNNNNNN 8
      */
      void case8(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	mem_bind(0, 2);
      	mem_interleave(4, 2);
      	mem_interleave(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("2,4", retbuf, "case8", __LINE__);
      
      	fin();
      }
      
      void case_n(void)
      {
      	init();
      	sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);
      
      	/* make redundunt mappings [0][1234][34][7] */
      	mmap(mmap_addr + pagesize*4, pagesize*2, PROT_READ|PROT_WRITE,
      	     MAP_FIXED|MAP_SHARED, mapped_fd, pagesize*3);
      
      	/* Expect to do nothing. */
      	mem_unbind(2, 2);
      
      	file = popen(buf, "r");
      	fread(retbuf, sizeof(retbuf), 1, file);
      	Assert("4,2", retbuf, "case_n", __LINE__);
      
      	fin();
      }
      
      int main(int argc, char** argv)
      {
      	case4();
      	case5();
      	case6();
      	case7();
      	case8();
      	case_n();
      
      	return 0;
      }
      =============================================================
      Signed-off-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Caspar Zhang <caspar@casparzhang.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: <stable@vger.kernel.org>		[3.1.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e26a5114
    • Hans de Goede's avatar
      gspca: Fix bulk mode cameras no longer working (regression fix) · 757e55c2
      Hans de Goede authored
      The new iso bandwidth calculation code accidentally has broken support
      for bulk mode cameras. This has broken the following drivers:
      finepix, jeilinj, ovfx2, ov534, ov534_9, se401, sq905, sq905c, sq930x,
      stv0680, vicam.
      
      Thix patch fixes this. Fix tested with: se401, sq905, sq905c, stv0680 & vicam
      cams.
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      757e55c2
  2. 27 Dec, 2011 2 commits
  3. 26 Dec, 2011 8 commits
  4. 25 Dec, 2011 4 commits
  5. 24 Dec, 2011 4 commits
  6. 23 Dec, 2011 14 commits
  7. 22 Dec, 2011 4 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://neil.brown.name/md · ad1fca20
      Linus Torvalds authored
      * 'for-linus' of git://neil.brown.name/md:
        md/bitmap: It is OK to clear bits during recovery.
        md: don't give up looking for spares on first failure-to-add
        md/raid5: ensure correct assessment of drives during degraded reshape.
        md/linear: fix hot-add of devices to linear arrays.
      ad1fca20
    • NeilBrown's avatar
      md/bitmap: It is OK to clear bits during recovery. · 961902c0
      NeilBrown authored
      commit d0a4bb49 introduced a
      regression which is annoying but fairly harmless.
      
      When writing to an array that is undergoing recovery (a spare
      in being integrated into the array), writing to the array will
      set bits in the bitmap, but they will not be cleared when the
      write completes.
      
      For bits covering areas that have not been recovered yet this is not a
      problem as the recovery will clear the bits.  However bits set in
      already-recovered region will stay set and never be cleared.
      This doesn't risk data integrity.  The only negatives are:
       - next time there is a crash, more resyncing than necessary will
         be done.
       - the bitmap doesn't look clean, which is confusing.
      
      While an array is recovering we don't want to update the
      'events_cleared' setting in the bitmap but we do still want to clear
      bits that have very recently been set - providing they were written to
      the recovering device.
      
      So split those two needs - which previously both depended on 'success'
      and always clear the bit of the write went to all devices.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      961902c0
    • NeilBrown's avatar
      md: don't give up looking for spares on first failure-to-add · 60fc1370
      NeilBrown authored
      Before performing a recovery we try to remove any spares that
      might not be working, then add any that might have become relevant.
      
      Currently we abort on the first spare that cannot be added.
      This is a false optimisation.
      It is conceivable that - depending on rules in the personality - a
      subsequent spare might be accepted.
      Also the loop does other things like count the available spares and
      reset the 'recovery_offset' value.
      
      If we abort early these might not happen properly.
      
      So remove the early abort.
      
      In particular if you have an array what is undergoing recovery and
      which has extra spares, then the recovery may not restart after as
      reboot as the could of 'spares' might end up as zero.
      Reported-by: default avatarAnssi Hannula <anssi.hannula@iki.fi>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      60fc1370
    • NeilBrown's avatar
      md/raid5: ensure correct assessment of drives during degraded reshape. · 30d7a483
      NeilBrown authored
      While reshaping a degraded array (as when reshaping a RAID0 by first
      converting it to a degraded RAID4) we currently get confused about
      which devices are in_sync.  In most cases we get it right, but in the
      region that is being reshaped we need to treat non-failed devices as
      in-sync when we have the data but haven't actually written it out yet.
      Reported-by: default avatarAdam Kwolek <adam.kwolek@intel.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      30d7a483