1. 12 Feb, 2008 7 commits
    • mempolicy: silently restrict nodemask to allowed nodes · 31f1de46
      KOSAKI Motohiro authored
      Kosaki Motohiro noted that "numactl --interleave=all ..." failed in the
      presence of memoryless nodes.  This patch attempts to fix that problem.
      
      Some background:
      
      numactl --interleave=all calls set_mempolicy(2) with a fully populated
      [out to MAXNUMNODES] nodemask.  set_mempolicy() [in do_set_mempolicy()]
      calls contextualize_policy() which requires that the nodemask be a
      subset of the current task's mems_allowed; else EINVAL will be returned.
      
      A task's mems_allowed will always be a subset of node_states[N_HIGH_MEMORY]
      i.e., nodes with memory.  So, a fully populated nodemask will be
      declared invalid if it includes memoryless nodes.
      
        NOTE:  the same thing will occur when running in a cpuset
               with restricted mems_allowed--for the same reason:
               the nodemask contains disallowed nodes.
      
      mbind(2), on the other hand, just masks off any nodes in the nodemask
      that are not included in the caller's mems_allowed.
      
      In each case [mbind() and set_mempolicy()], mpol_check_policy() will
      complain [again, resulting in EINVAL] if the nodemask contains any
      memoryless nodes.  This is somewhat redundant as mpol_new() will remove
      memoryless nodes for interleave policy, as will bind_zonelist()--called
      by mpol_new() for BIND policy.
      
      Proposed fix:
      
      1) modify contextualize_policy logic to:
         a) remember whether the incoming node mask is empty.
         b) if not, restrict the nodemask to allowed nodes, as is
            currently done in-line for mbind().  This guarantees
            that the resulting mask includes only nodes with memory.
      
            NOTE:  this is a [benign, IMO] change in behavior for
                   set_mempolicy().  Dis-allowed nodes will be
                   silently ignored, rather than returning an error.
      
         c) fold this code into mpol_check_policy(), replace 2 calls to
            contextualize_policy() to call mpol_check_policy() directly
            and remove contextualize_policy().
      
      2) In existing mpol_check_policy() logic, after "contextualization":
         a) MPOL_DEFAULT:  require that the incoming mask "was_empty"
         b) MPOL_{BIND|INTERLEAVE}:  require that contextualized nodemask
            contains at least one node.
         c) add a case for MPOL_PREFERRED:  if the incoming mask was not
            empty and the resulting mask IS empty, the user specified
            invalid nodes.  Return EINVAL.
         d) remove the now redundant check for memoryless nodes
      
      3) remove the now redundant masking of policy nodes for interleave
         policy from mpol_new().
      
      4) Now that mpol_check_policy() contextualizes the nodemask, remove
         the in-line nodes_and() from sys_mbind().  I believe that this
         restores mbind() to the behavior before the memoryless-nodes
         patch series.  E.g., we'll no longer treat an invalid nodemask
         with MPOL_PREFERRED as local allocation.
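
The checks in steps 1 and 2 can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the kernel code: a nodemask is modeled as one 64-bit word (one bit per node), and `check_policy_sketch` and the `SK_MPOL_*` constants are hypothetical stand-ins for the kernel's names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's policy modes. */
enum sk_mode {
    SK_MPOL_DEFAULT,
    SK_MPOL_PREFERRED,
    SK_MPOL_BIND,
    SK_MPOL_INTERLEAVE
};

/*
 * Restrict *nodes to the allowed set, then validate the result for
 * the requested mode.  Returns 0 on success or -EINVAL.
 */
static int check_policy_sketch(enum sk_mode mode, uint64_t *nodes,
                               uint64_t allowed)
{
    assert(nodes != NULL);

    int was_empty = (*nodes == 0);  /* step 1a: remember emptiness */
    if (!was_empty)
        *nodes &= allowed;          /* step 1b: silently drop disallowed nodes */
    int is_empty = (*nodes == 0);

    switch (mode) {
    case SK_MPOL_DEFAULT:           /* step 2a: mask must have been empty */
        return was_empty ? 0 : -EINVAL;
    case SK_MPOL_BIND:
    case SK_MPOL_INTERLEAVE:        /* step 2b: need at least one node */
        return is_empty ? -EINVAL : 0;
    case SK_MPOL_PREFERRED:         /* step 2c: only invalid nodes given */
        return (!was_empty && is_empty) ? -EINVAL : 0;
    }
    return -EINVAL;
}
```

With an allowed set of nodes 0 and 2, a fully populated interleave mask is accepted and trimmed to those two nodes, while a preferred mask naming only node 1 is rejected.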
      
      [ Patch history:
      
        v1 -> v2:
         - Communicate whether or not incoming node mask was empty to
           mpol_check_policy() for better error checking.
         - As suggested by David Rientjes, remove the now unused
           cpuset_nodes_subset_current_mems_allowed() from cpuset.h
      
        v2 -> v3:
         - As suggested by Kosaki Motohiro, fold the "contextualization"
           of policy nodemask into mpol_check_policy().  Looks a little
           cleaner. ]
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      31f1de46
    • Linus Torvalds's avatar
    • Be more robust about bad arguments in get_user_pages() · 900cf086
      Jonathan Corbet authored
      So I spent a while pounding my head against my monitor trying to figure
      out the vmsplice() vulnerability - how could a failure to check for
      *read* access turn into a root exploit? It turns out that it's a buffer
      overflow problem which is made easy by the way get_user_pages() is
      coded.
      
      In particular, "len" is a signed int, and it is only checked at the
      *end* of a do {} while() loop.  So, if it is passed in as zero, the loop
      will execute once and decrement len to -1.  At that point, the loop will
      proceed until the next invalid address is found; in the process, it will
      likely overflow the pages array passed in to get_user_pages().
      
      I think that, if get_user_pages() has been asked to grab zero pages,
      that's what it should do.  Thus this patch; it is, among other things,
      enough to block the (already fixed) root exploit and any others which
      might be lurking in similar code.  I also think that the number of pages
      should be unsigned, but changing the prototype of this function probably
      requires some more careful review.
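
The loop shape described above can be modeled in a few lines of C. This is a simplified sketch, not the get_user_pages() code: `sk_pages_touched`, `valid_pages`, and `guard_len` are hypothetical names, and "the next invalid address" is modeled as a fixed number of valid pages.

```c
#include <assert.h>

/*
 * Count how many pages a do {} while (len) loop would touch.
 * valid_pages models how far the loop can run before hitting an
 * invalid address; guard_len enables the up-front check on len.
 */
static int sk_pages_touched(int len, int valid_pages, int guard_len)
{
    assert(valid_pages >= 0);

    int touched = 0;

    if (guard_len && len <= 0)
        return 0;               /* asked for zero pages: do nothing */

    do {
        if (touched == valid_pages)
            break;              /* next address is invalid: stop */
        touched++;
        len--;                  /* 0 becomes -1 on the first pass */
    } while (len);              /* only checked at the end */

    return touched;
}
```

Without the guard, a request for zero pages runs until the valid range is exhausted, which is how the pages array can be overrun; with the guard, the same request touches nothing.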
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      900cf086
    • Linus Torvalds's avatar
    • Add Matt to MAINTAINERS as a SLAB allocator maintainer · c76d118e
      Pekka Enberg authored
      Matt is already the maintainer of SLOB which is one of the "SLAB" allocators in
      the kernel so add him to MAINTAINERS.
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c76d118e
    • Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev · a17b7a39
      Linus Torvalds authored
      * 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
        sata_mv: platform driver allocs dma without create
        pata_ninja32: setup changes
        pata_legacy: typo fix
        pata_amd: Note in the module description it handles Nvidia
        sata_mv: fix loop with last port
        libata: ignore deverr on SETXFER if mode is configured
        pata_via: fix SATA cable detection on cx700
      a17b7a39
    • Make topology fallback macros reference their arguments. · 271cad6d
      Andi Kleen authored
      This avoids warnings with unreferenced variables in the !NUMA case.
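
One common way to make a constant-valued fallback macro reference its argument is a cast to void joined with the comma operator. The sketch below assumes that approach; `parent_node_sketch` and `lookup_parent_sketch` are hypothetical names, and the cpu-to-node arithmetic is made up for illustration.

```c
#include <assert.h>

/*
 * Fallback for a non-NUMA build: the result is always node 0, but the
 * argument is still evaluated (and discarded), so a variable passed
 * in counts as used.
 */
#define parent_node_sketch(node) ((void)(node), 0)

static int lookup_parent_sketch(int cpu)
{
    assert(cpu >= 0);

    /* Hypothetical mapping; 'node' is referenced only by the macro. */
    int node = cpu / 4;

    return parent_node_sketch(node);
}
```

Had the macro expanded to a bare `(0)`, `node` would be assigned but never read, which is the kind of situation that triggers unused-variable warnings.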
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      271cad6d
  2. 11 Feb, 2008 18 commits
    • mlx4_core: Fix build break (missing include) · 29c27112
      Olof Johansson authored
      Commit 313abe55 ("mlx4_core: For 64-bit systems, vmap() kernel queue
      buffers") caused this to pop up on powerpc allyesconfig, looks like a
      missing include file:
      
          drivers/net/mlx4/alloc.c: In function 'mlx4_buf_alloc':
          drivers/net/mlx4/alloc.c:162: error: implicit declaration of function 'vmap'
          drivers/net/mlx4/alloc.c:162: error: 'VM_MAP' undeclared (first use in this function)
          drivers/net/mlx4/alloc.c:162: error: (Each undeclared identifier is reported only once
          drivers/net/mlx4/alloc.c:162: error: for each function it appears in.)
          drivers/net/mlx4/alloc.c:162: warning: assignment makes pointer from integer without a cast
          drivers/net/mlx4/alloc.c: In function 'mlx4_buf_free':
          drivers/net/mlx4/alloc.c:187: error: implicit declaration of function 'vunmap'
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      29c27112
    • [IA64] Fix build for sim_defconfig · 10d0aa3c
      Tony Luck authored
      Commit bdc80787 broke the build
      for this config because the sim_defconfig selects CONFIG_HZ=250
      but include/asm-ia64/param.h has an ifdef for the simulator to
      force HZ to 32.  So we ended up with a kernel/timeconst.h set
      for HZ=250 ... which then failed the check for the right HZ
      value and died with:
      
      Drop the #ifdef magic from param.h and force CONFIG_HZ=32
      directly for the simulator.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      10d0aa3c
    • sata_mv: platform driver allocs dma without create · fbf14e2f
      Byron Bradley authored
      When the sata_mv driver is used as a platform driver,
      mv_create_dma_pools() is never called so it fails when trying
      to alloc in mv_pool_start().
      Signed-off-by: Byron Bradley <byron.bbradley@gmail.com>
      Acked-by: Mark Lord <mlord@pobox.com>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      fbf14e2f
    • pata_ninja32: setup changes · 41946450
      Alan Cox authored
      Forcibly set more of the configuration at init time. This seems to fix at
      least one problem reported. We don't know what most of these bits do, but
      we do know what Windows stuffs there.
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      41946450
    • pata_legacy: typo fix · 8397248d
      Alan Cox authored
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      8397248d
    • pata_amd: Note in the module description it handles Nvidia · c9544bcb
      Alan Cox authored
      This has confused a few people so fix it
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      c9544bcb
    • sata_mv: fix loop with last port · 8f71efe2
      Yinghai Lu authored
      Commit f351b2d6 ("sata_mv: Support SoC controllers") causes a panic:
      
      scsi 4:0:0:0: Direct-Access     ATA      HITACHI HDS7225S V44O PQ: 0 ANSI: 5
      sd 4:0:0:0: [sde] 488390625 512-byte hardware sectors (250056 MB)
      sd 4:0:0:0: [sde] Write Protect is off
      sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
      sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
      sd 4:0:0:0: [sde] 488390625 512-byte hardware sectors (250056 MB)
      sd 4:0:0:0: [sde] Write Protect is off
      sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
      sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
       sde:<1>BUG: unable to handle kernel NULL pointer dereference at 000000000000001a
      IP: [<ffffffff806262c7>] mv_interrupt+0x21c/0x4cc
      PGD 0
      Oops: 0000 [1] SMP
      CPU 3
      Modules linked in:
      Pid: 0, comm: swapper Not tainted 2.6.24-smp-08636-g0afc2edf-dirty #26
      RIP: 0010:[<ffffffff806262c7>]  [<ffffffff806262c7>] mv_interrupt+0x21c/0x4cc
      RSP: 0000:ffff8102050bbec8  EFLAGS: 00010297
      RAX: 0000000000000008 RBX: 0000000000000000 RCX: 0000000000000003
      RDX: 0000000000008000 RSI: 0000000000000286 RDI: ffff8102035180e0
      RBP: 0000000000000001 R08: 0000000000000003 R09: ffff8102036613e0
      R10: 0000000000000002 R11: ffffffff8061474c R12: ffff8102035bf828
      R13: 0000000000000008 R14: ffff81020348ece8 R15: ffffc20002cb2000
      FS:  0000000000000000(0000) GS:ffff810405025700(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 000000000000001a CR3: 0000000000201000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffff810405094000, task ffff8102050b28c0)
      Stack:  000000010000000c 0002040000220400 0000001100000002 ffff81020348eda8
       0000000000000001 ffff8102035f2cc0 0000000000000000 0000000000000000
       0000000000000018 0000000000000000 0000000000000000 ffffffff80269ee8
      Call Trace:
       <IRQ>  [<ffffffff80269ee8>] ? handle_IRQ_event+0x25/0x53
       [<ffffffff8026b393>] ? handle_fasteoi_irq+0x90/0xc8
       [<ffffffff802218e2>] ? do_IRQ+0xf1/0x15f
       [<ffffffff8021df24>] ? default_idle+0x0/0x55
       [<ffffffff8021f361>] ? ret_from_intr+0x0/0xa
       <EOI>  [<ffffffff8023010c>] ? lapic_next_event+0x0/0xa
       [<ffffffff8021df55>] ? default_idle+0x31/0x55
       [<ffffffff8021df50>] ? default_idle+0x2c/0x55
       [<ffffffff8021df24>] ? default_idle+0x0/0x55
       [<ffffffff8021e00b>] ? cpu_idle+0x92/0xb8
      
      Code: 41 14 85 c0 89 44 24 14 0f 84 9d 02 00 00 f7 d0 01 d6 41 89 d5 89 41 14 8b 41 14 89 34 24 e9 7e 02 00 00 49 63 c5 49 8b 5c c6 48 <f6> 43 1a 80 4c 8b a3 20 37 00 00 0f 85 62 02 00 00 31 c9 41 83
      RIP  [<ffffffff806262c7>] mv_interrupt+0x21c/0x4cc
       RSP <ffff8102050bbec8>
      CR2: 000000000000001a
      ---[ end trace 2583b5f7a5350584 ]---
      Kernel panic - not syncing: Aiee, killing interrupt handler!
      
      last_port already includes the port0 base.  This patch changes the
      loop to use last_port directly and moves the pp assignment later.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      8f71efe2
    • libata: ignore deverr on SETXFER if mode is configured · 4055dee7
      Tejun Heo authored
      Some controllers (VIA CX700) raise device error on SETXFER even after
      mode configuration succeeded.  Update ata_dev_set_mode() such that
      device error is ignored if transfer mode is configured correctly.  To
      implement this, device is revalidated even after device error on
      SETXFER.
      
      This fixes kernel bugzilla bug 8563.
      Signed-off-by: Tejun Heo <htejun@gmail.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      4055dee7
    • pata_via: fix SATA cable detection on cx700 · 7585eb1b
      Tejun Heo authored
      The first port of cx700 is SATA.  Fix cable detection.
      Signed-off-by: Tejun Heo <htejun@gmail.com>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      7585eb1b
    • x86: remove over noisy debug printk · 81772fea
      Thomas Gleixner authored
      pageattr-test.c contains a noisy debug printk that people reported.
      The condition under which it prints (randomly tapping into a mem_map[]
      hole and not being able to c_p_a() there) is valid behavior and not
      interesting to report.
      
      Remove it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      81772fea
    • Linus Torvalds's avatar
    • Linus Torvalds's avatar
    • Prevent IDE boot oops on NUMA system · 1f07e988
      Andi Kleen authored
      Without this patch an Opteron test system here oopses at boot with
      current git.
      
      Calling to_pci_dev() on a NULL pointer gives a negative value so the
      following NULL pointer check never triggers and then an illegal address
      is referenced.  Check the unadjusted original device pointer for NULL
      instead.
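
The pitfall can be shown with a small user-space sketch. It assumes a container_of-style conversion that subtracts the offset of an embedded member; `sk_device`, `sk_pci_dev`, and the helper names are hypothetical, and the arithmetic is done on integers so the NULL case stays well defined here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sk_device { int id; };

/* 'dev' is embedded at a nonzero offset, after 'node'. */
struct sk_pci_dev {
    int node;
    struct sk_device dev;
};

/* Address of the containing sk_pci_dev for an embedded sk_device. */
static uintptr_t sk_to_pci_dev_addr(const struct sk_device *dev)
{
    return (uintptr_t)dev - offsetof(struct sk_pci_dev, dev);
}

/* Check the original pointer for NULL before converting it. */
static int sk_dev_to_node(const struct sk_device *dev)
{
    if (dev == NULL)
        return -1;  /* no device: report "no node" */

    const struct sk_pci_dev *pdev =
        (const struct sk_pci_dev *)sk_to_pci_dev_addr(dev);
    return pdev->node;
}
```

Converting a NULL `dev` yields a nonzero address (zero minus the member offset), so a NULL test on the converted pointer can never fire; testing `dev` itself does.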
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1f07e988
    • Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux · 0c0d61ca
      Linus Torvalds authored
      * 'for-linus' of git://linux-nfs.org/~bfields/linux:
        SUNRPC: Fix printk format warning
        nfsd: clean up svc_reserve_auth()
        NLM: don't requeue block if it was invalidated while GRANT_MSG was in flight
        NLM: don't reattempt GRANT_MSG when there is already an RPC in flight
        NLM: have server-side RPC clients default to soft RPC tasks
        NLM: set RPC_CLNT_CREATE_NOPING for NLM RPC clients
      0c0d61ca
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 · eedcdefb
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
        ide: remove stale comment from ide-lib.c
        ide: fix comment in init_irq()
        ide: ide_init_port() bugfix
        ide-disk: fix flush requests (take 2)
        ide: introduce CONFIG_BLK_DEV_IDEDMA_SFF option
        bast-ide: build fix
        ide-tape: remove never executed code
        ide: fix ide/legacy/gayle.c compilation
        ide-cd: replace ntohs with generic byteorder macro be16_to_cpu
        ide: remove stale version number
        pdc202xx_old: always enable burst mode
        palm_bk3710: use struct ide_port_info
        palm_bk3710: port initialization/probing bugfix
        palm_bk3710: fix ide_unregister() usage
        palm_bk3710: ide_register_hw() -> ide_device_add()
        ide: insert BUG_ON() into __ide_set_handler() (take 2)
        cs5520: remove stale comment
        ide: another possible ide panic fix for blk-end-request
      eedcdefb
    • kbuild: fix make V=1 · fab1e310
      Sam Ravnborg authored
      When make -s support was added to filechk, the
      combination created with make V=1 was not
      covered.
      Fix it by explicitly covering this case too.
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
      Cc: Mike Frysinger <vapier@gentoo.org>
      fab1e310
    • Use proper abstractions in quirk_intel_irqbalance · 9585ca02
      Matthew Wilcox authored
      Since we may not have a pci_dev for the device we need to access, we can't
      use pci_read_config_word.  But raw_pci_read is an internal implementation
      detail; it's better to use the architected pci_bus_read_config_word
      interface.  Using PCI_DEVFN instead of a mysterious constant helps
      reassure everyone that we really do intend to access device 8.
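
To see why a named encoding reads better than a bare constant, here is a small sketch of the devfn layout: the slot number in the upper five bits and the function number in the lower three. The `SK_` macros mirror the layout of the kernel's PCI_DEVFN, PCI_SLOT, and PCI_FUNC but are redefined here under hypothetical names so the example stands alone.

```c
#include <assert.h>

/* devfn packs a 5-bit slot (device) and a 3-bit function number. */
#define SK_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SK_PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define SK_PCI_FUNC(devfn)       ((devfn) & 0x07)

/* Range-checked wrapper for use outside of constant expressions. */
static unsigned int sk_devfn(unsigned int slot, unsigned int func)
{
    assert(slot < 32 && func < 8);
    return SK_PCI_DEVFN(slot, func);
}
```

Device 8, function 0 encodes as 0x40, so writing `SK_PCI_DEVFN(8, 0)` states the intent directly where the raw number would not.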
      
      [ Thanks to Grant Grundler for pointing out to me that this is exactly
        what the write immediately above this is doing -- enabling device 8 to
        respond to config space cycles.
      					- Matthew
      
        Grant also says:
      
      	"Can you also add a comment which points at the Intel
      	 documentation?
      
      	 The 'Intel E7320 Memory Controller Hub (MCH) Datasheet' at
      
      	  http://download.intel.com/design/chipsets/datashts/30300702.pdf
      
      	 Page 69 documents register F4h (DEVPRES1).
      
      	 And I just double checked that the 0xf4 register value is
      	 restored later in the quirk (obvious when you look at the code
      	 but not from the patch)"
      
        so here it is.
      					 - Linus ]
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Acked-by: Grant Grundler <grundler@parisc-linux.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9585ca02
    • selinux: support 64-bit capabilities · b68e418c
      Stephen Smalley authored
      Fix SELinux to handle 64-bit capabilities correctly, and to catch
      future extensions of capabilities beyond 64 bits to ensure that SELinux
      is properly updated.
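
The message does not show how the patch does this, so the following is only a sketch of the general idea, assuming a capability number is split into a 32-bit word index and a bit mask within that word. The `SK_` macros and `sk_cap_lookup` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a capability number into a 32-bit word index and a bit mask. */
#define SK_CAP_TO_INDEX(x) ((x) >> 5)
#define SK_CAP_TO_MASK(x)  (1u << ((x) & 31))

/*
 * Returns the word index (0 or 1) and stores the bit mask in *mask.
 * Returns -1 for capability numbers beyond 64 bits, so that a future
 * extension is caught instead of being silently mishandled.
 */
static int sk_cap_lookup(unsigned int cap, uint32_t *mask)
{
    assert(mask != NULL);

    unsigned int idx = SK_CAP_TO_INDEX(cap);

    if (idx > 1)
        return -1;  /* outside the two supported 32-bit words */

    *mask = SK_CAP_TO_MASK(cap);
    return (int)idx;
}
```

Capability 33, for example, lands in word 1 with mask 0x2, while capability 64 is rejected.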
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: James Morris <jmorris@namei.org>
      b68e418c
  3. 10 Feb, 2008 15 commits