1. 24 Jun, 2008 5 commits
    • Avi Kivity's avatar
      KVM: MMU: Fix oops on guest userspace access to guest pagetable · 6bf6a953
      Avi Kivity authored
      KVM has a heuristic to unshadow guest pagetables when userspace accesses
      them, on the assumption that most guests do not allow userspace to access
      pagetables directly. Unfortunately, in addition to unshadowing the pagetables,
      it also oopses.
      
      This never triggers on ordinary guests since sane OSes will clear the
      pagetables before assigning them to userspace, which will trigger the flood
      heuristic, unshadowing the pagetables before the first userspace access. One
      particular guest, though (Xenner) will run the kernel in userspace, triggering
      the oops.  Since the heuristic is incorrect in this case, we can simply
      remove it.
      Signed-off-by: default avatarAvi Kivity <avi@qumranet.com>
      6bf6a953
    • Marcelo Tosatti's avatar
      KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend) · 30945387
      Marcelo Tosatti authored
      kvm_mmu_pte_write() does not handle 32-bit non-PAE large page backed
      guests properly. It will instantiate two 2MB sptes pointing to the same
      physical 2MB page when a guest large pte update is trapped.
      
      Instead of duplicating code to handle this, disallow directory level
      updates to happen through kvm_mmu_pte_write(), so the two 2MB sptes
      emulating one guest 4MB pte can be correctly created by the page fault
      handling path.
      Signed-off-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: default avatarAvi Kivity <avi@qumranet.com>
      30945387
    • Marcelo Tosatti's avatar
      KVM: MMU: Fix rmap_write_protect() hugepage iteration bug · 6597ca09
      Marcelo Tosatti authored
      rmap_next() does not work correctly after rmap_remove(), as it expects
      the rmap chains not to change during iteration.  Fix (for now) by restarting
      iteration from the beginning.
      Signed-off-by: default avatarAvi Kivity <avi@qumranet.com>
      6597ca09
    • Marcelo Tosatti's avatar
      KVM: close timer injection race window in __vcpu_run · 06e05645
      Marcelo Tosatti authored
      If a timer fires after kvm_inject_pending_timer_irqs() but before
      local_irq_disable() the code will enter guest mode and only inject such
      timer interrupt the next time an unrelated event causes an exit.
      
      It would be simpler if the timer->pending irq conversion could be done
      with IRQ's disabled, so that the above problem cannot happen.
      
      For now introduce a new vcpu requests bit to cancel guest entry.
      Signed-off-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: default avatarAvi Kivity <avi@qumranet.com>
      06e05645
    • Marcelo Tosatti's avatar
      KVM: Fix race between timer migration and vcpu migration · d4acf7e7
      Marcelo Tosatti authored
      A guest vcpu instance can be scheduled to a different physical CPU
      between the test for KVM_REQ_MIGRATE_TIMER and local_irq_disable().
      
      If that happens, the timer will only be migrated to the current pCPU on
      the next exit, meaning that guest LAPIC timer event can be delayed until
      a host interrupt is triggered.
      
      Fix it by cancelling guest entry if any vcpu request is pending.  This
      has the side effect of nicely consolidating vcpu->requests checks.
      Signed-off-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: default avatarAvi Kivity <avi@qumranet.com>
      d4acf7e7
  2. 23 Jun, 2008 18 commits
  3. 22 Jun, 2008 1 commit
  4. 21 Jun, 2008 9 commits
    • Christoph Lameter's avatar
      Slab: Fix memory leak in fallback_alloc() · 481c5346
      Christoph Lameter authored
      The zonelist patches caused the loop that checks for available
      objects in permitted zones to not terminate immediately. One object
      per zone per allocation may be allocated and then abandoned.
      
      Break the loop when we have successfully allocated one object.
      Signed-off-by: default avatarChristoph Lameter <clameter@sgi.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      481c5346
    • Linus Torvalds's avatar
      Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 62a8efe6
      Linus Torvalds authored
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        Ext4: Fix online resize block group descriptor corruption
      62a8efe6
    • Linus Torvalds's avatar
      Merge branch 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6 · bec95aab
      Linus Torvalds authored
      * 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6:
        hwmon: (lm75) sensor reading bugfix
        hwmon: (abituguru3) update driver detection
        hwmon: (w83791d) new maintainer
        hwmon: (abituguru3) Identify Abit AW8D board as such
        hwmon: Update the sysfs interface documentation
        hwmon: (adt7473) Initialize max_duty_at_overheat before use
        hwmon: (lm85) Fix function RANGE_TO_REG()
      bec95aab
    • Bernhard Walle's avatar
      Add return value to reserve_bootmem_node() · 71c2742f
      Bernhard Walle authored
      This patch changes the function reserve_bootmem_node() from void to int,
      returning -ENOMEM if the allocation fails.
      
      This fixes a build problem on x86 with CONFIG_KEXEC=y and
      CONFIG_NEED_MULTIPLE_NODES=y
      Signed-off-by: default avatarBernhard Walle <bwalle@suse.de>
      Reported-by: default avatarAdrian Bunk <bunk@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      71c2742f
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · a1921443
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
        netns: Don't receive new packets in a dead network namespace.
        sctp: Make sure N * sizeof(union sctp_addr) does not overflow.
        pppoe: warning fix
        ipv6: Drop packets for loopback address from outside of the box.
        ipv6: Remove options header when setsockopt's optlen is 0
        mac80211: detect driver tx bugs
      a1921443
    • Eric W. Biederman's avatar
      netns: Don't receive new packets in a dead network namespace. · b9f75f45
      Eric W. Biederman authored
      Alexey Dobriyan <adobriyan@gmail.com> writes:
      > Subject: ICMP sockets destruction vs ICMP packets oops
      
      > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
      > icmp_reply() wants ICMP socket.
      >
      > Steps to reproduce:
      >
      > 	launch shell in new netns
      > 	move real NIC to netns
      > 	setup routing
      > 	ping -i 0
      > 	exit from shell
      >
      > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > PGD 17f3cd067 PUD 17f3ce067 PMD 0 
      > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
      > CPU 0 
      > Modules linked in: usblp usbcore
      > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
      > RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
      > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
      > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
      > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
      > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
      > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
      > FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
      > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
      > Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
      >  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
      >  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
      > Call Trace:
      >  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
      >  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
      >  [<ffffffff803fd645>] icmp_echo+0x65/0x70
      >  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
      >  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
      >  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
      >  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
      >  [<ffffffff803be57d>] process_backlog+0x9d/0x100
      >  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
      >  [<ffffffff80237be5>] __do_softirq+0x75/0x100
      >  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
      >  [<ffffffff8020f085>] do_softirq+0x65/0xa0
      >  [<ffffffff80237af7>] irq_exit+0x97/0xa0
      >  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
      >  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
      >  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
      >  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
      >  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
      >  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
      >  [<ffffffff8040f380>] ? rest_init+0x70/0x80
      > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
      > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
      > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
      > RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      >  RSP <ffffffff8057fc30>
      > CR2: 0000000000000000
      > ---[ end trace ea161157b76b33e8 ]---
      > Kernel panic - not syncing: Aiee, killing interrupt handler!
      
      Receiving packets while we are cleaning up a network namespace is a
      racy proposition. It is possible when the packet arrives that we have
      removed some but not all of the state we need to fully process it.  We
      have the choice of either playing wack-a-mole with the cleanup routines
      or simply dropping packets when we don't have a network namespace to
      handle them.
      
      Since the check looks inexpensive in netif_receive_skb let's just
      drop the incoming packets.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b9f75f45
    • David S. Miller's avatar
      sctp: Make sure N * sizeof(union sctp_addr) does not overflow. · 735ce972
      David S. Miller authored
      As noticed by Gabriel Campana, the kmalloc() length arg
      passed in by sctp_getsockopt_local_addrs_old() can overflow
      if ->addr_num is large enough.
      
      Therefore, enforce an appropriate limit.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      735ce972
    • Stephen Hemminger's avatar
      pppoe: warning fix · 2645a3c3
      Stephen Hemminger authored
      Fix warning:
      drivers/net/pppoe.c: In function 'pppoe_recvmsg':
      drivers/net/pppoe.c:945: warning: comparison of distinct pointer types lacks a cast
      because skb->len is unsigned int and total_len is size_t
      Signed-off-by: default avatarStephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2645a3c3
    • Linus Torvalds's avatar
      b732d968
  5. 20 Jun, 2008 7 commits