1. 10 Oct, 2013 3 commits
    • Bharat Bhushan's avatar
      kvm: ppc: booke: check range page invalidation progress on page setup · 40fde70d
      Bharat Bhushan authored
      When the MM code is invalidating a range of pages, it calls the KVM
      kvm_mmu_notifier_invalidate_range_start() notifier function, which calls
      kvm_unmap_hva_range(), which arranges to flush all the TLBs for guest pages.
      However, the Linux PTEs for the range being flushed are still valid at
      that point.  We are not supposed to establish any new references to pages
      in the range until the ...range_end() notifier gets called.
      The PPC-specific KVM code doesn't get any explicit notification of that;
      instead, we are supposed to use mmu_notifier_retry() to test whether we
      are or have been inside a range flush notifier pair while we have been
      referencing a page.
      
      This patch calls the mmu_notifier_retry() while mapping the guest
      page to ensure we are not referencing a page when in range invalidation.
      
      This call is inside a region locked with kvm->mmu_lock, which is the
      same lock that is called by the KVM MMU notifier functions, thus
      ensuring that no new notification can proceed while we are in the
      locked region.
      Signed-off-by: default avatarBharat Bhushan <bharat.bhushan@freescale.com>
      Acked-by: default avatarAlexander Graf <agraf@suse.de>
      [Backported to 3.12 - Paolo]
      Reviewed-by: default avatarBharat Bhushan <bharat.bhushan@freescale.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      40fde70d
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Fix typo in saving DSCR · cfc86025
      Paul Mackerras authored
      This fixes a typo in the code that saves the guest DSCR (Data Stream
      Control Register) into the kvm_vcpu_arch struct on guest exit.  The
      effect of the typo was that the DSCR value was saved in the wrong place,
      so changes to the DSCR by the guest didn't persist across guest exit
      and entry, and some host kernel memory got corrupted.
      
      Cc: stable@vger.kernel.org [v3.1+]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Acked-by: default avatarAlexander Graf <agraf@suse.de>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      cfc86025
    • Gleb Natapov's avatar
      KVM: nVMX: fix shadow on EPT · d0d538b9
      Gleb Natapov authored
      72f85795 broke shadow on EPT. This patch reverts it and fixes PAE
      on nEPT (which reverted commit fixed) in other way.
      
      Shadow on EPT is now broken because while L1 builds shadow page table
      for L2 (which is PAE while L2 is in real mode) it never loads L2's
      GUEST_PDPTR[0-3].  They do not need to be loaded because without nested
      virtualization HW does this during guest entry if EPT is disabled,
      but in our case L0 emulates L2's vmentry while EPT is enables, so we
      cannot rely on vmcs12->guest_pdptr[0-3] to contain up-to-date values
      and need to re-read PDPTEs from L2 memory. This is what kvm_set_cr3()
      is doing, but by clearing cache bits during L2 vmentry we drop values
      that kvm_set_cr3() read from memory.
      
      So why the same code does not work for PAE on nEPT? kvm_set_cr3()
      reads pdptes into vcpu->arch.walk_mmu->pdptrs[]. walk_mmu points to
      vcpu->arch.nested_mmu while nested guest is running, but ept_load_pdptrs()
      uses vcpu->arch.mmu which contain incorrect values. Fix that by using
      walk_mmu in ept_(load|save)_pdptrs.
      Signed-off-by: default avatarGleb Natapov <gleb@redhat.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Tested-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      d0d538b9
  2. 03 Oct, 2013 22 commits
  3. 02 Oct, 2013 3 commits
  4. 01 Oct, 2013 12 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c31eeace
      Linus Torvalds authored
      Pull networking changes from David Miller:
      
       1) Multiply in netfilter IPVS can overflow when calculating destination
          weight.  From Simon Kirby.
      
       2) Use after free fixes in IPVS from Julian Anastasov.
      
       3) SFC driver bug fixes from Daniel Pieczko.
      
       4) Memory leak in pcan_usb_core failure paths, from Alexey Khoroshilov.
      
       5) Locking and encapsulation fixes to serial line CAN driver, from
          Andrew Naujoks.
      
       6) Duplex and VF handling fixes to bnx2x driver from Yaniv Rosner,
          Eilon Greenstein, and Ariel Elior.
      
       7) In lapb, if no other packets are outstanding, T1 timeouts actually
          stall things and no packet gets sent.  Fix from Josselin Costanzi.
      
       8) ICMP redirects should not make it to the socket error queues, from
          Duan Jiong.
      
       9) Fix bugs in skge DMA mapping error handling, from Nikulas Patocka.
      
      10) Fix setting of VLAN priority field on via-rhine driver, from Roget
          Luethi.
      
      11) Fix TX stalls and VLAN promisc programming in be2net driver from
          Ajit Khaparde.
      
      12) Packet padding doesn't get handled correctly in new usbnet SG
          support code, from Ming Lei.
      
      13) Fix races in netdevice teardown wrt.  network namespace closing.
          From Eric W.  Biederman.
      
      14) Fix potential missed initialization of net_secret if not TCP
          connections are openned.  From Eric Dumazet.
      
      15) Cinterion PLXX product ID in qmi_wwan driver is wrong, from
          Aleksander Morgado.
      
      16) skb_cow_head() can change skb->data and thus packet header pointers,
          don't use stale ip_hdr reference in ip_tunnel code.
      
      17) Backend state transition handling fixes in xen-netback, from Paul
          Durrant.
      
      18) Packet offset for AH protocol is handled wrong in flow dissector,
          from Eric Dumazet.
      
      19) Taking down an fq packet scheduler instance can leave stale packets
          in the queues, fix from Eric Dumazet.
      
      20) Fix performance regressions introduced by TCP Small Queues.  From
          Eric Dumazet.
      
      21) IPV6 GRE tunneling code calculates max_headroom incorrectly, from
          Hannes Frederic Sowa.
      
      22) Multicast timer handlers in ipv4 and ipv6 can be the last and final
          reference to the ipv4/ipv6 specific network device state, so use the
          reference put that will check and release the object if the
          reference hits zero.  From Salam Noureddine.
      
      23) Fix memory corruption in ip_tunnel driver, and use skb_push()
          instead of __skb_push() so that similar bugs are less hard to find.
          From Steffen Klassert.
      
      24) Add forgotten hookup of rtnl_ops in SIT and ip6tnl drivers, from
          Nicolas Dichtel.
      
      25) fq scheduler doesn't accurately rate limit in certain circumstances,
          from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
        pkt_sched: fq: rate limiting improvements
        ip6tnl: allow to use rtnl ops on fb tunnel
        sit: allow to use rtnl ops on fb tunnel
        ip_tunnel: Remove double unregister of the fallback device
        ip_tunnel_core: Change __skb_push back to skb_push
        ip_tunnel: Add fallback tunnels to the hash lists
        ip_tunnel: Fix a memory corruption in ip_tunnel_xmit
        qlcnic: Fix SR-IOV configuration
        ll_temac: Reset dma descriptors indexes on ndo_open
        skbuff: size of hole is wrong in a comment
        ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put
        ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put
        ethernet: moxa: fix incorrect placement of __initdata tag
        ipv6: gre: correct calculation of max_headroom
        powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file
        Revert "powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file"
        bonding: Fix broken promiscuity reference counting issue
        tcp: TSQ can use a dynamic limit
        dm9601: fix IFF_ALLMULTI handling
        pkt_sched: fq: qdisc dismantle fixes
        ...
      c31eeace
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 0b936842
      Linus Torvalds authored
      Pull sparc fix from David Miller:
       "Just a single bug fix to a regression added during some strlcpy()
        conversions"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Fix buggy strlcpy() conversion in ldom_reboot().
      0b936842
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 517bf8fc
      Linus Torvalds authored
      Pull vfs lru leak fix from Al Viro:
       "The fix in "super: fix for destroy lrus" didn't - they need to be
        destroyed, all right, but that's the wrong place..."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs/super.c: fix lru_list leak for real
      517bf8fc
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · 77c4ad8e
      Linus Torvalds authored
      Pull two KVM fixes from Gleb Natapov.
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: VMX: do not check bit 12 of EPT violation exit qualification when undefined
        ARM: kvm: rename cpu_reset to avoid name clash
      77c4ad8e
    • Al Viro's avatar
      fs/super.c: fix lru_list leak for real · c2d22ecd
      Al Viro authored
      Freeing ->s_{inode,dentry}_lru in deactivate_locked_super() is wrong;
      the right place is destroy_super().  As it is, we leak them if sget()
      decides that new superblock it has allocated (and never shown to
      anybody) isn't needed and should be freed.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      c2d22ecd
    • Jason Gunthorpe's avatar
      bus: mvebu-mbus: Fix optional pcie-mem/io-aperture properties · 8553bcad
      Jason Gunthorpe authored
      If the property was not specified then the returned resource had a
      resource_size(..) == 1, rather than 0. The PCI-E driver checks for 0 so it
      blindly continues on with a corrupted resource.
      
      The regression was introduced into v3.12 by:
      
        11be6547 PCI: mvebu: Adapt to the new device tree layout
      Signed-off-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: default avatarJason Cooper <jason@lakedaemon.net>
      8553bcad
    • Eric Dumazet's avatar
      pkt_sched: fq: rate limiting improvements · 0eab5eb7
      Eric Dumazet authored
      FQ rate limiting suffers from two problems, reported
      by Steinar :
      
      1) FQ enforces a delay when flow quantum is exhausted in order
      to reduce cpu overhead. But if packets are small, current
      delay computation is slightly wrong, and observed rates can
      be too high.
      
      Steinar had this problem because he disabled TSO and GSO,
      and default FQ quantum is 2*1514.
      
      (Of course, I wish recent TSO auto sizing changes will help
      to not having to disable TSO in the first place)
      
      2) maxrate was not used for forwarded flows (skbs not attached
      to a socket)
      
      Tested:
      
      tc qdisc add dev eth0 root est 1sec 4sec fq maxrate 8Mbit
      netperf -H lpq84 -l 1000 &
      sleep 10 ; tc -s qdisc show dev eth0
      qdisc fq 8003: root refcnt 32 limit 10000p flow_limit 100p buckets 1024
       quantum 3028 initial_quantum 15140 maxrate 8000Kbit
       Sent 16819357 bytes 11258 pkt (dropped 0, overlimits 0 requeues 0)
       rate 7831Kbit 653pps backlog 7570b 5p requeues 0
        44 flows (43 inactive, 1 throttled), next packet delay 2977352 ns
        0 gc, 0 highprio, 5545 throttled
      
      lpq83:~# tcpdump -p -i eth0 host lpq84 -c 12
      09:02:52.079484 IP lpq83 > lpq84: . 1389536928:1389538376(1448) ack 3808678021 win 457 <nop,nop,timestamp 961812 572609068>
      09:02:52.079499 IP lpq83 > lpq84: . 1448:2896(1448) ack 1 win 457 <nop,nop,timestamp 961812 572609068>
      09:02:52.079906 IP lpq84 > lpq83: . ack 2896 win 16384 <nop,nop,timestamp 572609080 961812>
      09:02:52.082568 IP lpq83 > lpq84: . 2896:4344(1448) ack 1 win 457 <nop,nop,timestamp 961815 572609071>
      09:02:52.082581 IP lpq83 > lpq84: . 4344:5792(1448) ack 1 win 457 <nop,nop,timestamp 961815 572609071>
      09:02:52.083017 IP lpq84 > lpq83: . ack 5792 win 16384 <nop,nop,timestamp 572609083 961815>
      09:02:52.085678 IP lpq83 > lpq84: . 5792:7240(1448) ack 1 win 457 <nop,nop,timestamp 961818 572609074>
      09:02:52.085693 IP lpq83 > lpq84: . 7240:8688(1448) ack 1 win 457 <nop,nop,timestamp 961818 572609074>
      09:02:52.086117 IP lpq84 > lpq83: . ack 8688 win 16384 <nop,nop,timestamp 572609086 961818>
      09:02:52.088792 IP lpq83 > lpq84: . 8688:10136(1448) ack 1 win 457 <nop,nop,timestamp 961821 572609077>
      09:02:52.088806 IP lpq83 > lpq84: . 10136:11584(1448) ack 1 win 457 <nop,nop,timestamp 961821 572609077>
      09:02:52.089217 IP lpq84 > lpq83: . ack 11584 win 16384 <nop,nop,timestamp 572609090 961821>
      Reported-by: default avatarSteinar H. Gunderson <sesse@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0eab5eb7
    • Nicolas Dichtel's avatar
      ip6tnl: allow to use rtnl ops on fb tunnel · bb814094
      Nicolas Dichtel authored
      rtnl ops where introduced by c075b130 ("ip6tnl: advertise tunnel param via
      rtnl"), but I forget to assign rtnl ops to fb tunnels.
      
      Now that it is done, we must remove the explicit call to
      unregister_netdevice_queue(), because  the fallback tunnel is added to the queue
      in ip6_tnl_destroy_tunnels() when checking rtnl_link_ops of all netdevices (this
      is valid since commit 0bd87628 ("ip6tnl: add x-netns support")).
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bb814094
    • Nicolas Dichtel's avatar
      sit: allow to use rtnl ops on fb tunnel · 205983c4
      Nicolas Dichtel authored
      rtnl ops where introduced by ba3e3f50 ("sit: advertise tunnel param via
      rtnl"), but I forget to assign rtnl ops to fb tunnels.
      
      Now that it is done, we must remove the explicit call to
      unregister_netdevice_queue(), because  the fallback tunnel is added to the queue
      in sit_destroy_tunnels() when checking rtnl_link_ops of all netdevices (this
      is valid since commit 5e6700b3 ("sit: add support of x-netns")).
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      205983c4
    • David S. Miller's avatar
      Merge branch 'ip_tunnel' · 9cb17124
      David S. Miller authored
      ip_tunnel bug fixes from Steffen Klassert.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9cb17124
    • Steffen Klassert's avatar
      ip_tunnel: Remove double unregister of the fallback device · cfe4a536
      Steffen Klassert authored
      When queueing the netdevices for removal, we queue the
      fallback device twice in ip_tunnel_destroy(). The first
      time when we queue all netdevices in the namespace and
      then again explicitly. Fix this by removing the explicit
      queueing of the fallback device.
      
      Bug was introduced when network namespace support was added
      with commit 6c742e71 ("ipip: add x-netns support").
      
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Acked-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cfe4a536
    • Steffen Klassert's avatar
      ip_tunnel_core: Change __skb_push back to skb_push · 78a3694d
      Steffen Klassert authored
      Git commit 0e6fbc5b ("ip_tunnels: extend iptunnel_xmit()")
      moved the IP header installation to iptunnel_xmit() and
      changed skb_push() to __skb_push(). This makes possible
      bugs hard to track down, so change it back to skb_push().
      
      Cc: Pravin Shelar <pshelar@nicira.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      78a3694d