An error occurred fetching the project authors.
  1. 24 Aug, 2017 1 commit
    • Juergen Gross's avatar
      x86/paravirt/xen: Remove xen_patch() · edcb5cf8
      Juergen Gross authored
      Xen's paravirt patch function xen_patch() does some special casing for
      irq_ops functions to apply relocations when those functions can be
      patched inline instead of calls.
      
      Unfortunately none of the special case function replacements is small
      enough to be patched inline, so the special case never applies.
      
      As xen_patch() will call paravirt_patch_default() in all cases it can
      be just dropped. xen-asm.h doesn't seem necessary without xen_patch()
      as the only thing left in it would be the definition of XEN_EFLAGS_NMI
      used only once. So move that definition and remove xen-asm.h.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: lguest@lists.ozlabs.org
      Cc: rusty@rustcorp.com.au
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/20170816173157.8633-2-jgross@suse.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      edcb5cf8
  2. 13 Jun, 2017 2 commits
    • Ankur Arora's avatar
      xen/vcpu: Handle xen_vcpu_setup() failure in hotplug · c9b5d98b
      Ankur Arora authored
      The hypercall VCPUOP_register_vcpu_info can fail. This failure is
      handled by making per_cpu(xen_vcpu, cpu) point to its shared_info
      slot and those without one (cpu >= MAX_VIRT_CPUS) be NULL.
      
      For PVH/PVHVM, this is not enough, because we also need to pull
      these VCPUs out of circulation.
      
      Fix for PVH/PVHVM: on registration failure in the cpuhp prepare
      callback (xen_cpu_up_prepare_hvm()), return an error to the cpuhp
      state-machine so it can fail the CPU init.
      
      Fix for PV: the registration happens before smp_init(), so, in the
      failure case we clamp setup_max_cpus and limit the number of VCPUs
      that smp_init() will bring-up to MAX_VIRT_CPUS.
      This is functionally correct but it makes the code a bit simpler
      if we get rid of this explicit clamping: for VCPUs that don't have
      valid xen_vcpu, fail the CPU init in the cpuhp prepare callback
      (xen_cpu_up_prepare_pv()).
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarAnkur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      c9b5d98b
    • Ankur Arora's avatar
      xen/vcpu: Simplify xen_vcpu related code · ad73fd59
      Ankur Arora authored
      Largely mechanical changes to aid unification of xen_vcpu_restore()
      logic for PV, PVH and PVHVM.
      
      xen_vcpu_setup(): the only change in logic is that clamp_max_cpus()
      is now handled inside the "if (!xen_have_vcpu_info_placement)" block.
      
      xen_vcpu_restore(): code movement from enlighten_pv.c to enlighten.c.
      
      xen_vcpu_info_reset(): pulls together all the code where xen_vcpu
      is set to default.
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarAnkur Arora <ankur.a.arora@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      ad73fd59
  3. 02 May, 2017 4 commits
  4. 07 Feb, 2017 1 commit
  5. 25 Dec, 2016 1 commit
  6. 25 Jul, 2016 1 commit
  7. 23 Nov, 2015 1 commit
  8. 20 Aug, 2015 5 commits
  9. 10 Aug, 2015 1 commit
    • Jason A. Donenfeld's avatar
      x86/xen: build "Xen PV" APIC driver for domU as well · fc5fee86
      Jason A. Donenfeld authored
      It turns out that a PV domU also requires the "Xen PV" APIC
      driver. Otherwise, the flat driver is used and we get stuck in busy
      loops that never exit, such as in this stack trace:
      
      (gdb) target remote localhost:9999
      Remote debugging using localhost:9999
      __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
      56              while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY)
      (gdb) bt
       #0  __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
       #1  __default_send_IPI_shortcut (shortcut=<optimized out>,
      dest=<optimized out>, vector=<optimized out>) at
      ./arch/x86/include/asm/ipi.h:75
       #2  apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54
       #3  0xffffffff81011336 in arch_irq_work_raise () at
      arch/x86/kernel/irq_work.c:47
       #4  0xffffffff8114990c in irq_work_queue (work=0xffff88000fc0e400) at
      kernel/irq_work.c:100
       #5  0xffffffff8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633
       #6  0xffffffff8110ca60 in vprintk_emit (facility=0, level=<optimized
      out>, dict=0x0 <irq_stack_union>, dictlen=<optimized out>,
      fmt=<optimized out>, args=<optimized out>)
          at kernel/printk/printk.c:1778
       #7  0xffffffff816010c8 in printk (fmt=<optimized out>) at
      kernel/printk/printk.c:1868
       #8  0xffffffffc00013ea in ?? ()
       #9  0x0000000000000000 in ?? ()
      
      Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      fc5fee86
  10. 22 Apr, 2015 1 commit
  11. 28 Jan, 2015 1 commit
  12. 04 Dec, 2014 3 commits
    • Juergen Gross's avatar
      xen: switch to linear virtual mapped sparse p2m list · 054954eb
      Juergen Gross authored
      At start of the day the Xen hypervisor presents a contiguous mfn list
      to a pv-domain. In order to support sparse memory this mfn list is
      accessed via a three level p2m tree built early in the boot process.
      Whenever the system needs the mfn associated with a pfn this tree is
      used to find the mfn.
      
      Instead of using a software walked tree for accessing a specific mfn
      list entry this patch is creating a virtual address area for the
      entire possible mfn list including memory holes. The holes are
      covered by mapping a pre-defined  page consisting only of "invalid
      mfn" entries. Access to a mfn entry is possible by just using the
      virtual base address of the mfn list and the pfn as index into that
      list. This speeds up the (hot) path of determining the mfn of a
      pfn.
      
      Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0
      showed following improvements:
      
      Elapsed time: 32:50 ->  32:35
      System:       18:07 ->  17:47
      User:        104:00 -> 103:30
      
      Tested with following configurations:
      - 64 bit dom0, 8GB RAM
      - 64 bit dom0, 128 GB RAM, PCI-area above 4 GB
      - 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without
                                          the patch)
      - 32 bit domU, ballooning up and down
      - 32 bit domU, save and restore
      - 32 bit domU with PCI passthrough
      - 64 bit domU, 8 GB, 2049 MB, 5000 MB
      - 64 bit domU, ballooning up and down
      - 64 bit domU, save and restore
      - 64 bit domU with PCI passthrough
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      054954eb
    • Juergen Gross's avatar
      xen: Delay invalidating extra memory · 5b8e7d80
      Juergen Gross authored
      When the physical memory configuration is initialized the p2m entries
      for not pouplated memory pages are set to "invalid". As those pages
      are beyond the hypervisor built p2m list the p2m tree has to be
      extended.
      
      This patch delays processing the extra memory related p2m entries
      during the boot process until some more basic memory management
      functions are callable. This removes the need to create new p2m
      entries until virtual memory management is available.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      5b8e7d80
    • Juergen Gross's avatar
      xen: Delay remapping memory of pv-domain · 1f3ac86b
      Juergen Gross authored
      Early in the boot process the memory layout of a pv-domain is changed
      to match the E820 map (either the host one for Dom0 or the Xen one)
      regarding placement of RAM and PCI holes. This requires removing memory
      pages initially located at positions not suitable for RAM and adding
      them later at higher addresses where no restrictions apply.
      
      To be able to operate on the hypervisor supported p2m list until a
      virtual mapped linear p2m list can be constructed, remapping must
      be delayed until virtual memory management is initialized, as the
      initial p2m list can't be extended unlimited at physical memory
      initialization time due to it's fixed structure.
      
      A further advantage is the reduction in complexity and code volume as
      we don't have to be careful regarding memory restrictions during p2m
      updates.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      1f3ac86b
  13. 16 Nov, 2014 1 commit
  14. 18 Jul, 2014 1 commit
    • Daniel Kiper's avatar
      arch/x86/xen: Silence compiler warnings · c7341d6a
      Daniel Kiper authored
      Compiler complains in the following way when x86 32-bit kernel
      with Xen support is build:
      
        CC      arch/x86/xen/enlighten.o
      arch/x86/xen/enlighten.c: In function ‘xen_start_kernel’:
      arch/x86/xen/enlighten.c:1726:3: warning: right shift count >= width of type [enabled by default]
      
      Such line contains following EFI initialization code:
      
      boot_params.efi_info.efi_systab_hi = (__u32)(__pa(efi_systab_xen) >> 32);
      
      There is no issue if x86 64-bit kernel is build. However, 32-bit case
      generate warning (even if that code will not be executed because Xen
      does not work on 32-bit EFI platforms) due to __pa() returning unsigned long
      type which has 32-bits width. So move whole EFI initialization stuff
      to separate function and build it conditionally to avoid above mentioned
      warning on x86 32-bit architecture.
      Signed-off-by: default avatarDaniel Kiper <daniel.kiper@oracle.com>
      Reviewed-by: default avatarKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      c7341d6a
  15. 05 Jun, 2014 1 commit
  16. 12 May, 2014 1 commit
  17. 21 Jan, 2014 1 commit
    • Roger Pau Monne's avatar
      xen/pvh: Set X86_CR0_WP and others in CR0 (v2) · c9f6e997
      Roger Pau Monne authored
      otherwise we will get for some user-space applications
      that use 'clone' with CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID
      end up hitting an assert in glibc manifested by:
      
      general protection ip:7f80720d364c sp:7fff98fd8a80 error:0 in
      libc-2.13.so[7f807209e000+180000]
      
      This is due to the nature of said operations which sets and clears
      the PID.  "In the successful one I can see that the page table of
      the parent process has been updated successfully to use a
      different physical page, so the write of the tid on
      that page only affects the child...
      
      On the other hand, in the failed case, the write seems to happen before
      the copy of the original page is done, so both the parent and the child
      end up with the same value (because the parent copies the page after
      the write of the child tid has already happened)."
      (Roger's analysis). The nature of this is due to the Xen's commit
      of 51e2cac257ec8b4080d89f0855c498cbbd76a5e5
      "x86/pvh: set only minimal cr0 and cr4 flags in order to use paging"
      the CR0_WP was removed so COW features of the Linux kernel were not
      operating properly.
      
      While doing that also update the rest of the CR0 flags to be inline
      with what a baremetal Linux kernel would set them to.
      
      In 'secondary_startup_64' (baremetal Linux) sets:
      
      X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP |
      X86_CR0_AM | X86_CR0_PG
      
      The hypervisor for HVM type guests (which PVH is a bit) sets:
      X86_CR0_PE | X86_CR0_ET | X86_CR0_TS
      For PVH it specifically sets:
      X86_CR0_PG
      
      Which means we need to set the rest: X86_CR0_MP | X86_CR0_NE  |
      X86_CR0_WP | X86_CR0_AM to have full parity.
      Signed-off-by: default avatarRoger Pau Monne <roger.pau@citrix.com>
      Signed-off-by: default avatarMukesh Rathor <mukesh.rathor@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      [v1: Took out the cr4 writes to be a seperate patch]
      [v2: 0-DAY kernel found xen_setup_gdt to be missing a static]
      c9f6e997
  18. 06 Jan, 2014 1 commit
    • Mukesh Rathor's avatar
      xen/pvh: Secondary VCPU bringup (non-bootup CPUs) · 5840c84b
      Mukesh Rathor authored
      The VCPU bringup protocol follows the PV with certain twists.
      From xen/include/public/arch-x86/xen.h:
      
      Also note that when calling DOMCTL_setvcpucontext and VCPU_initialise
      for HVM and PVH guests, not all information in this structure is updated:
      
       - For HVM guests, the structures read include: fpu_ctxt (if
       VGCT_I387_VALID is set), flags, user_regs, debugreg[*]
      
       - PVH guests are the same as HVM guests, but additionally use ctrlreg[3] to
       set cr3. All other fields not used should be set to 0.
      
      This is what we do. We piggyback on the 'xen_setup_gdt' - but modify
      a bit - we need to call 'load_percpu_segment' so that 'switch_to_new_gdt'
      can load per-cpu data-structures. It has no effect on the VCPU0.
      
      We also piggyback on the %rdi register to pass in the CPU number - so
      that when we bootup a new CPU, the cpu_bringup_and_idle will have
      passed as the first parameter the CPU number (via %rdi for 64-bit).
      Signed-off-by: default avatarMukesh Rathor <mukesh.rathor@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      5840c84b
  19. 22 Aug, 2013 1 commit
  20. 06 Aug, 2013 1 commit
  21. 14 Jul, 2013 1 commit
    • Paul Gortmaker's avatar
      x86: delete __cpuinit usage from all x86 files · 148f9bb8
      Paul Gortmaker authored
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
      are flagged as __cpuinit  -- so if we remove the __cpuinit from
      arch specific callers, we will also get section mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      content into no-ops as early as possible, since that will get rid
      of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the arch/x86 uses of the __cpuinit macros from
      all C files.  x86 only had the one __CPUINIT used in assembly files,
      and it wasn't paired off with a .previous or a __FINIT, so we can
      delete it directly w/o any corresponding additional change there.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      148f9bb8
  22. 15 Feb, 2013 1 commit
  23. 02 Nov, 2012 1 commit
    • Olaf Hering's avatar
      xen PVonHVM: use E820_Reserved area for shared_info · 9d02b43d
      Olaf Hering authored
      This is a respin of 00e37bdb
      ("xen PVonHVM: move shared_info to MMIO before kexec").
      
      Currently kexec in a PVonHVM guest fails with a triple fault because the
      new kernel overwrites the shared info page. The exact failure depends on
      the size of the kernel image. This patch moves the pfn from RAM into an
      E820 reserved memory area.
      
      The pfn containing the shared_info is located somewhere in RAM. This will
      cause trouble if the current kernel is doing a kexec boot into a new
      kernel. The new kernel (and its startup code) can not know where the pfn
      is, so it can not reserve the page. The hypervisor will continue to update
      the pfn, and as a result memory corruption occours in the new kernel.
      
      The toolstack marks the memory area FC000000-FFFFFFFF as reserved in the
      E820 map. Within that range newer toolstacks (4.3+) will keep 1MB
      starting from FE700000 as reserved for guest use. Older Xen4 toolstacks
      will usually not allocate areas up to FE700000, so FE700000 is expected
      to work also with older toolstacks.
      
      In Xen3 there is no reserved area at a fixed location. If the guest is
      started on such old hosts the shared_info page will be placed in RAM. As
      a result kexec can not be used.
      Signed-off-by: default avatarOlaf Hering <olaf@aepfle.de>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      9d02b43d
  24. 23 Aug, 2012 2 commits
    • Konrad Rzeszutek Wilk's avatar
      xen/p2m: Add logic to revector a P2M tree to use __va leafs. · 357a3cfb
      Konrad Rzeszutek Wilk authored
      During bootup Xen supplies us with a P2M array. It sticks
      it right after the ramdisk, as can be seen with a 128GB PV guest:
      
      (certain parts removed for clarity):
      xc_dom_build_image: called
      xc_dom_alloc_segment:   kernel       : 0xffffffff81000000 -> 0xffffffff81e43000  (pfn 0x1000 + 0xe43 pages)
      xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0xe43 at 0x7f097d8bf000
      xc_dom_alloc_segment:   ramdisk      : 0xffffffff81e43000 -> 0xffffffff925c7000  (pfn 0x1e43 + 0x10784 pages)
      xc_dom_pfn_to_ptr: domU mapping: pfn 0x1e43+0x10784 at 0x7f0952dd2000
      xc_dom_alloc_segment:   phys2mach    : 0xffffffff925c7000 -> 0xffffffffa25c7000  (pfn 0x125c7 + 0x10000 pages)
      xc_dom_pfn_to_ptr: domU mapping: pfn 0x125c7+0x10000 at 0x7f0942dd2000
      xc_dom_alloc_page   :   start info   : 0xffffffffa25c7000 (pfn 0x225c7)
      xc_dom_alloc_page   :   xenstore     : 0xffffffffa25c8000 (pfn 0x225c8)
      xc_dom_alloc_page   :   console      : 0xffffffffa25c9000 (pfn 0x225c9)
      nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
      nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
      nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
      nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffffa27fffff, 276 table(s)
      xc_dom_alloc_segment:   page tables  : 0xffffffffa25ca000 -> 0xffffffffa26e1000  (pfn 0x225ca + 0x117 pages)
      xc_dom_pfn_to_ptr: domU mapping: pfn 0x225ca+0x117 at 0x7f097d7a8000
      xc_dom_alloc_page   :   boot stack   : 0xffffffffa26e1000 (pfn 0x226e1)
      xc_dom_build_image  : virt_alloc_end : 0xffffffffa26e2000
      xc_dom_build_image  : virt_pgtab_end : 0xffffffffa2800000
      
      So the physical memory and virtual (using __START_KERNEL_map addresses)
      layout looks as so:
      
        phys                             __ka
      /------------\                   /-------------------\
      | 0          | empty             | 0xffffffff80000000|
      | ..         |                   | ..                |
      | 16MB       | <= kernel starts  | 0xffffffff81000000|
      | ..         |                   |                   |
      | 30MB       | <= kernel ends => | 0xffffffff81e43000|
      | ..         |  & ramdisk starts | ..                |
      | 293MB      | <= ramdisk ends=> | 0xffffffff925c7000|
      | ..         |  & P2M starts     | ..                |
      | ..         |                   | ..                |
      | 549MB      | <= P2M ends    => | 0xffffffffa25c7000|
      | ..         | start_info        | 0xffffffffa25c7000|
      | ..         | xenstore          | 0xffffffffa25c8000|
      | ..         | cosole            | 0xffffffffa25c9000|
      | 549MB      | <= page tables => | 0xffffffffa25ca000|
      | ..         |                   |                   |
      | 550MB      | <= PGT end     => | 0xffffffffa26e1000|
      | ..         | boot stack        |                   |
      \------------/                   \-------------------/
      
      As can be seen, the ramdisk, P2M and pagetables are taking
      a bit of __ka addresses space. Which is a problem since the
      MODULES_VADDR starts at 0xffffffffa0000000 - and P2M sits
      right in there! This results during bootup with the inability to
      load modules, with this error:
      
      ------------[ cut here ]------------
      WARNING: at /home/konrad/ssd/linux/mm/vmalloc.c:106 vmap_page_range_noflush+0x2d9/0x370()
      Call Trace:
       [<ffffffff810719fa>] warn_slowpath_common+0x7a/0xb0
       [<ffffffff81030279>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
       [<ffffffff81071a45>] warn_slowpath_null+0x15/0x20
       [<ffffffff81130b89>] vmap_page_range_noflush+0x2d9/0x370
       [<ffffffff81130c4d>] map_vm_area+0x2d/0x50
       [<ffffffff811326d0>] __vmalloc_node_range+0x160/0x250
       [<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80
       [<ffffffff810c6186>] ? load_module+0x66/0x19c0
       [<ffffffff8105cadc>] module_alloc+0x5c/0x60
       [<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80
       [<ffffffff810c5369>] module_alloc_update_bounds+0x19/0x80
       [<ffffffff810c70c3>] load_module+0xfa3/0x19c0
       [<ffffffff812491f6>] ? security_file_permission+0x86/0x90
       [<ffffffff810c7b3a>] sys_init_module+0x5a/0x220
       [<ffffffff815ce339>] system_call_fastpath+0x16/0x1b
      ---[ end trace fd8f7704fdea0291 ]---
      vmalloc: allocation failure, allocated 16384 of 20480 bytes
      modprobe: page allocation failure: order:0, mode:0xd2
      
      Since the __va and __ka are 1:1 up to MODULES_VADDR and
      cleanup_highmap rids __ka of the ramdisk mapping, what
      we want to do is similar - get rid of the P2M in the __ka
      address space. There are two ways of fixing this:
      
       1) All P2M lookups instead of using the __ka address would
          use the __va address. This means we can safely erase from
          __ka space the PMD pointers that point to the PFNs for
          P2M array and be OK.
       2). Allocate a new array, copy the existing P2M into it,
          revector the P2M tree to use that, and return the old
          P2M to the memory allocate. This has the advantage that
          it sets the stage for using XEN_ELF_NOTE_INIT_P2M
          feature. That feature allows us to set the exact virtual
          address space we want for the P2M - and allows us to
          boot as initial domain on large machines.
      
      So we pick option 2).
      
      This patch only lays the groundwork in the P2M code. The patch
      that modifies the MMU is called "xen/mmu: Copy and revector the P2M tree."
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      357a3cfb
    • Konrad Rzeszutek Wilk's avatar
      xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything. · 3699aad0
      Konrad Rzeszutek Wilk authored
      We don't need to return the new PGD - as we do not use it.
      Acked-by: default avatarStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      3699aad0
  25. 16 Aug, 2012 1 commit
  26. 14 Sep, 2012 1 commit
  27. 19 Jul, 2012 1 commit
    • Olaf Hering's avatar
      xen PVonHVM: move shared_info to MMIO before kexec · 00e37bdb
      Olaf Hering authored
      Currently kexec in a PVonHVM guest fails with a triple fault because the
      new kernel overwrites the shared info page. The exact failure depends on
      the size of the kernel image. This patch moves the pfn from RAM into
      MMIO space before the kexec boot.
      
      The pfn containing the shared_info is located somewhere in RAM. This
      will cause trouble if the current kernel is doing a kexec boot into a
      new kernel. The new kernel (and its startup code) can not know where the
      pfn is, so it can not reserve the page. The hypervisor will continue to
      update the pfn, and as a result memory corruption occours in the new
      kernel.
      
      One way to work around this issue is to allocate a page in the
      xen-platform pci device's BAR memory range. But pci init is done very
      late and the shared_info page is already in use very early to read the
      pvclock. So moving the pfn from RAM to MMIO is racy because some code
      paths on other vcpus could access the pfn during the small   window when
      the old pfn is moved to the new pfn. There is even a  small window were
      the old pfn is not backed by a mfn, and during that time all reads
      return -1.
      
      Because it is not known upfront where the MMIO region is located it can
      not be used right from the start in xen_hvm_init_shared_info.
      
      To minimise trouble the move of the pfn is done shortly before kexec.
      This does not eliminate the race because all vcpus are still online when
      the syscore_ops will be called. But hopefully there is no work pending
      at this point in time. Also the syscore_op is run last which reduces the
      risk further.
      Signed-off-by: default avatarOlaf Hering <olaf@aepfle.de>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      00e37bdb
  28. 07 May, 2012 1 commit
    • David Vrabel's avatar
      xen/setup: update VA mapping when releasing memory during setup · 83d51ab4
      David Vrabel authored
      In xen_memory_setup(), if a page that is being released has a VA
      mapping this must also be updated.  Otherwise, the page will be not
      released completely -- it will still be referenced in Xen and won't be
      freed util the mapping is removed and this prevents it from being
      reallocated at a different PFN.
      
      This was already being done for the ISA memory region in
      xen_ident_map_ISA() but on many systems this was omitting a few pages
      as many systems marked a few pages below the ISA memory region as
      reserved in the e820 map.
      
      This fixes errors such as:
      
      (XEN) page_alloc.c:1148:d0 Over-allocation for domain 0: 2097153 > 2097152
      (XEN) memory.c:133:d0 Could not allocate order=0 extent: id=0 memflags=0 (0 of 17)
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      83d51ab4
  29. 01 May, 2012 1 commit