  1. 08 Nov, 2007 1 commit
  2. 29 Oct, 2007 1 commit
    • [POWERPC] powerpc: Fix demotion of segments to 4K pages · f6ab0b92
      Benjamin Herrenschmidt authored
      When demoting a process to use 4K HW pages (instead of 64K), which
      happens under various circumstances such as doing cache inhibited
      mappings on machines that do not support 64K CI pages, the assembly
      hash code calls back into the C function flush_hash_page().  This
      function prototype was recently changed to accommodate 1T segments
      but the assembly call site was not updated, causing applications that
      do demotion to hang.  In addition, when updating the per-CPU PACA for
      the new sizes, we didn't properly update the slice "map", thus causing
      the SLB miss code to re-insert segments for the wrong size.
      
      This fixes both and adds a warning comment next to the C
      implementation to try to avoid problems next time someone changes it.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
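
      For reference, the C declaration that the low-level hash assembly has to
      stay in sync with plausibly looks like the sketch below after the
      1T-segment change; the parameter list is my reading of that era's API,
      not something quoted from this listing.

          /*
           * WARNING: called from low-level assembly in the hash path.  If the
           * argument list changes (as it did for 1T segments), the assembly
           * call sites must be updated to pass the new arguments as well.
           */
          void flush_hash_page(unsigned long va, real_pte_t pte,
                               int psize, int ssize, int local);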
  3. 17 Oct, 2007 2 commits
  4. 12 Oct, 2007 1 commit
    • [POWERPC] Use 1TB segments · 1189be65
      Paul Mackerras authored
      This makes the kernel use 1TB segments for all kernel mappings and for
      user addresses of 1TB and above, on machines which support them
      (currently POWER5+, POWER6 and PA6T).
      
      We detect that the machine supports 1TB segments by looking at the
      ibm,processor-segment-sizes property in the device tree.
      
      We don't currently use 1TB segments for user addresses < 1T, since
      that would effectively prevent 32-bit processes from using huge pages
      unless we also had a way to revert to using 256MB segments.  That
      would be possible but would involve extra complications (such as
      keeping track of which segment size was used when HPTEs were inserted)
      and is not addressed here.
      
      Parts of this patch were originally written by Ben Herrenschmidt.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
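
      A hedged sketch of how that detection can be done from the flattened
      device tree at boot is below.  The helper name and the "found" flag are
      illustrative; the property encoding (each 32-bit cell is the log2 of a
      supported segment size, so a value of 40 means 2^40 = 1TB) is my
      understanding of the PAPR convention, not something stated in the listing.

          #include <linux/init.h>
          #include <linux/string.h>
          #include <linux/types.h>
          #include <asm/prom.h>

          /* Run via of_scan_flat_dt(scan_seg_sizes, &found) early in boot. */
          static int __init scan_seg_sizes(unsigned long node, const char *uname,
                                           int depth, void *data)
          {
                  char *type = of_get_flat_dt_prop(node, "device_type", NULL);
                  unsigned long size = 0;
                  u32 *prop;

                  /* Only the "cpu" nodes carry the segment-size property. */
                  if (type == NULL || strcmp(type, "cpu") != 0)
                          return 0;

                  prop = of_get_flat_dt_prop(node, "ibm,processor-segment-sizes",
                                             &size);
                  if (prop == NULL)
                          return 0;

                  for (; size >= 4; size -= 4, prop++) {
                          if (*prop == 40) {      /* log2(1TB) */
                                  *(int *)data = 1;
                                  return 1;       /* stop scanning */
                          }
                  }
                  return 0;
          }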
  5. 17 Aug, 2007 3 commits
  6. 03 Aug, 2007 1 commit
    • [POWERPC] Fixes for the SLB shadow buffer code · 67439b76
      Michael Neuling authored
      On a machine with hardware 64kB pages and a kernel configured for a
      64kB base page size, we need to change the vmalloc segment from 64kB
      pages to 4kB pages if some driver creates a non-cacheable mapping in
      the vmalloc area.  However, we never updated the SLB shadow buffer.
      This fixes it.  Thanks to paulus for finding this.
      
      Also added some write barriers to ensure the shadow buffer contents
      are always consistent.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
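
      The ordering the new barriers enforce is roughly the one sketched below:
      invalidate the shadow entry before touching it, and only re-publish the
      ESID once the new VSID is in place, so the hypervisor never sees a
      half-updated entry.  The structure layout and the mk_esid_data() /
      mk_vsid_data() helpers are assumptions about that era's code, not quotes
      from it.

          #include <asm/system.h>         /* smp_wmb() in that era's tree */

          /* Minimal stand-in for the shadow buffer layout (assumed). */
          struct shadow_entry { unsigned long esid, vsid; };
          struct shadow_buf   { struct shadow_entry save_area[3]; };

          extern unsigned long mk_esid_data(unsigned long ea, int entry);
          extern unsigned long mk_vsid_data(unsigned long ea, unsigned long flags);

          /* Hedged sketch of a barrier-ordered SLB shadow buffer update. */
          static void shadow_slb_update(struct shadow_buf *shadow, int entry,
                                        unsigned long ea, unsigned long flags)
          {
                  /* 1. Invalidate the entry by clearing its ESID. */
                  shadow->save_area[entry].esid = 0;
                  smp_wmb();

                  /* 2. Write the new VSID while the entry is invalid. */
                  shadow->save_area[entry].vsid = mk_vsid_data(ea, flags);
                  smp_wmb();

                  /* 3. Publish the new ESID, making the entry valid again. */
                  shadow->save_area[entry].esid = mk_esid_data(ea, entry);
                  smp_wmb();
          }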
  7. 22 Jul, 2007 1 commit
  8. 14 Jun, 2007 1 commit
  9. 17 May, 2007 1 commit
  10. 09 May, 2007 3 commits
    • [POWERPC] Add ability to 4K kernel to hash in 64K pages · 16c2d476
      Benjamin Herrenschmidt authored
      This adds the ability for a kernel compiled with 4K page size
      to have special slices containing 64K pages and hash the right type
      of hash PTEs.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] Introduce address space "slices" · d0f13e3c
      Benjamin Herrenschmidt authored
      The basic issue is to be able to do what hugetlbfs does but with
      different page sizes for some other special filesystems; more
      specifically, my need is:
      
       - Huge pages
      
       - SPE local store mappings using 64K pages on a 4K base page size
      kernel on Cell
      
       - Some special 4K segments in 64K-page kernels for mapping a dodgy
      type of powerpc-specific infiniband hardware that requires 4K MMU
      mappings for various reasons I won't explain here.
      
      The main issues are:
      
       - To maintain/keep track of the page size per "segment" (as we can
      only have one page size per segment on powerpc, which are 256MB
      divisions of the address space).
      
       - To make sure special mappings stay within their allotted
      "segments" (including MAP_FIXED crap)
      
       - To make sure everybody else doesn't mmap/brk/grow_stack into a
      "segment" that is used for a special mapping
      
      Some of the necessary mechanisms to handle that were present in the
      hugetlbfs code, but mostly in ways not suitable for anything else.
      
      The patch relies on some changes to the generic get_unmapped_area()
      that just got merged.  It still hijacks hugetlb callbacks here or
      there as the generic code hasn't been entirely cleaned up yet but
      that shouldn't be a problem.
      
      So what is a slice?  Well, I re-used the mechanism used formerly by our
      hugetlbfs implementation which divides the address space in
      "meta-segments" which I called "slices".  The division is done using
      256MB slices below 4G, and 1T slices above.  Thus the address space is
      divided currently into 16 "low" slices and 16 "high" slices.  (Special
      case: high slice 0 is the area between 4G and 1T).
      
      Doing so significantly simplifies the tracking of segments and avoids
      having to keep track of all the 256MB segments in the address space.
      
      While I used the "concepts" of hugetlbfs, I mostly re-implemented
      everything in a more generic way and "ported" hugetlbfs to it.
      
      Slices can have an associated page size, which is encoded in the mmu
      context and used by the SLB miss handler to set the segment sizes.  The
      hash code currently doesn't care, it has a specific check for hugepages,
      though I might add a mechanism to provide per-slice hash mapping
      functions in the future.
      
      The slice code provides a pair of "generic" get_unmapped_area() (bottomup
      and topdown) functions that should work with any slice size.  There is
      some trickiness here, so I would appreciate it if people could have a look
      at the implementation of these and let me know if I got something wrong.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
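
      As a rough, user-space model of the geometry described above (the
      constants and function are mine, not the kernel's slice macros), the
      slice an effective address falls into can be computed like this:

          #include <stdint.h>
          #include <stdio.h>

          #define LOW_SLICE_SHIFT   28            /* 2^28 = 256MB            */
          #define HIGH_SLICE_SHIFT  40            /* 2^40 = 1TB              */
          #define LOW_TOP           (1ULL << 32)  /* 4GB: end of low slices  */

          /* 16 x 256MB "low" slices below 4GB, 1TB "high" slices above;
           * high slice 0 is the special 4GB..1TB range. */
          static unsigned int slice_of(uint64_t ea, int *is_high)
          {
                  if (ea < LOW_TOP) {
                          *is_high = 0;
                          return ea >> LOW_SLICE_SHIFT;   /* 0..15 */
                  }
                  *is_high = 1;
                  return ea >> HIGH_SLICE_SHIFT;          /* 0 for 4GB..1TB */
          }

          int main(void)
          {
                  int high;
                  unsigned int idx = slice_of(0x123450000ULL, &high);

                  printf("%s slice %u\n", high ? "high" : "low", idx);
                  return 0;
          }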
    • [POWERPC] Small fixes & cleanups in segment page size demotion · 16f1c746
      Benjamin Herrenschmidt authored
      The code for demoting segments to 4K had some issues, like for example,
      when using _PAGE_4K_PFN flag, the first CPU to hit it would do the
      demotion, but other CPUs hitting the same page wouldn't properly flush
      their SLBs if mmu_ci_restriction isn't set.  There are also potential
      issues with hash_preload not handling _PAGE_4K_PFN.  All of these are
      non issues on current hardware but might bite us in the future.
      
      This patch thus fixes it by:
      
       - Taking the test comparing the mm and current CPU context page
      sizes to decide to flush SLBs out of the mmu_ci_restrictions test
      since that can also be triggered by _PAGE_4K_PFN pages
      
       - Due to the above being done all the time, demote_segment_4k
      doesn't need to update the context and flush the SLB
      
       - demote_segment_4k can be static and doesn't need an EXPORT_SYMBOL
      
       - Making hash_preload ignore anything that has either _PAGE_4K_PFN
      or _PAGE_NO_CACHE set, thus avoiding duplication of the complicated
      logic in hash_page() (and possibly making hash_preload a little bit
      faster for the normal case).
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  11. 02 May, 2007 1 commit
    • [POWERPC] Initialise spinlock in the DEBUG_PAGEALLOC code · ed166692
      Michael Ellerman authored
      Fixes:
      
      BUG: spinlock bad magic on CPU#0, swapper/0
       lock: c00000000064ec30, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
      Call Trace:
      [c00000000062b980] [c00000000000f920] .show_stack+0x6c/0x1a0 (unreliable)
      [c00000000062ba20] [c0000000001c2b40] .spin_bug+0xb0/0xd4
      [c00000000062bab0] [c0000000001c2ed0] ._raw_spin_lock+0x44/0x184
      [c00000000062bb50] [c0000000003a42b4] ._spin_lock+0x10/0x24
      [c00000000062bbd0] [c00000000002b4dc] .kernel_map_pages+0x198/0x278
      [c00000000062bc90] [c000000000079720] .free_hot_cold_page+0x124/0x418
      [c00000000062bd70] [c000000000530278] .free_all_bootmem_core+0x14c/0x224
      [c00000000062be50] [c00000000052a178] .mem_init+0x68/0x170
      [c00000000062bee0] [c00000000051d874] .start_kernel+0x2a0/0x37c
      [c00000000062bf90] [c0000000000084c8] .start_here_common+0x54/0x8c
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
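
      The "bad magic" report above is what CONFIG_DEBUG_SPINLOCK prints when a
      lock is taken before it has been initialised; the fix amounts to something
      like the sketch below (the lock name is illustrative).

          #include <linux/spinlock.h>

          /* Statically initialised, so it is valid before mem_init() runs... */
          static DEFINE_SPINLOCK(linear_map_hash_lock);

          /* ...unlike a bare definition such as
           *         static spinlock_t linear_map_hash_lock;
           * which has no valid magic and trips the debug check on first use.
           * (Calling spin_lock_init() from an early init path would also work.) */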
  12. 12 Apr, 2007 2 commits
    • [POWERPC] DEBUG_PAGEALLOC for 64-bit · 370a908d
      Benjamin Herrenschmidt authored
      Here's an implementation of DEBUG_PAGEALLOC for 64 bits powerpc.
      It applies on top of the 32 bits patch.
      
      Unlike Anton's previous attempt, I'm not using updatepp. I'm removing
      the hash entries from the bolted mapping (using a map in RAM of all the
      slots). Expensive but it doesn't really matter, does it? :-)
      
      Hot-added memory doesn't benefit from this unless it's added at an
      address below end_of_DRAM() as calculated at boot time.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      
       arch/powerpc/Kconfig.debug      |    2
       arch/powerpc/mm/hash_utils_64.c |   84 ++++++++++++++++++++++++++++++++++++++--
       2 files changed, 82 insertions(+), 4 deletions(-)
      Signed-off-by: Paul Mackerras <paulus@samba.org>
    • [POWERPC] Allow drivers to map individual 4k pages to userspace · 721151d0
      Paul Mackerras authored
      Some drivers have resources that they want to be able to map into
      userspace that are 4k in size.  On a kernel configured with 64k pages
      we currently end up mapping the 4k we want plus another 60k of
      physical address space, which could contain anything.  This can
      introduce security problems, for example in the case of an infiniband
      adaptor where the other 60k could contain registers that some other
      program is using for its communications.
      
      This patch adds a new function, remap_4k_pfn, which drivers can use to
      map a single 4k page to userspace regardless of whether the kernel is
      using a 4k or a 64k page size.  Like remap_pfn_range, it would
      typically be called in a driver's mmap function.  It only maps a
      single 4k page, which on a 64k page kernel appears replicated 16 times
      throughout a 64k page.  On a 4k page kernel it reduces to a call to
      remap_pfn_range.
      
      The way this works on a 64k kernel is that a new bit, _PAGE_4K_PFN,
      gets set on the linux PTE.  This alters the way that __hash_page_4K
      computes the real address to put in the HPTE.  The RPN field of the
      linux PTE becomes the 4k RPN directly rather than being interpreted as
      a 64k RPN.  Since the RPN field is 32 bits, this means that physical
      addresses being mapped with remap_4k_pfn have to be below 2^44,
      i.e. 0x100000000000.
      
      The patch also factors out the code in arch/powerpc/mm/hash_utils_64.c
      that deals with demoting a process to use 4k pages into one function
      that gets called in the various different places where we need to do
      that.  There were some discrepancies between exactly what was done in
      the various places, such as a call to spu_flush_all_slbs in one case
      but not in others.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
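
      A hedged sketch of the intended usage, following the description above
      (the device, its register address and the lack of error handling are
      invented for illustration):

          #include <linux/fs.h>
          #include <linux/mm.h>

          #define FOO_REG_PHYS  0xf4000000UL      /* assumed 4k register page */

          /* Illustrative driver mmap handler exposing a single 4k page. */
          static int foo_mmap(struct file *file, struct vm_area_struct *vma)
          {
                  vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

                  /* remap_4k_pfn() takes a 4k page frame number even on a 64k
                   * kernel; on a 4k kernel it reduces to remap_pfn_range(). */
                  return remap_4k_pfn(vma, vma->vm_start, FOO_REG_PHYS >> 12,
                                      vma->vm_page_prot);
          }

      On a 64k-page kernel the single 4k page then appears replicated sixteen
      times within the mapped 64k page, as the commit message explains.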
  13. 09 Mar, 2007 1 commit
    • [POWERPC] Fix spu SLB invalidations · 94b2a439
      Benjamin Herrenschmidt authored
      The SPU code doesn't properly invalidate SPUs SLBs when necessary,
      for example when changing a segment size from the hugetlbfs code. In
      addition, it saves and restores the SLB content on context switches
      which makes it harder to properly handle those invalidations.
      
      This patch removes the saving & restoring for now, something more
      efficient might be found later on. It also adds a spu_flush_all_slbs(mm)
      that can be used by the core mm code to flush the SLBs of all SPEs that
      are running a given mm at the time of the flush.
      
      In order to do that, it adds a spinlock to the list of all SPEs and moves
      some bits & pieces from spufs to spu_base.c
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  14. 04 Dec, 2006 1 commit
  15. 30 Jun, 2006 1 commit
  16. 28 Jun, 2006 2 commits
  17. 15 Jun, 2006 1 commit
    • powerpc: Use 64k pages without needing cache-inhibited large pages · bf72aeba
      Paul Mackerras authored
      Some POWER5+ machines can do 64k hardware pages for normal memory but
      not for cache-inhibited pages.  This patch lets us use 64k hardware
      pages for most user processes on such machines (assuming the kernel
      has been configured with CONFIG_PPC_64K_PAGES=y).  User processes
      start out using 64k pages and get switched to 4k pages if they use any
      non-cacheable mappings.
      
      With this, we use 64k pages for the vmalloc region and 4k pages for
      the imalloc region.  If anything creates a non-cacheable mapping in
      the vmalloc region, the vmalloc region will get switched to 4k pages.
      I don't know of any driver other than the DRM that would do this,
      though, and these machines don't have AGP.
      
      When a region gets switched from 64k pages to 4k pages, we do not have
      to clear out all the 64k HPTEs from the hash table immediately.  We
      use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
      was hashed in as a 64k page or a set of 4k pages.  If hash_page is
      trying to insert a 4k page for a Linux PTE and it sees that it has
      already been inserted as a 64k page, it first invalidates the 64k HPTE
      before inserting the 4k HPTE.  The hash invalidation routines also use
      the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
      set of 4k HPTEs to remove.  With those two changes, we can tolerate a
      mix of 4k and 64k HPTEs in the hash table, and they will all get
      removed when the address space is torn down.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
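
      A hedged sketch of the _PAGE_COMBO decision described above;
      invalidate_64k_hpte() and insert_4k_hpte() are placeholders for the real
      hash-insertion paths, and the flow is simplified (no locking or subpage
      bookkeeping).  It assumes a CONFIG_PPC_64K_PAGES build, where _PAGE_COMBO
      is defined.

          #include <linux/mm.h>

          /* Placeholder helpers standing in for the real HPTE insert/remove code. */
          extern void invalidate_64k_hpte(unsigned long ea, pte_t *ptep);
          extern int  insert_4k_hpte(unsigned long ea, pte_t *ptep);

          static int hash_in_4k_subpage(unsigned long ea, pte_t *ptep)
          {
                  if (!(pte_val(*ptep) & _PAGE_COMBO)) {
                          /* The PTE was hashed in as a single 64k HPTE (or not
                           * at all): remove any 64k HPTE before switching this
                           * PTE over to a set of 4k HPTEs. */
                          invalidate_64k_hpte(ea, ptep);
                          *ptep = __pte(pte_val(*ptep) | _PAGE_COMBO);
                  }
                  /* From here on the PTE is backed by 4k HPTEs only, and the
                   * invalidation paths will look for those (per _PAGE_COMBO). */
                  return insert_4k_hpte(ea, ptep);
          }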
  18. 22 Apr, 2006 1 commit
  19. 28 Mar, 2006 1 commit
  20. 22 Mar, 2006 2 commits
    • [PATCH] powerpc: Replace platform_is_lpar() with a firmware feature · 57cfb814
      Michael Ellerman authored
      It has been decreed that platform numbers are evil, so as a step in that
      direction, replace platform_is_lpar() with a FW_FEATURE_LPAR bit.
      
      Currently FW_FEATURE_LPAR really means i/pSeries LPAR; in the future we might
      have to clean that up if we need to be more specific about what LPAR actually
      means. But that's another patch ...
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
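
      Call sites then test the firmware feature bit instead of a platform
      number; firmware_has_feature() and FW_FEATURE_LPAR are the real powerpc
      names, while the surrounding function and the helper it calls are
      illustrative.

          #include <asm/firmware.h>

          extern void do_lpar_only_setup(void);   /* placeholder */

          static void example_setup(void)
          {
                  if (firmware_has_feature(FW_FEATURE_LPAR)) {
                          /* Running in an i/pSeries hypervisor partition. */
                          do_lpar_only_setup();
                  }
          }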
    • [PATCH] powerpc: Unconfuse htab_bolt_mapping() callers · caf80e57
      Michael Ellerman authored
      htab_bolt_mapping() takes a vstart and pstart parameter, but all but one of
      its callers actually pass it vstart and vstart. Luckily before it passes
      paddr (calculated from pstart) to the hpte_insert routines it calls
      virt_to_abs() (aka. __pa()) on the address, so there isn't actually a bug.
      
      map_io_page() however does pass pstart properly, so currently it's broken
      AFAICT because we're calling __pa(paddr) which will get us something very
      large. Presumably no one's calling map_io_page() in the right context.
      
      Anyway, change htab_bolt_mapping() callers to properly pass pstart, and then
      use it properly in htab_bolt_mapping(), ie. don't call __pa() on it again.
      
      Booted on p5 LPAR, iSeries and Power3.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
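
      The corrected calling convention, roughly sketched (the flags argument
      and anything past the first three parameters follow my recollection of
      that era's signature, so treat them as assumptions; the declarations are
      taken to come from the hash MMU headers of the time):

          #include <linux/init.h>

          /* Pass the true physical start as pstart; htab_bolt_mapping() then
           * uses it as-is rather than running __pa() on it a second time. */
          static void __init bolt_linear_example(unsigned long vstart,
                                                 unsigned long vend,
                                                 unsigned long mode_rw)
          {
                  htab_bolt_mapping(vstart, vend, __pa(vstart),
                                    mode_rw, mmu_linear_psize);
          }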
  21. 28 Feb, 2006 1 commit
    • [PATCH] powerpc/iseries: Fix double phys_to_abs bug in htab_bolt_mapping · 56ec6462
      Michael Ellerman authored
      Before the merge I updated create_pte_mapping() to work for iSeries, by
      calling iSeries_hpte_bolt_or_insert. (4c55130b)
      
      Later we changed iSeries_hpte_insert to cope with the bolting case, and called
      that instead from create_pte_mapping() (which was renamed to htab_bolt_mapping)
      (3c726f8d).
      
      Unfortunately that change introduced a subtle bug, where we pass an absolute
      address to iSeries_hpte_insert() where it expects a physical address. This
      leads to us calling phys_to_abs() twice on the physical address, which is
      seriously bogus.
      
      This only causes a problem if the absolute address from the first translation
      can be looked up again in the chunk_map, which depends on the size and layout
      of memory. I've seen it fail on one box, but not others.
      
      The minimal fix is to pass the physical address to iSeries_hpte_insert(). For
      2.6.17 we should make phys_to_abs() BUG if we try to double-translate an
      address.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
  22. 24 Feb, 2006 1 commit
  23. 07 Feb, 2006 1 commit
    • [PATCH] powerpc: Always panic if lmb_alloc() fails · d7a5b2ff
      Michael Ellerman authored
      Currently most callers of lmb_alloc() don't check if it worked or not; if it
      ever fails, weird bad things will probably happen. The few callers who do check
      just panic or BUG_ON.
      
      So make lmb_alloc() panic internally, to catch bugs at the source. The few
      callers who did check the result no longer need to.
      
      The only caller that did anything interesting with the return result was
      careful_allocation(). For it we create __lmb_alloc_base() which _doesn't_ panic
      automatically, a little messy, but passable.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
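
      A hedged sketch of the resulting split (the panic message and the
      LMB_ALLOC_ANYWHERE constant are from memory of that era's lmb allocator,
      so treat them as assumptions):

          #include <linux/init.h>
          #include <linux/kernel.h>

          /* Panics on failure, so callers no longer need to check the result. */
          unsigned long __init lmb_alloc(unsigned long size, unsigned long align)
          {
                  unsigned long addr;

                  addr = __lmb_alloc_base(size, align, LMB_ALLOC_ANYWHERE);
                  if (addr == 0)
                          panic("lmb_alloc: failed to allocate %lu bytes\n", size);
                  return addr;
          }

          /* __lmb_alloc_base() keeps the old "return 0 on failure" contract for
           * the one caller (careful_allocation()) that handles failure itself. */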
  24. 09 Jan, 2006 3 commits
  25. 29 Dec, 2005 1 commit
  26. 09 Dec, 2005 1 commit
    • [PATCH] powerpc: Add missing icache flushes for hugepages · cbf52afd
      David Gibson authored
      On most powerpc CPUs, the dcache and icache are not coherent so
      between writing and executing a page, the caches must be flushed.
      Userspace programs assume pages given to them by the kernel are icache
      clean, so we must do this flush between the kernel clearing a page and
      it being mapped into userspace for execute.  We were not doing this
      for hugepages; this patch corrects the situation.
      
      We use the same lazy mechanism as we use for normal pages, delaying
      the flush until userspace actually attempts to execute from the page
      in question.
      
      Tested on G5.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
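
      For normal pages the lazy mechanism referred to above works roughly as
      sketched below: PG_arch_1 records that a page is already icache-clean,
      and the flush only happens the first time execution faults on the page.
      The hugepage fix applies the same test to each huge page; the function
      here is illustrative, not the kernel's.

          #include <linux/mm.h>
          #include <asm/cacheflush.h>

          /* Flush only if the page has not been marked icache-clean since the
           * kernel last wrote to it. */
          static void make_icache_clean(struct page *page)
          {
                  if (!test_bit(PG_arch_1, &page->flags)) {
                          flush_dcache_icache_page(page);
                          set_bit(PG_arch_1, &page->flags);
                  }
          }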
  27. 10 Nov, 2005 1 commit
  28. 08 Nov, 2005 2 commits
  29. 07 Nov, 2005 1 commit
    • [PATCH] ppc64: Fix bug in SLB miss handler for hugepages · 7d24f0b8
      David Gibson authored
      This patch, however, should be applied on top of the 64k-page-size patch to
      fix some problems with hugepages (some pre-existing, another introduced by
      that patch).
      
      The patch fixes a bug in the SLB miss handler for hugepages on ppc64
      introduced by the dynamic hugepage patch (commit id
      c594adad) due to a misunderstanding of the
      srd instruction's behaviour (mea culpa).  The problem arises when a 64-bit
      process maps some hugepages in the low 4GB of the address space (unusual).
      In this case, as well as the 256M segment in question being marked for
      hugepages, other segments at 32G intervals will be incorrectly marked for
      hugepages.
      
      In the process, this patch tweaks the semantics of the hugepage bitmaps to
      be more sensible.  Previously, an address below 4G was marked for hugepages
      if the appropriate segment bit in the "low areas" bitmask was set *or* if
      the low bit in the "high areas" bitmap was set (which would mark all
      addresses below 1TB for hugepage).  With this patch, any given address is
      governed by a single bitmap.  Addresses below 4GB are marked for hugepage
      if and only if their bit is set in the "low areas" bitmap (256M
      granularity).  Addresses between 4GB and 1TB are marked for hugepage iff
      the low bit in the "high areas" bitmap is set.  Higher addresses are marked
      for hugepage iff their bit in the "high areas" bitmap is set (1TB
      granularity).
      
      To avoid conflicts, this patch must be applied on top of BenH's pending
      patch for 64k base page size [0].  As such, this patch also addresses a
      hugepage problem introduced by that patch.  That patch allows hugepages of
      1MB in size on hardware which supports it, however, that won't work when
      using 4k pages (4 level pagetable), because in that case hugepage PTEs are
      stored at the PMD level, and each PMD entry maps 2MB.  This patch simply
      disallows hugepages in that case (we can do something cleverer to re-enable
      them some other day).
      
      Built, booted, and a handful of hugepage related tests passed on POWER5
      LPAR (both ARCH=powerpc and ARCH=ppc64).
      
      [0] http://gate.crashing.org/~benh/ppc64-64k-pages.diff
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
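
      A small user-space model of the bitmap semantics described above (the
      struct, field names and fixed 16-bit widths are mine; the kernel's
      low/high area masks differ in detail):

          #include <stdint.h>

          struct huge_areas {
                  uint16_t low;   /* one bit per 256MB segment below 4GB    */
                  uint16_t high;  /* one bit per 1TB area; bit 0 = 4GB..1TB */
          };

          static int addr_is_hugepage_area(const struct huge_areas *a, uint64_t ea)
          {
                  if (ea < (1ULL << 32))          /* below 4GB: 256MB granularity */
                          return (a->low >> (ea >> 28)) & 1;
                  if (ea < (1ULL << 40))          /* 4GB..1TB: governed by bit 0  */
                          return a->high & 1;
                  return (a->high >> (ea >> 40)) & 1;   /* >= 1TB: 1TB granularity */
          }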