1. 06 Sep, 2012 3 commits
    • staging: ramster: place ramster codebase on top of new zcache2 codebase · 14c9fda5
      Dan Magenheimer authored
      [V2: rebased to apply to 20120905 staging-next, no other changes]
      
      This slightly modified ramster codebase is now built entirely on zcache2
      and all ramster-specific code is fully contained in a subdirectory.
      
      Ramster extends zcache2 to allow pages compressed via zcache2 to be
      "load-balanced" across machines in a cluster.  Control and data communication
      is done via kernel sockets, and cluster configuration and management is
      heavily leveraged from the ocfs2 cluster filesystem.
      
      There are no new features relative to the codebase introduced into staging in 3.4.
      Some cleanup was performed though:
       1) Interfaces directly with new zbud
       2) Debugfs now used instead of sysfs where possible.  Sysfs still
          used where necessary for userland cluster configuration.
      
      Ramster is very much a work in progress, but it really does work!
      
      RAMSTER HIGH LEVEL OVERVIEW (from original V5 posting in Feb 2012)
      
      RAMster implements peer-to-peer transcendent memory, allowing a "cluster" of
      kernels to dynamically pool their RAM so that a RAM-hungry workload on one
      machine can temporarily and transparently utilize RAM on another machine which
      is presumably idle or running a non-RAM-hungry workload.  Other than the
      already-merged cleancache patchset and frontswap patchset, no core kernel
      changes are currently required.
      
      (Note that, unlike previous public descriptions of RAMster, this implementation
      does NOT require synchronous "gets" or core networking changes. As of V5,
      it also co-exists with ocfs2.)
      
      RAMster combines a clustering and messaging foundation based on the ocfs2
      cluster layer with the in-kernel compression implementation of zcache2, and
      adds code to glue them together.  When a page is "put" to RAMster, it is
      compressed and stored locally.  Periodically, a thread will "remotify" these
      pages by sending them via messages to a remote machine.  When the page is
      later needed as indicated by a page fault, a "get" is issued.  If the data
      is local, it is uncompressed and the fault is resolved.  If the data is
      remote, a message is sent to fetch the data and the faulting thread sleeps;
      when the data arrives, the thread awakens, the data is decompressed and
      the fault is resolved.
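
The put / remotify / get flow described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: every name here (`struct node`, `put`, `remotify`, `get`, `NKEYS`, `PAGE_BYTES`) is hypothetical and not taken from the ramster source, compression is omitted, and the network message and the sleeping thread are replaced by a direct memory copy between two in-process "nodes".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NKEYS      4   /* hypothetical: number of page keys in this toy model */
#define PAGE_BYTES 8   /* hypothetical: toy page size, for illustration only */

enum where { ABSENT, LOCAL, REMOTE };

struct node {
    enum where state[NKEYS];       /* where each page this node put now lives */
    char own[NKEYS][PAGE_BYTES];   /* locally stored pages (compression omitted) */
    char held[NKEYS][PAGE_BYTES];  /* pages held on behalf of a peer */
};

/* "put": store the page locally; `page` must point to PAGE_BYTES bytes. */
static void put(struct node *n, int key, const char *page)
{
    assert(key >= 0 && key < NKEYS);
    memcpy(n->own[key], page, PAGE_BYTES);
    n->state[key] = LOCAL;
}

/* Periodic "remotify" pass: ship every locally stored page to the peer.
 * Returns the number of pages moved. */
static int remotify(struct node *n, struct node *peer)
{
    int moved = 0;

    for (int key = 0; key < NKEYS; key++) {
        if (n->state[key] != LOCAL)
            continue;
        memcpy(peer->held[key], n->own[key], PAGE_BYTES); /* models the message */
        memset(n->own[key], 0, PAGE_BYTES);               /* local copy released */
        n->state[key] = REMOTE;
        moved++;
    }
    return moved;
}

/* "get": resolve from the local store if possible, otherwise from the peer.
 * Returns false if the page was never put. */
static bool get(const struct node *n, const struct node *peer, int key, char *out)
{
    assert(key >= 0 && key < NKEYS);
    switch (n->state[key]) {
    case LOCAL:
        memcpy(out, n->own[key], PAGE_BYTES);
        return true;
    case REMOTE:
        /* In the design described above, the faulting thread would sleep
         * here until the reply message arrives. */
        memcpy(out, peer->held[key], PAGE_BYTES);
        return true;
    default:
        return false;
    }
}
```

In this model a `get` returns the same bytes whether it runs before or after `remotify`; only the place the data is read from changes, which is the transparency property the overview describes.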
      
      As of V5, clusters of up to eight nodes are supported; each node can remotify
      pages to one specified node, so clusters can be configured as clients to
      a "memory server".  Some simple policy is in place that will need to be
      refined over time.  Larger clusters and fault-resistant protocols can also
      be added over time.
      
      A HOW-TO is available at:
      http://oss.oracle.com/projects/tmem/dist/files/RAMster/HOWTO-120817
      
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: ramster: move to new zcache2 codebase · faca2ef7
      Dan Magenheimer authored
      [V2: rebased to apply to 20120905 staging-next, no other changes]
      
      The original zcache in staging is a "demo" version, and this is a massive
      rewrite.  This was intended to result in a merged zcache and ramster, but
      that option has been blocked.  So, to continue forward progress on ramster
      and future related projects, only ramster moves to the new codebase.
      To differentiate between the old demo zcache and the rewrite, we refer
      to the latter as zcache2, config'd as CONFIG_ZCACHE2.  Zcache and zcache2
      cannot be built in the same kernel, so CONFIG_ZCACHE2 implies !CONFIG_ZCACHE.
      
      This developer still has hope that zcache and zcache2 will be merged
      into one codebase.  Until then, zcache2 can be considered a one-node
      version of ramster.
      
      No history of changes was recorded during the zcache2 rewrite and recreating
      a sane one would be a Sisyphean task but, since ramster is still in
      staging and has been unchanged since it was merged, presumably this
      is acceptable.
      
      This commit also provides the hooks in zcache2 for ramster, but all
      ramster-specific code is provided in a separate commit.
      
      Some of the highlights of this rewritten codebase for zcache2:
      (Note: If you are not familiar with the tmem terminology, you can review
      it here: http://lwn.net/Articles/454795/ )
       1. Merge of "demo" zcache and the v1.1 version of zcache in ramster.  Zcache
          and ramster had a great deal of duplicate code which is now merged.
          In essence, zcache2 *is* ramster with no remote machine available,
          although !CONFIG_RAMSTER avoids compiling much of the ramster-specific code.
       2. Allocator.  Previously, persistent pools used zsmalloc and ephemeral pools
          used zbud.  Now a completely rewritten zbud is used for both.  Notably
          this zbud maintains all persistent (frontswap) and ephemeral (cleancache)
          pageframes in separate queues in LRU order.
       3. Interaction with page allocator.  Zbud does no page allocation/freeing;
          it is done entirely in zcache2 where it can be tracked more effectively.
       4. Better pre-allocation.  Previously, on put, if a new pageframe could not be
          pre-allocated, the put would fail, even if the allocator had plenty of
          partial pages where the data could be stored; this is now fixed.
       5. Ouroboros ("eating its own tail") allocation.  If no pageframe can be
          allocated AND no partial pages are available, the least-recently-used
          ephemeral pageframe is reclaimed immediately (including flushing tmem
          pointers to it) and re-used.  This ensures that most-recently-used
          cleancache pages are more likely to be retained than LRU pages and also
          that, as in the core mm subsystem, anonymous pages have a higher priority
          than clean page cache pages.
       6. Zcache and zbud now use debugfs instead of sysfs.  Ramster uses debugfs
          where possible and sysfs where necessary.  (Some ramster configuration
          is done from userspace so some sysfs is necessary.)
       7. Modularization.  As some have observed, the monolithic zcache-main.c code
          included zbud code, which has now been separated into its own code module.
          Much ramster-specific code in the old ramster zcache-main.c has also been
          moved into ramster.c so that it does not get compiled with !CONFIG_RAMSTER.
       8. Rebased to 3.5.
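
The allocation fallback described in items 4 and 5 above can be sketched as a decision function. This is a minimal user-space illustration, not the zcache2 or zbud code: all names (`struct pool_state`, `place_page`, `enum source`, `MAX_EPHEMERAL`) are hypothetical, the bookkeeping is simplified to counters, and two points are assumptions rather than facts from the commit message: that a fresh pageframe is tried before a partial page, and that the put fails when no ephemeral pageframe is left to reclaim.

```c
#include <assert.h>

#define MAX_EPHEMERAL 8   /* hypothetical capacity of the toy LRU queue */

enum source {
    FROM_NEW_PAGEFRAME,        /* page allocator supplied a fresh pageframe */
    FROM_PARTIAL_PAGE,         /* room found in a partially filled pageframe */
    FROM_RECLAIMED_EPHEMERAL,  /* "Ouroboros": LRU ephemeral pageframe reused */
    PUT_FAILS                  /* assumption: nothing left to reclaim */
};

struct pool_state {
    int free_pageframes;               /* pageframes the page allocator can still supply */
    int partial_pages;                 /* pageframes with room for more data */
    int ephemeral_lru[MAX_EPHEMERAL];  /* ephemeral pageframe ids, index 0 = LRU */
    int ephemeral_count;
};

/* Decide where the data for a new put goes. On an Ouroboros reclaim,
 * *reclaimed_id receives the id of the evicted ephemeral pageframe;
 * otherwise it is set to -1. */
static enum source place_page(struct pool_state *s, int *reclaimed_id)
{
    assert(s->ephemeral_count >= 0 && s->ephemeral_count <= MAX_EPHEMERAL);
    *reclaimed_id = -1;

    if (s->free_pageframes > 0) {
        s->free_pageframes--;
        return FROM_NEW_PAGEFRAME;
    }
    if (s->partial_pages > 0) {
        /* Item 4: a failed pageframe allocation no longer fails the put. */
        s->partial_pages--;
        return FROM_PARTIAL_PAGE;
    }
    if (s->ephemeral_count > 0) {
        /* Item 5: evict the least-recently-used ephemeral pageframe and
         * reuse it immediately; the rest of the queue moves up one slot. */
        *reclaimed_id = s->ephemeral_lru[0];
        for (int i = 1; i < s->ephemeral_count; i++)
            s->ephemeral_lru[i - 1] = s->ephemeral_lru[i];
        s->ephemeral_count--;
        return FROM_RECLAIMED_EPHEMERAL;
    }
    return PUT_FAILS;
}
```

Because only the ephemeral queue is ever evicted from its LRU end, recently used cleancache pages survive longer than old ones, and persistent (frontswap) data is never displaced by this path, matching the priority ordering stated in item 5.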
      
      This new codebase also provides hooks for several future new features:
       A. WasActive patch, which requires some mm/frontswap changes previously
          posted.  A new version of this patch will be provided separately.
          See ifdef __PG_WAS_ACTIVE.
       B. Exclusive gets.  It seems tmem _can_ support exclusive gets with a
          minor change to zcache2 and a small backwards-compatible change
          to frontswap.c.  Explanation and frontswap patch will be provided
          separately.  See ifdef FRONTSWAP_HAS_EXCLUSIVE_GETS.
       C. Ouroboros writeback.  Since persistent (frontswap) pages may now also be
          reclaimed in LRU order, the foundation is in place to properly write
          these pages back into the swap cache and then to the swap disk.  This is
          still under development and requires some other mm changes, which are
          prototyped.  See ifdef FRONTSWAP_HAS_UNUSE.
      
      A new feature that desperately needs attention (if someone is looking for
      a way to contribute) is kernel module support.  A preliminary version of
      a patch was posted by Erlangen University and needs to be integrated and
      tested for zcache2 and brought up to kernel standards.
      
      If anybody is interested in helping out with any of these, let me know!
      
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: ramster: remove old driver to prep for new base · c857ce16
      Dan Magenheimer authored
      [V2: rebased to apply to 20120905 staging-next, no other changes]
      
      To prep for moving the ramster codebase on top of the new
      redesigned zcache2 codebase, we remove ramster (along with
      the diverged v1.1 version of zcache that it contains) entirely.
      
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 05 Sep, 2012 10 commits
  3. 04 Sep, 2012 27 commits