Commit 5c401567 authored by ben

First checkin


git-svn-id: http://svn.savannah.nongnu.org/svn/rdiff-backup@2 2b77aa54-bcbc-44c9-a7ec-4f6cf2b41109
parent 2456877b
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>rdiff-backup FAQ</title>
</head>
<body>
<h1>rdiff-backup FAQ</h1>
<h2>Table of contents</h2>
<ol><li><a href="#__future__">When I try to run rdiff-backup it says
"ImportError: No module named __future__" or "SyntaxError: invalid
syntax". What's happening?</a></li>
<li><a href="#verbosity">What do the different verbosity levels mean?</a></li>
<li><a href="#windows">Does rdiff-backup run under Windows?</a></li>
</ol>
<h2>FAQ</h2>
<ol>
<a name="__future__">
<li><strong>When I try to run rdiff-backup it says "ImportError: No
module named __future__" or "SyntaxError: invalid syntax". What's
happening?</strong>
<P>rdiff-backup versions 0.2.x require Python version 2.1 or later,
and versions 0.3.x require Python version 2.2 or later. If you don't
know what version of python you are running, type in "python -V" from
the shell. I'm sorry if this is inconvenient, but rdiff-backup uses
generators, iterators, nested scoping, and static/class methods
extensively, and these were only added in version 2.2.
<P>If you have two versions of python installed, and running "python"
defaults to an early version, you'll probably have to change the first
line of the rdiff-backup script. For instance, you could set it to:
<pre>
#!/usr/bin/env python2.2
</pre>
</li>
<a name="verbosity">
<li><strong>What do the different verbosity levels mean?</strong>
<P>There is no formal specification, but here is a rough description
(settings are always cumulative, so 5 displays everything 4 does):
<P>
<table cellspacing="10">
<tr><td>0</td><td>No information given</td></tr>
<tr><td>1</td><td>Fatal Errors displayed</td></tr>
<tr><td>2</td><td>Warnings</td></tr>
<tr><td>3</td><td>Important messages, and maybe later some global statistics (default)</td></tr>
<tr><td>4</td><td>Some global settings, miscellaneous messages</td></tr>
<tr><td>5</td><td>Mentions which files were changed</td></tr>
<tr><td>6</td><td>More information on each file processed</td></tr>
<tr><td>7</td><td>More information on various things</td></tr>
<tr><td>8</td><td>All logging is dated</td></tr>
<tr><td>9</td><td>Details on which objects are moving across the connection</td></tr>
</table>
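<P>For example, assuming the <strong>-v</strong> switch sets the
verbosity level (a hypothetical invocation sketch), level 5 will
mention each changed file during a backup:
<pre>
rdiff-backup -v5 /home/user /mnt/backup/user
</pre>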
</li>
<a name="windows">
<li><strong>Does rdiff-backup run under Windows?</strong>
<P>Yes, apparently it is possible. First, follow Jason Piterak's
instructions:
<pre>
Subject: Cygwin rdiff-backup
From: Jason Piterak &lt;Jason_Piterak@c-i-s.com&gt;
Date: Mon, 4 Feb 2002 16:54:24 -0500 (13:54 PST)
To: rdiff-backup@keywest.Stanford.EDU
Hello all,
On a lark, I thought I would attempt to get rdiff-backup to work under
Windows98 under Cygwin. We have a number of NT/Win2K servers in the field
that I'd love to be backing up via rdiff-backup, and this was the start of
getting that working.
SUMMARY:
o You can get all the pieces for rdiff-backup working under Cygwin.
o The backup process works up to the point of writing any files with
timestamps.
... This is because the ':' character is reserved for Alternate Data
Stream (ADS) file designations under NTFS.
HOW TO GET IT WORKING (to a point, anyway):
o Install Cygwin
o Download the Python 2.2 update through the Cygwin installer and install.
o Download the librsync libraries from the usual place, but before
compiling...
o Cygwin does not use/provide glibc. Because of this, you have to repoint
some header files in the Makefile:
-- Make sure that you have /usr/include/inttypes.h
redirected to /usr/include/sys/types.h. Do this by:
create a file /usr/include/inttypes.h with the contents:
#include &lt;sys/types.h&gt;
o Put rdiff-backup in your PATH, as you normally would.
</pre>
Then, whenever you use rdiff-backup (or at least if you are backing up
to or restoring from a Windows system), use the <strong>--windows-time-format</strong>
switch, which will tell rdiff-backup not to put a colon (":") in a
filename (this option was added after Jason posted his message).
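<P>For example, backing up to a Windows system might then look like
this (hypothetical paths, using rdiff-backup's host::path syntax):
<pre>
rdiff-backup --windows-time-format /some/local-dir user@winhost::/cygdrive/c/backup
</pre>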
</li>
</ol>
<hr>
<a href="http://www.stanford.edu/~bescoto">Ben Escoto</a> <address><a href="mailto:bescoto@stanford.edu">&lt;bescoto@stanford.edu&gt;</a></address>
<!-- Created: Fri Sep 7 15:34:45 PDT 2001 -->
<!-- hhmts start -->
Last modified: Sat Mar 16 13:22:34 PST 2002
<!-- hhmts end -->
</body>
</html>
Thank you for trying rdiff-backup.
Remember that you must have Python 2.2 or later and librsync installed
(this means that "python" and "rdiff" should be in your path). To
download, see http://www.python.org and
http://sourceforge.net/projects/rproxy/ respectively.
For remote operation, rdiff-backup should be installed and in the
PATH on remote system(s) (see man page for more information).
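For example, a hypothetical remote backup over ssh (assuming
rdiff-backup is in the remote PATH) could look like:

    rdiff-backup /some/local-dir user@hostname.net::/some/remote-dir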
If you have the above installed, and it still doesn't work, contact
Ben Escoto <bescoto@stanford.edu>, or post to the mailing list (see
web page at http://www.stanford.edu/~bescoto/rdiff-backup for more
information).
Accept a list of files??
Security audit
hardlinks
Don't produce stack trace which looks like crash/include file name in
logging stats
#!/usr/bin/env python

import os, re, shutil, time

filelist = ["rdiff-backup", "CHANGELOG", "COPYING", "README", "FAQ.html"]

# Various details about the files must also be specified by the rpm
# spec template.
spec_template = "rdiff-backup.spec"

def GetVersion():
    """Return version string by reading in ./rdiff-backup"""
    fp = open("rdiff-backup", "r")
    match = re.search("Version (.*?) ", fp.read())
    fp.close()
    return match.group(1)

def CopyMan(destination, version):
    """Create updated man page at the specified location"""
    fp = open(destination, "w")
    date = time.strftime("%B %Y", time.localtime(time.time()))
    version = "Version "+version
    firstline = ('.TH RDIFF-BACKUP 1 "%s" "%s" "User Manuals"\n' %
                 (date, version))
    fp.write(firstline)
    infp = open("rdiff-backup.1", "r")
    infp.readline()
    fp.write(infp.read())
    fp.close()
    infp.close()

def MakeTar(version):
    """Create rdiff-backup tar file"""
    tardir = "rdiff-backup-%s" % version
    tarfile = "rdiff-backup-%s.tar.gz" % version
    os.mkdir(tardir)
    for file in filelist: shutil.copyfile(file, os.path.join(tardir, file))
    os.chmod(os.path.join(tardir, "rdiff-backup"), 0755)
    CopyMan(os.path.join(tardir, "rdiff-backup.1"), version)
    os.system("tar -cvzf %s %s" % (tarfile, tardir))
    shutil.rmtree(tardir)
    return tarfile

def MakeSpecFile(version):
    """Create spec file using spec template"""
    specfile = "rdiff-backup-%s-1.spec" % version
    outfp = open(specfile, "w")
    outfp.write("Version: %s\n" % version)
    infp = open(spec_template, "r")
    outfp.write(infp.read())
    infp.close()
    outfp.close()
    return specfile

def Main():
    assert not os.system("./Make")
    version = GetVersion()
    print "Processing version " + version
    tarfile = MakeTar(version)
    print "Made tar file " + tarfile
    specfile = MakeSpecFile(version)
    print "Made specfile " + specfile

if __name__ == "__main__": Main()
#!/usr/bin/env python

import os, sys, re

def GetVersion():
    """Return version string by reading in ./rdiff-backup"""
    fp = open("rdiff-backup", "r")
    match = re.search("Version (.*?) ", fp.read())
    fp.close()
    return match.group(1)

if len(sys.argv) == 1:
    specfile = "rdiff-backup-%s-1.spec" % GetVersion()
    print "Using specfile %s" % specfile
elif len(sys.argv) == 2:
    specfile = sys.argv[1]
    print "Using specfile %s" % specfile
else:
    print ("%s takes zero or one argument, the name of the rpm spec "
           "file" % sys.argv[0])
    sys.exit(1)

base = ".".join(specfile.split(".")[:-1])
srcrpm = base+".src.rpm"
noarchrpm = base+".noarch.rpm"
tarfile = "-".join(base.split("-")[:-1]) + ".tar.gz"

os.system("install -o root -g root -m 644 %s /usr/src/redhat/SOURCES" %
          tarfile)
os.system("rpm -ba --sign -vv --target noarch " + specfile)
#os.system("install -o ben -g ben -m 644 /usr/src/redhat/SRPMS/%s ." % srcrpm)
os.system("install -o ben -g ben -m 644 /usr/src/redhat/RPMS/noarch/%s ." %
          noarchrpm)
#!/usr/bin/env python

import sys, os

def RunCommand(cmd):
    print cmd
    os.system(cmd)

if not sys.argv[1:]:
    print 'Call with version number, as in "./makeweb 0.3.1"'
    sys.exit(1)

version = sys.argv[1]
webprefix = "/home/ben/misc/html/mirror/rdiff-backup/"

RunCommand("cp *%s* %s" % (version, webprefix))
RunCommand("rman -f html -r '' rdiff-backup.1 > %srdiff-backup.1.html"
           % webprefix)
RunCommand("cp FAQ.html CHANGELOG %s" % webprefix)

os.chdir(webprefix)
print "cd ", webprefix
RunCommand("rm latest latest.rpm latest.tar.gz")
RunCommand("ln -s *rpm latest.rpm")
RunCommand("ln -s *tar.gz latest.tar.gz")
Summary: A backup prog that combines mirroring with incremental backup
Name: rdiff-backup
Release: 1
URL: http://www.stanford.edu/~bescoto/rdiff-backup/
Source: %{name}-%{version}.tar.gz
Copyright: GPL
Group: Applications/Archiving
BuildRoot: %{_tmppath}/%{name}-root
Requires: librsync, python >= 2.2
%description
rdiff-backup is a script, written in Python, that backs up one
directory to another and is intended to be run periodically (nightly
from cron for instance). The target directory ends up a copy of the
source directory, but extra reverse diffs are stored in the target
directory, so you can still recover files lost some time ago. The idea
is to combine the best features of a mirror and an incremental
backup. rdiff-backup can also operate in a bandwidth efficient manner
over a pipe, like rsync. Thus you can use rdiff-backup and ssh to
securely back a hard drive up to a remote location, and only the
differences from the previous backup will be transmitted.
%prep
%setup
%build
%install
rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/usr/bin
mkdir -p $RPM_BUILD_ROOT/usr/share/man/man1
install -m 755 rdiff-backup $RPM_BUILD_ROOT/usr/bin/rdiff-backup
install -m 644 rdiff-backup.1 $RPM_BUILD_ROOT/usr/share/man/man1/rdiff-backup.1
%clean
%files
%defattr(-,root,root)
/usr/bin/rdiff-backup
/usr/share/man/man1/rdiff-backup.1.gz
%doc CHANGELOG COPYING README FAQ.html
%changelog
* Sun Nov 4 2001 Ben Escoto <bescoto@stanford.edu>
- Initial RPM
#!/usr/bin/env python

from __future__ import generators
import sys, os, stat

def usage():
    print "Usage: find2dirs dir1 dir2"
    print
    print "Given the name of two directories, list all the files in both, one"
    print "per line, but don't repeat a file even if it is in both directories"
    sys.exit(1)

def getlist(base, ext = ""):
    """Return iterator yielding filenames from directory"""
    if ext: yield ext
    else: yield "."
    fullname = os.path.join(base, ext)
    if stat.S_ISDIR(stat.S_IFMT(os.lstat(fullname)[stat.ST_MODE])):
        for subfile in os.listdir(fullname):
            for fn in getlist(base, os.path.join(ext, subfile)): yield fn

def main(dir1, dir2):
    d = {}
    for fn in getlist(dir1): d[fn] = 1
    for fn in getlist(dir2): d[fn] = 1
    for fn in d.keys(): print fn

if not len(sys.argv) == 3: usage()
else: main(sys.argv[1], sys.argv[2])
#!/usr/bin/env python

"""init_smallfiles.py

This program makes a number of files of the given size in the
specified directory.

"""

import os, stat, sys, math

if len(sys.argv) > 5 or len(sys.argv) < 4:
    print "Usage: init_files [directory name] [file size] [file count] [base]"
    print
    print "Creates file_count files in directory_name of size file_size."
    print "The created directory has a tree type structure where each level"
    print "has at most base files or directories in it.  Default is 50."
    sys.exit(1)

dirname = sys.argv[1]
filesize = int(sys.argv[2])
filecount = int(sys.argv[3])
block_size = 16384
block = "." * block_size
block_change = "." * (filesize % block_size)
if len(sys.argv) == 4: base = 50
else: base = int(sys.argv[4])

def make_file(path):
    """Make the file at path"""
    fp = open(path, "w")
    for i in xrange(int(math.floor(filesize/block_size))): fp.write(block)
    fp.write(block_change)
    fp.close()

def find_sublevels(count):
    """Return number of sublevels required for count files"""
    return int(math.ceil(math.log(count)/math.log(base)))

def make_dir(dir, count):
    """Make count files in the directory, making subdirectories if necessary"""
    print "Making directory %s with %d files" % (dir, count)
    os.mkdir(dir)
    level = find_sublevels(count)
    assert count <= pow(base, level)
    if level == 1:
        for i in range(count): make_file(os.path.join(dir, "file%d" % i))
    else:
        files_per_subdir = pow(base, level-1)
        full_dirs = int(count/files_per_subdir)
        assert full_dirs <= base
        for i in range(full_dirs):
            make_dir(os.path.join(dir, "subdir%d" % i), files_per_subdir)

        change = count - full_dirs*files_per_subdir
        assert change >= 0
        if change > 0:
            make_dir(os.path.join(dir, "subdir%d" % full_dirs), change)

def start(dir):
    try: os.stat(dir)
    except os.error: pass
    else:
        print "Directory %s already exists, exiting." % dir
        sys.exit(1)
    make_dir(dirname, filecount)

start(dirname)
#!/usr/bin/python

import sys, os

curdir = os.getcwd()
os.chdir("../src")
execfile("destructive_stepping.py")
os.chdir(curdir)

lc = Globals.local_connection

for filename in sys.argv[1:]:
    #print "Deleting %s" % filename
    rp = RPath(lc, filename)
    if rp.lstat(): rp.delete()
#!/usr/bin/env python

"""remove-comments.py

Given a python program on standard input, spit one out on stdout that
should work the same, but has blank and comment lines removed.

"""

import sys, re

triple_regex = re.compile('"""')

def eattriple(initial_line_stripped):
    """Keep reading until end of doc string"""
    assert initial_line_stripped.startswith('"""')
    if triple_regex.search(initial_line_stripped[3:]): return
    while 1:
        line = sys.stdin.readline()
        if not line or triple_regex.search(line): break

while 1:
    line = sys.stdin.readline()
    if not line: break
    stripped = line.strip()
    if not stripped: continue
    if stripped[0] == "#": continue
    if stripped.startswith('"""'):
        eattriple(stripped)
        continue
    sys.stdout.write(line)
from __future__ import generators
execfile("rorpiter.py")

#######################################################################
#
# destructive-stepping - Deal with side effects from traversing trees
#

class DSRPath(RPath):
    """Destructive Stepping RPath

    Sometimes when we traverse the directory tree, even when we just
    want to read files, we have to change things, like the permissions
    of a file or directory in order to read it, or the file's access
    times.  This class is like an RPath, but the permission and time
    modifications are delayed, so that they can be done at the very
    end when they won't be disturbed later.

    """
    def __init__(self, *args):
        self.perms_delayed = self.times_delayed = None
        RPath.__init__(self, *args)

    def __getstate__(self):
        """Return picklable state.  See RPath __getstate__."""
        assert self.conn is Globals.local_connection # Can't pickle a conn
        pickle_dict = {}
        for attrib in ['index', 'data', 'perms_delayed', 'times_delayed',
                       'newperms', 'newtimes', 'path', 'base']:
            if self.__dict__.has_key(attrib):
                pickle_dict[attrib] = self.__dict__[attrib]
        return pickle_dict

    def __setstate__(self, pickle_dict):
        """Set state from object produced by getstate"""
        self.conn = Globals.local_connection
        for attrib in pickle_dict.keys():
            self.__dict__[attrib] = pickle_dict[attrib]

    def delay_perm_writes(self):
        """Signal that permission writing should be delayed until the end"""
        self.perms_delayed = 1
        self.newperms = None

    def delay_time_changes(self):
        """Signal that time changes should also be delayed until the end"""
        self.times_delayed = 1
        self.newtimes = None

    def chmod(self, permissions):
        """Change permissions, delaying if self.perms_delayed is set"""
        if self.perms_delayed:
            self.newperms = 1
            self.data['perms'] = permissions
        else: RPath.chmod(self, permissions)

    def chmod_bypass(self, permissions):
        """Change permissions without updating the data dictionary"""
        self.conn.os.chmod(self.path, permissions)
        self.perms_delayed = self.newperms = 1

    def remember_times(self):
        """Mark times as changed so they can be restored later"""
        self.times_delayed = self.newtimes = 1

    def settime(self, accesstime, modtime):
        """Change times, delaying if self.times_delayed is set"""
        if self.times_delayed:
            self.newtimes = 1
            self.data['atime'] = accesstime
            self.data['mtime'] = modtime
        else: RPath.settime(self, accesstime, modtime)

    def settime_bypass(self, accesstime, modtime):
        """Change times without updating data dictionary"""
        self.conn.os.utime(self.path, (accesstime, modtime))

    def setmtime(self, modtime):
        """Change mtime, delaying if self.times_delayed is set"""
        if self.times_delayed:
            self.newtimes = 1
            self.data['mtime'] = modtime
        else: RPath.setmtime(self, modtime)

    def setmtime_bypass(self, modtime):
        """Change mtime without updating data dictionary"""
        self.conn.os.utime(self.path, (time.time(), modtime))

    def restoretimes(self):
        """Write times in self.data back to file"""
        RPath.settime(self, self.data['atime'], self.data['mtime'])

    def restoreperms(self):
        """Write permissions in self.data back to file"""
        RPath.chmod(self, self.data['perms'])

    def write_changes(self):
        """Write saved up permission/time changes"""
        if not self.lstat(): return # File has been deleted in meantime
        if self.perms_delayed and self.newperms:
            self.conn.os.chmod(self.path, self.getperms())
        if self.times_delayed:
            if self.data.has_key('atime'):
                self.settime_bypass(self.getatime(), self.getmtime())
            elif self.newtimes and self.data.has_key('mtime'):
                self.setmtime_bypass(self.getmtime())


class DestructiveStepping:
    """Destructive stepping"""
    def initialize(dsrpath, source):
        """Change permissions of dsrpath, possibly delay writes

        Abort if we need to access something and can't.  If the file
        is on the source partition, just log a warning and return
        true.  Return false if everything is good to go.

        """
        if not source or Globals.change_source_perms:
            dsrpath.delay_perm_writes()

        def warn(err):
            Log("Received error '%s' when dealing with file %s, skipping..."
                % (err, dsrpath.path), 1)

        def abort():
            Log.FatalError("Missing access to file %s - aborting." %
                           dsrpath.path)

        def try_chmod(perms):
            """Try to change the perms.  If fail, return error."""
            try: dsrpath.chmod_bypass(perms)
            except os.error, err: return err
            return None

        if dsrpath.isreg() and not dsrpath.readable():
            if source:
                if Globals.change_source_perms and dsrpath.isowner():
                    err = try_chmod(0400)
                    if err:
                        warn(err)
                        return 1
                else:
                    warn("No read permissions")
                    return 1
            elif not Globals.change_mirror_perms or try_chmod(0600): abort()
        elif dsrpath.isdir():
            if source and (not dsrpath.readable() or not dsrpath.executable()):
                if Globals.change_source_perms and dsrpath.isowner():
                    err = try_chmod(0500)
                    if err:
                        warn(err)
                        return 1
                else:
                    warn("No read or exec permissions")
                    return 1
            elif not source and not dsrpath.hasfullperms():
                if Globals.change_mirror_perms: try_chmod(0700)

        # Permissions above; now try to preserve access times if necessary
        if (source and (Globals.preserve_atime or
                        Globals.change_source_perms) or
            not source):
            # These are the circumstances under which we will have to
            # touch up a file's times after we are done with it
            dsrpath.remember_times()
        return None

    def Finalizer(initial_state = None):
        """Return a finalizer that can work on an iterator of dsrpaths

        The reason we have to use an IterTreeReducer is that some files
        should be updated immediately, but for directories we sometimes
        need to update all the files in the directory before finally
        coming back to it.

        """
        return IterTreeReducer(lambda x: None, lambda x,y: None, None,
                               lambda dsrpath, x, y: dsrpath.write_changes(),
                               initial_state)

    def isexcluded(dsrp, source):
        """Return true if given DSRPath is excluded/ignored

        If source = 1, treat as source file, otherwise treat as
        destination file.

        """
        if Globals.exclude_device_files and dsrp.isdev(): return 1

        if source: exclude_regexps = Globals.exclude_regexps
        else: exclude_regexps = Globals.exclude_mirror_regexps

        for regexp in exclude_regexps:
            if regexp.match(dsrp.path):
                Log("Excluding %s" % dsrp.path, 6)
                return 1
        return None

    def Iterate_from(baserp, source, starting_index = None):
        """Iterate dsrps from baserp, skipping any matching exclude_regexps

        Includes only dsrps with indices greater than starting_index
        if starting_index is not None.

        """
        def helper_starting_from(dsrpath):
            """Like helper, but only start iterating after starting_index"""
            if dsrpath.index > starting_index:
                # Past starting_index, revert to normal helper
                for dsrp in helper(dsrpath): yield dsrp
            elif dsrpath.index == starting_index[:len(dsrpath.index)]:
                # May encounter starting index on this branch
                if (not DestructiveStepping.isexcluded(dsrpath, source) and
                    not DestructiveStepping.initialize(dsrpath, source)):
                    if dsrpath.isdir():
                        dir_listing = dsrpath.listdir()
                        dir_listing.sort()
                        for filename in dir_listing:
                            for dsrp in helper_starting_from(
                                    dsrpath.append(filename)):
                                yield dsrp

        def helper(dsrpath):
            if (not DestructiveStepping.isexcluded(dsrpath, source) and
                not DestructiveStepping.initialize(dsrpath, source)):
                yield dsrpath
                if dsrpath.isdir():
                    dir_listing = dsrpath.listdir()
                    dir_listing.sort()
                    for filename in dir_listing:
                        for dsrp in helper(dsrpath.append(filename)):
                            yield dsrp

        base_dsrpath = DSRPath(baserp.conn, baserp.base,
                               baserp.index, baserp.data)
        if starting_index is None: return helper(base_dsrpath)
        else: return helper_starting_from(base_dsrpath)

    def Iterate_with_Finalizer(baserp, source):
        """Like Iterate_from, but finalize each dsrp afterwards"""
        finalize = DestructiveStepping.Finalizer()
        for dsrp in DestructiveStepping.Iterate_from(baserp, source):
            yield dsrp
            finalize(dsrp)
        finalize.getresult()

MakeStatic(DestructiveStepping)
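
# Usage sketch (hedged): with the connection and RPath machinery that
# the execfile() chain provides, a tree can be traversed while the
# finalizer undoes any temporary permission/time changes made to read
# it.  The path below is hypothetical; source=1 means "treat as the
# source side of the backup".
#
#     root = RPath(Globals.local_connection, "/some/source-dir")
#     for dsrp in DestructiveStepping.Iterate_with_Finalizer(root, 1):
#         print dsrp.path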
from __future__ import generators
execfile("manage.py")

#######################################################################
#
# filelist - Some routines that help with operations over files listed
# in standard input instead of over whole directories.
#

class FilelistError(Exception): pass

class Filelist:
    """Many of these methods have analogs in highlevel.py"""
    def File2Iter(fp, baserp):
        """Convert file obj with one pathname per line into rpiter

        Closes fp when done.  Given files are added to baserp.

        """
        while 1:
            line = fp.readline()
            if not line: break
            if line[-1] == "\n": line = line[:-1] # strip trailing newline
            if not line: continue # skip blank lines
            elif line[0] == "/": raise FilelistError(
                "Read in absolute file name %s." % line)
            yield baserp.append(line)
        assert not fp.close(), "Error closing filelist fp"

    def Mirror(src_rpath, dest_rpath, rpiter):
        """Copy files in fileiter from src_rpath to dest_rpath"""
        sigiter = dest_rpath.conn.Filelist.get_sigs(dest_rpath, rpiter)
        diffiter = Filelist.get_diffs(src_rpath, sigiter)
        dest_rpath.conn.Filelist.patch(dest_rpath, diffiter)
        dest_rpath.setdata()

    def Mirror_and_increment(src_rpath, dest_rpath, inc_rpath, rpiter):
        """Mirror + put increment in tree based at inc_rpath"""
        sigiter = dest_rpath.conn.Filelist.get_sigs(dest_rpath, rpiter)
        diffiter = Filelist.get_diffs(src_rpath, sigiter)
        dest_rpath.conn.Filelist.patch_and_increment(dest_rpath, diffiter,
                                                     inc_rpath)
        dest_rpath.setdata()

    def get_sigs(dest_rpbase, rpiter):
        """Get signatures of file analogs in rpiter

        This is meant to be run on the destination side.  Only the
        extension part of the rps in rpiter will be used; the base is
        ignored.

        """
        def dest_iter(src_iter):
            for src_rp in src_iter: yield dest_rpbase.new_index(src_rp.index)
        return RORPIter.Signatures(dest_iter())

    def get_diffs(src_rpbase, sigiter):
        """Get diffs based on sigiter and files in src_rpbase

        This should be run on the local side.

        """
        for sig_rorp in sigiter:
            new_rp = src_rpbase.new_index(sig_rorp.index)
            yield RORPIter.diffonce(sig_rorp, new_rp)

    def patch(dest_rpbase, diffiter):
        """Process diffs in diffiter and update files in dest_rpbase.

        Run remotely.

        """
        for diff_rorp in diffiter:
            basisrp = dest_rpbase.new_index(diff_rorp.index)
            if not basisrp.lstat(): Filelist.make_subdirs(basisrp)
            Log("Processing %s" % basisrp.path, 7)
            RORPIter.patchonce(dest_rpbase, basisrp, diff_rorp)

    def patch_and_increment(dest_rpbase, diffiter, inc_rpbase):
        """Apply diffs in diffiter to dest_rpbase, and increment to inc_rpbase

        Also to be run remotely.

        """
        for diff_rorp in diffiter:
            basisrp = dest_rpbase.new_index(diff_rorp.index)
            if diff_rorp.lstat(): Filelist.make_subdirs(basisrp)
            Log("Processing %s" % basisrp.path, 7)
            # XXX This isn't done yet...

    def make_subdirs(rpath):
        """Make sure that all the directories under the rpath exist

        This function doesn't try to get the permissions right on the
        underlying directories, just do the minimum to make sure the
        file can be created.

        """
        dirname = rpath.dirsplit()[0]
        if dirname == '.' or dirname == '': return
        dir_rp = RPath(rpath.conn, dirname)
        Filelist.make_subdirs(dir_rp)
        if not dir_rp.lstat(): dir_rp.mkdir()

MakeStatic(Filelist)
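
# Usage sketch (hedged; the list file and base directory below are
# hypothetical).  File2Iter turns one relative pathname per line into
# an iterator of rpaths under baserp:
#
#     fp = open("filelist.txt", "r")
#     baserp = RPath(Globals.local_connection, "/some/dir")
#     for rp in Filelist.File2Iter(fp, baserp):
#         print rp.path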
#!/usr/bin/env python
#
# rdiff-backup -- Mirror files while keeping incremental changes
# Version 0.6.0 released March 14, 2002
# Copyright (C) 2001 Ben Escoto <bescoto@stanford.edu>
#
# This program is licensed under the GNU General Public License (GPL).
# Distributions of rdiff-backup usually include a copy of the GPL in a
# file called COPYING. The GPL is also available online at
# http://www.gnu.org/copyleft/gpl.html.
#
# Please send mail to me or the mailing list if you find bugs or have
# any suggestions.
from __future__ import nested_scopes, generators
import os, stat, time, sys, getopt, re, cPickle, types, shutil, sha, marshal, traceback, popen2, tempfile
execfile("destructive_stepping.py")
#######################################################################
#
# increment - Provides Inc class, which writes increment files
#
# This code is what writes files ending in .diff, .snapshot, etc.
#
class Inc:
"""Class containing increment functions"""
def Increment_action(new, mirror, incpref):
"""Main file incrementing function, returns RobustAction
new is the file on the active partition,
mirror is the mirrored file from the last backup,
incpref is the prefix of the increment file.
This function basically moves mirror -> incpref.
"""
if not (new and new.lstat() or mirror.lstat()):
return Robust.null_action # Files deleted in meantime, do nothing
Log("Incrementing mirror file " + mirror.path, 5)
if ((new and new.isdir()) or mirror.isdir()) and not incpref.isdir():
incpref.mkdir()
if not mirror.lstat(): return Inc.makemissing_action(incpref)
elif mirror.isdir(): return Inc.makedir_action(mirror, incpref)
elif new.isreg() and mirror.isreg():
return Inc.makediff_action(new, mirror, incpref)
else: return Inc.makesnapshot_action(mirror, incpref)
def Increment(new, mirror, incpref):
Inc.Increment_action(new, mirror, incpref).execute()
def makemissing_action(incpref):
"""Signify that mirror file was missing"""
return RobustAction(lambda: None,
Inc.get_inc_ext(incpref, "missing").touch,
lambda exp: None)
def makesnapshot_action(mirror, incpref):
"""Copy mirror to incfile, since new is quite different"""
snapshotrp = Inc.get_inc_ext(incpref, "snapshot")
return Robust.copy_with_attribs_action(mirror, snapshotrp)
def makediff_action(new, mirror, incpref):
"""Make incfile which is a diff new -> mirror"""
diff = Inc.get_inc_ext(incpref, "diff")
return Robust.chain([Rdiff.write_delta_action(new, mirror, diff),
Robust.copy_attribs_action(mirror, diff)])
def makedir_action(mirrordir, incpref):
"""Make file indicating directory mirrordir has changed"""
dirsign = Inc.get_inc_ext(incpref, "dir")
def final():
dirsign.touch()
RPath.copy_attribs(mirrordir, dirsign)
return RobustAction(lambda: None, final, dirsign.delete)
def get_inc_ext(rp, typestr):
"""Return RPath/DSRPath like rp but with inc/time extension
If the file exists, then probably a previous backup has been
aborted. We then keep asking FindTime to get a time later
than the one that already has an inc file.
"""
def get_newinc(timestr):
"""Get new increment rp with given time suffix"""
addtostr = lambda s: "%s.%s.%s" % (s, timestr, typestr)
if rp.index:
return rp.__class__(rp.conn, rp.base, rp.index[:-1] +
(addtostr(rp.index[-1]),))
else: return rp.__class__(rp.conn, addtostr(rp.base), rp.index)
inctime = 0
while 1:
inctime = Resume.FindTime(rp.index, inctime)
incrp = get_newinc(Time.timetostring(inctime))
if not incrp.lstat(): return incrp
def make_patch_increment_ITR(inc_rpath, initial_state = None):
"""Return IterTreeReducer that patches and increments
This has to be an ITR because directories that have files in
them changed are flagged with an increment marker. There are
four possibilities as to the order:
1. Normal file -> Normal file: right away
2. Directory -> Directory: wait until files in the directory
are processed, as we won't know whether to add a marker
until the end.
3. Normal file -> Directory: right away, so later files will
have a directory to go into.
4. Directory -> Normal file: Wait until the end, so we can
process all the files in the directory.
"""
def base_init(indexed_tuple):
"""Patch if appropriate, return (a,b) tuple
a is true if found directory and thus didn't take action
if a is false, b is true if some changes were made
if a is true, b is the rp of a temporary file used to hold
the diff_rorp's data (for dir -> normal file change), and
false if none was necessary.
"""
diff_rorp, dsrp = indexed_tuple
incpref = inc_rpath.new_index(indexed_tuple.index)
if dsrp.isdir(): return init_dir(dsrp, diff_rorp, incpref)
else: return init_non_dir(dsrp, diff_rorp, incpref)
def init_dir(dsrp, diff_rorp, incpref):
"""Initial processing of a directory
Make the corresponding directory right away, but wait
until the end to write the replacement. However, if the
diff_rorp contains data, we must write it locally before
continuing, or else that data will be lost in the stream.
"""
if not (incpref.lstat() and incpref.isdir()): incpref.mkdir()
if diff_rorp and diff_rorp.isreg() and diff_rorp.file:
tf = TempFileManager.new(dsrp)
RPathStatic.copy_with_attribs(diff_rorp, tf)
tf.set_attached_filetype(diff_rorp.get_attached_filetype())
return (1, tf)
else: return (1, None)
def init_non_dir(dsrp, diff_rorp, incpref):
"""Initial processing of non-directory
If a reverse diff is called for it is generated by apply
the forwards diff first on a temporary file.
"""
if diff_rorp:
if dsrp.isreg() and diff_rorp.isreg():
tf = TempFileManager.new(dsrp)
def init_thunk():
Rdiff.patch_with_attribs_action(dsrp, diff_rorp,
tf).execute()
Inc.Increment_action(tf, dsrp, incpref).execute()
Robust.make_tf_robustaction(init_thunk, (tf,),
(dsrp,)).execute()
else:
Robust.chain([Inc.Increment_action(diff_rorp, dsrp,
incpref),
RORPIter.patchonce_action(
None, dsrp, diff_rorp)]).execute()
return (None, 1)
return (None, None)
def base_final(base_tuple, base_init_tuple, changed):
"""Patch directory if not done, return true iff made change"""
if base_init_tuple[0]: # was directory
diff_rorp, dsrp = base_tuple
if changed or diff_rorp:
if base_init_tuple[1]: diff_rorp = base_init_tuple[1]
Inc.Increment(diff_rorp, dsrp,
inc_rpath.new_index(base_tuple.index))
if diff_rorp:
RORPIter.patchonce_action(None, dsrp,
diff_rorp).execute()
if isinstance(diff_rorp, TempFile): diff_rorp.delete()
return 1
return None
else: # changed iff base_init_tuple says it was
return base_init_tuple[1]
return IterTreeReducer(base_init, lambda x,y: x or y, None,
base_final, initial_state)
MakeStatic(Inc)
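
# Naming sketch: get_inc_ext() above produces names of the form
# base.time.type.  Assuming Time.timetostring() yields a W3 datetime
# string (an assumption about ttime.py, which is not shown here),
# increments for file.txt would look like:
#
#     file.txt.2002-03-16T13:22:34-08:00.diff      reverse diff
#     file.txt.2002-03-16T13:22:34-08:00.snapshot  full copy
#     file.txt.2002-03-16T13:22:34-08:00.dir       directory marker
#     file.txt.2002-03-16T13:22:34-08:00.missing   mirror file was absent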
execfile("ttime.py")
import cPickle
#######################################################################
#
# iterfile - Convert an iterator to a file object and vice-versa
#
class IterFileException(Exception): pass
class UnwrapFile:
"""Contains some basic methods for parsing a file containing an iter"""
def __init__(self, file):
self.file = file
def _s2l(self, s):
"""Convert string to long int"""
assert len(s) == 7
l = 0L
for i in range(7): l = l*256 + ord(s[i])
return l
def _get(self):
"""Return pair (type, data) next in line on the file
type is a single character which is either "o" for object, "f"
for file, "c" for a continution of a file, or None if no more
data can be read. Data is either the file's data, if type is
"c" or "f", or the actual object if the type is "o".
"""
header = self.file.read(8)
if not header: return None, None
assert len(header) == 8, "Header is only %d bytes" % len(header)
type, length = header[0], self._s2l(header[1:])
buf = self.file.read(length)
if type == "o": return type, cPickle.loads(buf)
else: return type, buf
class IterWrappingFile(UnwrapFile):
"""An iterator generated from a file.
Initialize with a file type object, and then it will return the
elements of the file in order.
"""
def __init__(self, file):
UnwrapFile.__init__(self, file)
self.currently_in_file = None
def __iter__(self): return self
def next(self):
if self.currently_in_file:
self.currently_in_file.close() # no error checking by this point
type, data = self._get()
if not type: raise StopIteration
if type == "o": return data
elif type == "f":
file = IterVirtualFile(self, data)
if data: self.currently_in_file = file
else: self.currently_in_file = None
return file
else: raise IterFileException("Bad file type %s" % type)
class IterVirtualFile(UnwrapFile):
"""Another version of a pretend file
This is returned by IterWrappingFile when a file is embedded in
the main file that the IterWrappingFile is based around.
"""
def __init__(self, iwf, initial_data):
"""Initializer
initial_data is the data from the first block of the file.
iwf is the iter wrapping file that spawned this
IterVirtualFile.
"""
UnwrapFile.__init__(self, iwf.file)
self.iwf = iwf
self.bufferlist = [initial_data]
self.bufferlen = len(initial_data)
self.closed = None
def check_consistency(self):
l = len("".join(self.bufferlist))
assert l == self.bufferlen, \
"Length of IVF bufferlist doesn't match (%s, %s)" % \
(l, self.bufferlen)
def read(self, length):
assert not self.closed
if self.iwf.currently_in_file:
while length >= self.bufferlen:
if not self.addtobuffer(): break
real_len = min(length, self.bufferlen)
combined_buffer = "".join(self.bufferlist)
assert len(combined_buffer) == self.bufferlen, \
(len(combined_buffer), self.bufferlen)
self.bufferlist = [combined_buffer[real_len:]]
self.bufferlen = self.bufferlen - real_len
return combined_buffer[:real_len]
def addtobuffer(self):
"""Read a chunk from the file and add it to the buffer"""
assert self.iwf.currently_in_file
type, data = self._get()
assert type == "c", "Type is %s instead of c" % type
if data:
self.bufferlen = self.bufferlen + len(data)
self.bufferlist.append(data)
return 1
else:
self.iwf.currently_in_file = None
return None
def close(self):
"""Currently just reads whats left and discards it"""
while self.iwf.currently_in_file:
self.addtobuffer()
self.bufferlist = []
self.bufferlen = 0
self.closed = 1
class FileWrappingIter:
"""A file interface wrapping around an iterator
This is initialized with an iterator, and then converts it into a
stream of characters. The object will evaluate as little of the
iterator as is necessary to provide the requested bytes.
The actual file is a sequence of marshaled objects, each preceded
by 8 bytes which identifies the following the type of object, and
specifies its length. File objects are not marshalled, but the
data is written in chunks of Globals.blocksize, and the following
blocks can identify themselves as continuations.
"""
def __init__(self, iter):
"""Initialize with iter"""
self.iter = iter
self.bufferlist = []
self.bufferlen = 0L
self.currently_in_file = None
self.closed = None
def read(self, length):
"""Return next length bytes in file"""
assert not self.closed
while self.bufferlen < length:
if not self.addtobuffer(): break
combined_buffer = "".join(self.bufferlist)
assert len(combined_buffer) == self.bufferlen
real_len = min(self.bufferlen, length)
self.bufferlen = self.bufferlen - real_len
self.bufferlist = [combined_buffer[real_len:]]
return combined_buffer[:real_len]
def addtobuffer(self):
"""Updates self.bufferlist and self.bufferlen, adding on a chunk
Returns None if we have reached the end of the iterator,
otherwise return true.
"""
if self.currently_in_file:
buf = "c" + self.addfromfile()
else:
try: currentobj = self.iter.next()
except StopIteration: return None
if hasattr(currentobj, "read") and hasattr(currentobj, "close"):
self.currently_in_file = currentobj
buf = "f" + self.addfromfile()
else:
pickle = cPickle.dumps(currentobj, 1)
buf = "o" + self._l2s(len(pickle)) + pickle
self.bufferlist.append(buf)
self.bufferlen = self.bufferlen + len(buf)
return 1
def addfromfile(self):
"""Read a chunk from the current file and return it"""
buf = self.currently_in_file.read(Globals.blocksize)
if not buf:
assert not self.currently_in_file.close()
self.currently_in_file = None
return self._l2s(len(buf)) + buf
def _l2s(self, l):
"""Convert long int to string of 7 characters"""
s = ""
for i in range(7):
l, remainder = divmod(l, 256)
s = chr(remainder) + s
assert remainder == 0
return s
def close(self): self.closed = 1
class BufferedRead:
"""Buffer the .read() calls to the given file
This is used to lessen overhead and latency when a file is sent
over a connection.
"""
def __init__(self, file):
self.file = file
self.buffer = ""
self.bufsize = Globals.conn_bufsize
def read(self, l = -1):
if l < 0: # Read as much as possible
result = self.buffer + self.file.read()
self.buffer = ""
return result
if len(self.buffer) < l: # Try to make buffer as long as l
self.buffer += self.file.read(max(self.bufsize,
l - len(self.buffer)))
actual_size = min(l, len(self.buffer))
result = self.buffer[:actual_size]
self.buffer = self.buffer[actual_size:]
return result
def close(self): return self.file.close()
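
# Round-trip sketch: any iterator of picklable objects can be turned
# into a file-like object and recovered again (assumes Globals.blocksize
# is set, as addfromfile() requires).  Each object travels as an 8-byte
# header -- a type character plus a 7-byte big-endian length from
# _l2s() -- followed by its pickle:
#
#     fileobj = FileWrappingIter(iter([1, "two", {3: 4}]))
#     for elem in IterWrappingFile(fileobj):
#         print elem        # prints 1, two, {3: 4} in order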
import time, sys
execfile("lazy.py")

#######################################################################
#
# log - Manage logging
#

class LoggerError(Exception): pass

class Logger:
    """All functions which deal with logging"""
    def __init__(self):
        self.log_file_open = None
        self.log_file_local = None
        self.verbosity = self.term_verbosity = 3
        # termverbset is true if the term_verbosity has been explicitly set
        self.termverbset = None

    def setverbosity(self, verbosity_string):
        """Set verbosity levels.  Takes a number string"""
        try: self.verbosity = int(verbosity_string)
        except ValueError:
            Log.FatalError("Verbosity must be a number, received '%s' "
                           "instead." % verbosity_string)
        if not self.termverbset: self.term_verbosity = self.verbosity

    def setterm_verbosity(self, termverb_string):
        """Set verbosity to terminal.  Takes a number string"""
        try: self.term_verbosity = int(termverb_string)
        except ValueError:
            Log.FatalError("Terminal verbosity must be a number, received "
                           "'%s' instead." % termverb_string)
        self.termverbset = 1

    def open_logfile(self, rpath):
        """Inform all connections of an open logfile.

        rpath.conn will write to the file, and the others will pass
        write commands off to it.

        """
        for conn in Globals.connections:
            conn.Log.open_logfile_allconn(rpath.conn)
        rpath.conn.Log.open_logfile_local(rpath)

    def open_logfile_allconn(self, log_file_conn):
        """Run on all connections to signal log file is open"""
        self.log_file_open = 1
        self.log_file_conn = log_file_conn

    def open_logfile_local(self, rpath):
        """Open logfile locally - should only be run on one connection"""
        assert self.log_file_conn is Globals.local_connection
        self.log_file_local = 1
        self.logrp = rpath
        self.logfp = rpath.open("a")

    def close_logfile(self):
        """Close logfile and inform all connections"""
        if self.log_file_open:
            for conn in Globals.connections:
                conn.Log.close_logfile_allconn()
            self.log_file_conn.Log.close_logfile_local()

    def close_logfile_allconn(self):
        """Run on every connection"""
        self.log_file_open = None

    def close_logfile_local(self):
        """Run by logging connection - close logfile"""
        assert self.log_file_conn is Globals.local_connection
        assert not self.logfp.close()

    def format(self, message, verbosity):
        """Format the message, possibly adding date information"""
        if verbosity < 9: return message + "\n"
        else: return "%s %s\n" % (time.asctime(time.localtime(time.time())),
                                  message)

    def __call__(self, message, verbosity):
        """Log message that has verbosity importance"""
        if verbosity <= self.verbosity: self.log_to_file(message)
        if verbosity <= self.term_verbosity:
            self.log_to_term(message, verbosity)

    def log_to_file(self, message):
        """Write the message to the log file, if possible"""
        if self.log_file_open:
            if self.log_file_local:
                self.logfp.write(self.format(message, self.verbosity))
            else: self.log_file_conn.Log.log_to_file(message)

    def log_to_term(self, message, verbosity):
        """Write message to stdout/stderr"""
        if verbosity <= 2 or Globals.server: termfp = sys.stderr
        else: termfp = sys.stdout
        termfp.write(self.format(message, self.term_verbosity))

    def conn(self, direction, result, req_num):
        """Log some data on the connection

        The main worry with this function is that something in here
        will create more network traffic, which will spiral to
        infinite regress.  So, for instance, logging must only be done
        to the terminal, because otherwise the log file may be remote.

        """
        if self.term_verbosity < 9: return
        if type(result) is types.StringType: result_repr = repr(result)
        else: result_repr = str(result)
        if Globals.server: conn_str = "Server"
        else: conn_str = "Client"
        self.log_to_term("%s %s (%d): %s" %
                         (conn_str, direction, req_num, result_repr), 9)

    def FatalError(self, message):
        self("Fatal Error: " + message, 1)
        Globals.Main.cleanup()
        sys.exit(1)

    def exception(self, only_terminal = 0):
        """Log an exception and traceback at verbosity 2

        If only_terminal is 0, log normally.  If it is 1, then only
        log to disk if log file is local (self.log_file_open = 1).  If
        it is 2, don't log to disk at all.

        """
        assert only_terminal in (0, 1, 2)
        if (only_terminal == 0 or
            (only_terminal == 1 and self.log_file_open)):
            logging_func = self.__call__
        else: logging_func = self.log_to_term

        exc_info = sys.exc_info()
        logging_func("Exception %s raised of class %s" %
                     (exc_info[1], exc_info[0]), 2)
        logging_func("".join(traceback.format_tb(exc_info[2])), 2)

Log = Logger()
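
# Usage sketch: the module-level Log instance is callable; a message is
# written only when its importance is within the current verbosity:
#
#     Log.setverbosity("5")
#     Log("Incrementing mirror file /foo", 5)    # shown at level 5
#     Log("Object moving across connection", 9)  # suppressed at level 5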
execfile("restore.py")
#######################################################################
#
# manage - list, delete, and otherwise manage increments
#
class ManageException(Exception): pass
class Manage:
def get_incobjs(datadir):
"""Return Increments objects given the rdiff-backup data directory"""
return map(IncObj, Manage.find_incrps_with_base(datadir, "increments"))
def find_incrps_with_base(dir_rp, basename):
"""Return list of incfiles with given basename in dir_rp"""
rps = map(dir_rp.append, dir_rp.listdir())
incrps = filter(RPath.isincfile, rps)
result = filter(lambda rp: rp.getincbase_str() == basename, incrps)
Log("find_incrps_with_base: found %d incs" % len(result), 6)
return result
def describe_root_incs(datadir):
"""Return a string describing all the the root increments"""
result = []
currentrps = Manage.find_incrps_with_base(datadir, "current_mirror")
if not currentrps:
Log("Warning: no current mirror marker found", 1)
elif len(currentrps) > 1:
Log("Warning: multiple mirror markers found", 1)
for rp in currentrps:
result.append("Found mirror marker %s" % rp.path)
result.append("Indicating latest mirror taken at %s" %
Time.stringtopretty(rp.getinctime()))
result.append("---------------------------------------------"
"-------------")
# Sort so they are in reverse order by time
time_w_incobjs = map(lambda io: (-io.time, io),
Manage.get_incobjs(datadir))
time_w_incobjs.sort()
incobjs = map(lambda x: x[1], time_w_incobjs)
result.append("Found %d increments:" % len(incobjs))
result.append("\n------------------------------------------\n".join(
map(IncObj.full_description, incobjs)))
return "\n".join(result)
def delete_earlier_than(baserp, time):
"""Deleting increments older than time in directory baserp
time is in seconds. It will then delete any empty directories
in the tree. To process the entire backup area, the
rdiff-backup-data directory should be the root of the tree.
"""
def yield_files(rp):
yield rp
if rp.isdir():
for filename in rp.listdir():
for sub_rp in yield_files(rp.append(filename)):
yield sub_rp
for rp in yield_files(baserp):
if ((rp.isincfile() and
Time.stringtotime(rp.getinctime()) < time) or
(rp.isdir() and not rp.listdir())):
Log("Deleting increment file %s" % rp.path, 5)
rp.delete()
MakeStatic(Manage)
class IncObj:
"""Increment object - represent a completed increment"""
def __init__(self, incrp):
"""IncObj initializer
incrp is an RPath of a path like increments.TIMESTR.dir
standing for the root of the increment.
"""
if not incrp.isincfile():
raise ManageException("%s is not an inc file" % incrp.path)
self.incrp = incrp
self.time = Time.stringtotime(incrp.getinctime())
def getbaserp(self):
"""Return rp of the incrp without extensions"""
return self.incrp.getincbase()
def pretty_time(self):
"""Return a formatted version of inc's time"""
return Time.timetopretty(self.time)
def full_description(self):
"""Return string describing increment"""
s = ["Increment file %s" % self.incrp.path,
"Date: %s" % self.pretty_time()]
return "\n".join(s)
from __future__ import generators
execfile("increment.py")
import tempfile

#######################################################################
#
# restore - Read increment files and restore to original
#

class RestoreError(Exception): pass

class Restore:
    def RestoreFile(rest_time, rpbase, inclist, rptarget):
        """Non-recursive restore function

        rest_time is the time in seconds to restore to,
        rpbase is the base name of the file being restored,
        inclist is a list of rpaths containing all the relevant increments,
        and rptarget is the rpath that will be written with the restored file.

        """
        inclist = Restore.sortincseq(rest_time, inclist)
        if not inclist and not (rpbase and rpbase.lstat()):
            return # no increments were applicable
        Log("Restoring %s with increments %s to %s" %
            (rpbase and rpbase.path,
             Restore.inclist2str(inclist), rptarget.path), 5)

        if not inclist or inclist[0].getinctype() == "diff":
            assert rpbase and rpbase.lstat(), \
                   "No base to go with incs %s" % Restore.inclist2str(inclist)
            RPath.copy_with_attribs(rpbase, rptarget)
        for inc in inclist: Restore.applyinc(inc, rptarget)

    def inclist2str(inclist):
        """Return string version of inclist for logging"""
        return ",".join(map(lambda x: x.path, inclist))

    def sortincseq(rest_time, inclist):
        """Sort the inc sequence, and throw away irrelevant increments"""
        incpairs = map(lambda rp: (Time.stringtotime(rp.getinctime()), rp),
                       inclist)
        # Only consider increments at or after the time being restored
        incpairs = filter(lambda pair: pair[0] >= rest_time, incpairs)

        # Now throw away older unnecessary increments
        incpairs.sort()
        i = 0
        while(i < len(incpairs)):
            # Only diff type increments require later versions
            if incpairs[i][1].getinctype() != "diff": break
            i = i+1
        incpairs = incpairs[:i+1]

        # Return increments in reversed order
        incpairs.reverse()
        return map(lambda pair: pair[1], incpairs)

    def applyinc(inc, target):
        """Apply increment rp inc to targetrp target"""
        Log("Applying increment %s to %s" % (inc.path, target.path), 6)
        inctype = inc.getinctype()
        if inctype == "diff":
            if not target.lstat():
                raise RestoreError("Bad increment sequence at " + inc.path)
            Rdiff.patch_action(target, inc).execute()
        elif inctype == "dir":
            if not target.isdir():
                if target.lstat():
                    raise RestoreError("File %s already exists" % target.path)
                target.mkdir()
        elif inctype == "missing": return
        elif inctype == "snapshot": RPath.copy(inc, target)
        else: raise RestoreError("Unknown inctype %s" % inctype)
        RPath.copy_attribs(inc, target)

    def RestoreRecursive(rest_time, mirror_base, baseinc_tup, target_base):
        """Recursive restore function.

        rest_time is the time in seconds to restore to;

        mirror_base is an rpath of the mirror directory corresponding
        to the one to be restored;

        baseinc_tup is the inc tuple (incdir, list of incs) to be
        restored;

        and target_base is the dsrp of the target directory.

        """
        assert isinstance(target_base, DSRPath)
        collated = RORPIter.CollateIterators(
            DestructiveStepping.Iterate_from(mirror_base, None),
            Restore.yield_inc_tuples(baseinc_tup))
        mirror_finalizer = DestructiveStepping.Finalizer()
        target_finalizer = DestructiveStepping.Finalizer()

        for mirror, inc_tup in collated:
            if not inc_tup:
                inclist = []
                target = target_base.new_index(mirror.index)
            else:
                inclist = inc_tup[1]
                target = target_base.new_index(inc_tup.index)
            DestructiveStepping.initialize(target, None)
            Restore.RestoreFile(rest_time, mirror, inclist, target)
            target_finalizer(target)
            if mirror: mirror_finalizer(mirror)
        target_finalizer.getresult()
        mirror_finalizer.getresult()

    def yield_inc_tuples(inc_tuple):
        """Iterate increment tuples starting with inc_tuple

        An increment tuple is an IndexedTuple (pair).  The first will
        be the rpath of a directory, and the second is a list of all
        the increments associated with that directory.  If there are
        increments that do not correspond to a directory, the first
        element will be None.  All the rpaths involved correspond to
        files in the increment directory.

        """
        oldindex, rpath = inc_tuple.index, inc_tuple[0]
        yield inc_tuple
        if not rpath or not rpath.isdir(): return

        inc_list_dict = {} # Index tuple lists by index
        dirlist = rpath.listdir()

        def affirm_dict_indexed(index):
            """Make sure the inc_list_dict has given index"""
            if not inc_list_dict.has_key(index):
                inc_list_dict[index] = [None, []]

        def add_to_dict(filename):
            """Add filename to the inc tuple dictionary"""
            rp = rpath.append(filename)
            if rp.isincfile():
                basename = rp.getincbase_str()
                affirm_dict_indexed(basename)
                inc_list_dict[basename][1].append(rp)
            elif rp.isdir():
                affirm_dict_indexed(filename)
                inc_list_dict[filename][0] = rp

        def list2tuple(index):
            """Return inc_tuple version of dictionary entry by index"""
            inclist = inc_list_dict[index]
            if not inclist[1]: return None # no increments, so ignore
            return IndexedTuple(oldindex + (index,), inclist)

        for filename in dirlist: add_to_dict(filename)
        keys = inc_list_dict.keys()
        keys.sort()
        for index in keys:
            new_inc_tuple = list2tuple(index)
            if not new_inc_tuple: continue
            elif new_inc_tuple[0]: # corresponds to directory
                for i in Restore.yield_inc_tuples(new_inc_tuple): yield i
            else: yield new_inc_tuple

MakeStatic(Restore)
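
# Worked example for sortincseq() (hypothetical times t1 < t2 < t3, all
# at or after the restore time): given increments
# [f.t1.diff, f.t2.snapshot, f.t3.diff], the scan keeps the diffs up to
# and including the first non-diff, yielding [f.t1.diff, f.t2.snapshot],
# then reverses the list.  applyinc() thus recreates the file from the
# t2 snapshot first, then applies the reverse diff to reach its state
# as of t1, the oldest increment at or after the restore time.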
from __future__ import generators
import marshal, sha, types
execfile("iterfile.py")

#######################################################################
#
# rlist - Define the CachingIter, and sig/diff/patch ops on iterators
#

class CachingIter:
    """Cache parts of an iter using a list

    Turn an iter into something that you can prepend elements into,
    and also read from without apparently changing the state.

    """
    def __init__(self, iter_or_list):
        if type(iter_or_list) is types.ListType:
            self.iter = iter(iter_or_list)
        else: self.iter = iter_or_list
        self.next = self.iter.next
        self.head = []

    def __iter__(self): return self

    def _next(self):
        """Take elements from the head list

        When there are elements waiting before the main iterator, this
        is the next function.  If not, iter.next returns to being next.

        """
        head = self.head
        a = head[0]
        del head[0]
        if not head: self.next = self.iter.next
        return a

    def nextrange(self, m):
        """Return next m elements in list"""
        l = self.head[:m]
        del self.head[:m]
        for i in xrange(m - len(l)): l.append(self.iter.next())
        return l

    def peek(self):
        """Return next element without removing it from iterator"""
        n = self.next()
        self.push(n)
        return n

    def push(self, elem):
        """Insert an element into the iterator at the beginning"""
        if not self.head: self.next = self._next
        self.head.insert(0, elem)

    def pushrange(self, elem_list):
        """Insert list of multiple elements at the beginning"""
        if not self.head: self.next = self._next
        self.head[:0] = elem_list

    def cache(self, m):
        """Move next m elements from iter to internal list

        If m is None, append the entire rest of the iterator.

        """
        h, it = self.head, self.iter
        if m is None:
            for i in it: h.append(i)
        else:
            for i in xrange(m): h.append(it.next())

    def __getitem__(self, key):
        """Support a[i:j] style notation.  Non destructive"""
        if type(key) is types.SliceType:
            if key.stop > len(self.head): self.cache(key.stop - len(self.head))
            return self.head[key.start:key.stop]
        else:
            if key >= len(self.head): self.cache(key + 1 - len(self.head))
            return self.head[key]


class RListDelta:
    """Note a difference from one iterator (A) to another (B)

    The min, max pairs are indices which stand for the half-open
    interval (min, max], and elemlist is a list of all the elements in
    A which fall within this interval.

    These are produced by the function RList.Deltas(...)

    """
    def __init__(self, (min, max), elemlist):
        self.min, self.max = min, max
        self.elemlist = elemlist


class RList:
    """Tools for signatures, diffing, and patching an iterator

    This class requires that the iterators involved are yielding
    objects that have .index and .data attributes.  Two objects with
    the same .data attribute are supposed to be equivalent.  The
    iterator must also yield the objects in increasing order with
    respect to the .index attribute.

    """
    blocksize = 100

    def Signatures(iter):
        """Return iterator of signatures from stream of pairs

        Each signature is an ordered pair (last index sig applies to,
        SHA digest of data)

        """
        i, s = 0, sha.new()
        for iter_elem in iter:
            s.update(marshal.dumps(iter_elem.data))
            i = i+1
            if i == RList.blocksize:
                yield (iter_elem.index, s.digest())
                i, s = 0, sha.new()
        if i != 0: yield (iter_elem.index, s.digest())

    def sig_one_block(iter_or_list):
        """Return the digest portion of a signature on given list"""
        s = sha.new()
        for iter_elem in iter_or_list: s.update(marshal.dumps(iter_elem.data))
        return s.digest()

    def Deltas(remote_sigs, iter):
        """Return iterator of Delta objects that bring iter to remote"""
        def get_before(index, iter):
            """Return elements in iter whose index is before or equal index

            iter needs to be pushable

            """
            l = []
            while 1:
                try: iter_elem = iter.next()
                except StopIteration: return l
                if iter_elem.index > index: break
                l.append(iter_elem)
            iter.push(iter_elem)
            return l

        if not isinstance(iter, CachingIter): iter = CachingIter(iter)
        oldindex = None
        for (rs_index, rs_digest) in remote_sigs:
            l = get_before(rs_index, iter)
            if rs_digest != RList.sig_one_block(l):
                yield RListDelta((oldindex, rs_index), l)
            oldindex = rs_index

    def patch_once(basis, delta):
        """Apply one delta to basis to return original iterator

        This returns original iterator up to and including the max range
        of delta, then stops.  basis should be pushable.

        """
        # Return elements of basis until start of delta range
        for basis_elem in basis:
            if basis_elem.index > delta.min:
                basis.push(basis_elem)
                break
            yield basis_elem

        # Yield elements of delta...
        for elem in delta.elemlist: yield elem

        # Finally, discard basis until end of delta range
        for basis_elem in basis:
            if basis_elem.index > delta.max:
                basis.push(basis_elem)
                break

    def Patch(basis, deltas):
        """Apply a delta stream to basis iterator, yielding original"""
        if not isinstance(basis, CachingIter): basis = CachingIter(basis)
        for d in deltas:
            for elem in RList.patch_once(basis, d): yield elem
        for elem in basis: yield elem

    def get_difference_once(basis, delta):
        """From one delta, find differences from basis

        Will return pairs (basis_elem, new_elem) where basis_elem is
        the element from the basis iterator and new_elem is the
        element from the other iterator.  If either is missing, None
        will take its place.  Both are present iff the two have the
        same index.

        """
        # Discard any elements of basis before delta starts
        for basis_elem in basis:
            if basis_elem.index > delta.min:
                basis.push(basis_elem)
                break

        # In range, compare each one by one
        di, boverflow, doverflow = 0, None, None
        while 1:
            # Set indices and data, or mark if at end of range already
            try:
                basis_elem = basis.next()
                if basis_elem.index > delta.max:
                    basis.push(basis_elem)
                    boverflow = 1
            except StopIteration: boverflow = 1
            if di >= len(delta.elemlist): doverflow = 1
            else: delta_elem = delta.elemlist[di]

            if boverflow and doverflow: break
            elif boverflow:
                yield (None, delta_elem)
                di = di+1
            elif doverflow: yield (basis_elem, None)

            # Now can assume that everything is in range
            elif basis_elem.index > delta_elem.index:
                yield (None, delta_elem)
                basis.push(basis_elem)
                di = di+1
            elif basis_elem.index == delta_elem.index:
                if basis_elem.data != delta_elem.data:
                    yield (basis_elem, delta_elem)
                di = di+1
            else: yield (basis_elem, None)

    def Dissimilar(basis, deltas):
        """Return iter of differences from delta iter and basis iter"""
        if not isinstance(basis, CachingIter): basis = CachingIter(basis)
        for d in deltas:
            for triple in RList.get_difference_once(basis, d): yield triple

MakeStatic(RList)
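
# Sketch of the signature/delta/patch cycle (the Elem class is
# hypothetical; real callers yield RORPath-like objects with .index
# and .data attributes, in increasing .index order):
#
#     class Elem:
#         def __init__(self, index, data):
#             self.index, self.data = index, data
#
#     old = map(lambda i: Elem(i, "old%d" % i), range(300))
#     new = map(lambda i: Elem(i, "new%d" % i), range(300))
#     sigs = RList.Signatures(iter(old))      # one digest per 100 elems
#     deltas = RList.Deltas(sigs, iter(new))  # blocks where new differs
#     for elem in RList.Patch(iter(old), deltas):
#         pass                                # yields elements matching new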
execfile("robust.py")
from __future__ import generators
import tempfile
#######################################################################
#
# rorpiter - Operations on Iterators of Read Only Remote Paths
#
class RORPIterException(Exception): pass
class RORPIter:
"""Functions relating to iterators of Read Only RPaths
The main structure will be an iterator that yields RORPaths.
Every RORPath has a "raw" form that makes it more amenable to
being turned into a file. The raw form of the iterator yields
each RORPath in the form of the tuple (index, data_dictionary,
files), where files is the number of files attached (usually 1 or
0). After that, if a file is attached, it yields that file.
"""
def ToRaw(rorp_iter):
"""Convert a rorp iterator to raw form"""
for rorp in rorp_iter:
if rorp.file:
yield (rorp.index, rorp.data, 1)
yield rorp.file
else: yield (rorp.index, rorp.data, 0)
def FromRaw(raw_iter):
"""Convert raw rorp iter back to standard form"""
for index, data, num_files in raw_iter:
rorp = RORPath(index, data)
if num_files:
assert num_files == 1, "Only one file accepted right now"
rorp.setfile(RORPIter.getnext(raw_iter))
yield rorp
def ToFile(rorp_iter):
"""Return file version of iterator"""
return FileWrappingIter(RORPIter.ToRaw(rorp_iter))
def FromFile(fileobj):
"""Recover rorp iterator from file interface"""
return RORPIter.FromRaw(IterWrappingFile(fileobj))
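# Editor's illustration (hypothetical values, not part of the
# original checkin): a rorp iterator holding a directory and one
# regular file serializes via ToRaw/ToFile into the flat stream
#
#     ((), {'type': 'dir', ...}, 0)
#     (('foo',), {'type': 'reg', ...}, 1)
#     <file object carrying foo's contents>
#
# FromRaw/FromFile reverse this, reattaching each file object to
# its RORPath before yielding it.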
def IterateRPaths(base_rp):
"""Return an iterator yielding RPaths with given base rp"""
yield base_rp
if base_rp.isdir():
dirlisting = base_rp.listdir()
dirlisting.sort()
for filename in dirlisting:
for rp in RORPIter.IterateRPaths(base_rp.append(filename)):
yield rp
def Signatures(rp_iter):
"""Yield signatures of rpaths in given rp_iter"""
for rp in rp_iter:
if rp.isplaceholder(): yield rp
else:
rorp = rp.getRORPath()
if rp.isreg(): rorp.setfile(Rdiff.get_signature(rp))
yield rorp
def GetSignatureIter(base_rp):
"""Return a signature iterator recurring over the base_rp"""
return RORPIter.Signatures(RORPIter.IterateRPaths(base_rp))
def CollateIterators(*rorp_iters):
"""Collate RORPath iterators by index
So it takes two or more iterators of rorps and returns an
iterator yielding tuples like (rorp1, rorp2) with the same
index. If one or the other lacks that index, it will be None
"""
# overflow[i] means that iter[i] has been exhausted
# rorps[i] is None means that it is time to replenish it.
iter_num = len(rorp_iters)
if iter_num == 2:
return RORPIter.Collate2Iters(rorp_iters[0], rorp_iters[1])
overflow = [None] * iter_num
rorps = overflow[:]
def setrorps(overflow, rorps):
"""Set the overflow and rorps list"""
for i in range(iter_num):
if not overflow[i] and rorps[i] is None:
try: rorps[i] = rorp_iters[i].next()
except StopIteration:
overflow[i] = 1
rorps[i] = None
def getleastindex(rorps):
"""Return the first index in rorps, assuming rorps isn't empty"""
return min(map(lambda rorp: rorp.index,
filter(lambda x: x, rorps)))
def yield_tuples(iter_num, overflow, rorps):
while 1:
setrorps(overflow, rorps)
if not None in overflow: break
index = getleastindex(rorps)
yieldval = []
for i in range(iter_num):
if rorps[i] and rorps[i].index == index:
yieldval.append(rorps[i])
rorps[i] = None
else: yieldval.append(None)
yield IndexedTuple(index, yieldval)
return yield_tuples(iter_num, overflow, rorps)
def Collate2Iters(riter1, riter2):
"""Special case of CollateIterators with 2 arguments
This does the same thing but is faster because it doesn't have
to consider the >2 iterator case. Profiler says speed is
important here.
"""
relem1, relem2 = None, None
while 1:
if not relem1:
try: relem1 = riter1.next()
except StopIteration:
if relem2: yield IndexedTuple(index2, (None, relem2))
for relem2 in riter2:
yield IndexedTuple(relem2.index, (None, relem2))
break
index1 = relem1.index
if not relem2:
try: relem2 = riter2.next()
except StopIteration:
if relem1: yield IndexedTuple(index1, (relem1, None))
for relem1 in riter1:
yield IndexedTuple(relem1.index, (relem1, None))
break
index2 = relem2.index
if index1 < index2:
yield IndexedTuple(index1, (relem1, None))
relem1 = None
elif index1 == index2:
yield IndexedTuple(index1, (relem1, relem2))
relem1, relem2 = None, None
else: # index2 is less
yield IndexedTuple(index2, (None, relem2))
relem2 = None
def getnext(iter):
"""Return the next element of an iterator, raising error if none"""
try: next = iter.next()
except StopIteration: raise RORPIterException("Unexpected end to iter")
return next
def GetDiffIter(sig_iter, new_iter):
"""Return delta iterator from sig_iter to new_iter
The accompanying file for each will be a delta as produced by
rdiff, unless the destination file does not exist, in which
case it will be the file in its entirety.
sig_iter may be composed of rorps, but new_iter should have
full RPaths.
"""
collated_iter = RORPIter.CollateIterators(sig_iter, new_iter)
for rorp, rp in collated_iter: yield RORPIter.diffonce(rorp, rp)
def diffonce(sig_rorp, new_rp):
"""Return one diff rorp, based from signature rorp and orig rp"""
if sig_rorp and sig_rorp.isreg() and new_rp and new_rp.isreg():
diff_rorp = new_rp.getRORPath()
diff_rorp.setfile(Rdiff.get_delta_sigfileobj(sig_rorp.open("rb"),
new_rp))
diff_rorp.set_attached_filetype('diff')
return diff_rorp
else:
# Just send over the original if a diff isn't appropriate
if sig_rorp: sig_rorp.close_if_necessary()
if not new_rp: return RORPath(sig_rorp.index)
elif new_rp.isreg():
diff_rorp = new_rp.getRORPath(1)
diff_rorp.set_attached_filetype('snapshot')
return diff_rorp
else: return new_rp.getRORPath()
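# Editor's note (added commentary, not part of the original
# checkin): diffonce picks one of three shapes for each file.
# Both sides regular files -> an rdiff delta, tagged 'diff';
# only the new side a regular file -> the whole file, tagged
# 'snapshot'; new side missing -> a bare RORPath(index), which
# patchonce_action below treats as a deletion. Anything else
# (directories, symlinks, ...) travels as a plain RORPath.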
def PatchIter(base_rp, diff_iter):
"""Patch the appropriate rps in basis_iter using diff_iter"""
basis_iter = RORPIter.IterateRPaths(base_rp)
collated_iter = RORPIter.CollateIterators(basis_iter, diff_iter)
for basisrp, diff_rorp in collated_iter:
RORPIter.patchonce_action(base_rp, basisrp, diff_rorp).execute()
def patchonce_action(base_rp, basisrp, diff_rorp):
"""Return action patching basisrp using diff_rorp"""
assert diff_rorp, "Missing diff index %s" % basisrp.index
if not diff_rorp.lstat():
return RobustAction(lambda: None, basisrp.delete, lambda e: None)
if basisrp and basisrp.isreg() and diff_rorp.isreg():
assert diff_rorp.get_attached_filetype() == 'diff'
return Rdiff.patch_with_attribs_action(basisrp, diff_rorp)
else: # Diff contains whole file, just copy it over
if not basisrp: basisrp = base_rp.new_index(diff_rorp.index)
return Robust.copy_with_attribs_action(diff_rorp, basisrp)
MakeStatic(RORPIter)
class IndexedTuple:
"""Like a tuple, but has .index
This is used by CollateIterators above, and can be passed to the
IterTreeReducer.
"""
def __init__(self, index, sequence):
self.index = index
self.data = tuple(sequence)
def __len__(self): return len(self.data)
def __getitem__(self, key):
"""This only works for numerical keys (faster that way)"""
return self.data[key]
def __cmp__(self, other):
assert isinstance(other, IndexedTuple)
if self.index < other.index: return -1
elif self.index == other.index: return 0
else: return 1
def __eq__(self, other):
if isinstance(other, IndexedTuple):
return self.index == other.index and self.data == other.data
elif type(other) is types.TupleType:
return self.data == other
else: return None
def __str__(self):
assert len(self.data) == 2
return "(%s, %s).%s" % (str(self.data[0]), str(self.data[1]),
str(self.index))
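# Editor's demo (hypothetical helper, not part of the original
# checkin, and never called by rdiff-backup itself): exercises
# Collate2Iters with stand-ins providing only the .index attribute
# it requires.
def _collate_demo():
    class IndexStub:
        """Minimal indexed element for demonstration purposes"""
        def __init__(self, index): self.index = index
        def __repr__(self): return "Stub%s" % (self.index,)
    left = iter([IndexStub((1,)), IndexStub((2,))])
    right = iter([IndexStub((2,)), IndexStub((3,))])
    for itup in RORPIter.Collate2Iters(left, right):
        # Expect indices (1,), (2,) and (3,); unmatched sides are None
        print itup.index, itup[0], itup[1]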
execfile("globals.py")
#######################################################################
#
# static - MakeStatic and MakeClass
#
# These functions are used to make all the instance methods in a class
# into static or class methods.
#
class StaticMethodsError(Exception):
pass
def MakeStatic(cls):
"""turn instance methods into static ones
The methods (that don't begin with _) of any class that
subclasses this will be turned into static methods.
"""
for name in dir(cls):
if name[0] != "_":
cls.__dict__[name] = staticmethod(cls.__dict__[name])
def MakeClass(cls):
"""Turn instance methods into classmethods. Ignore _ like above"""
for name in dir(cls):
if name[0] != "_":
cls.__dict__[name] = classmethod(cls.__dict__[name])
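# Editor's illustration (hypothetical example, not part of the
# original checkin): after MakeStatic, methods written without a
# self argument become callable directly on the class.
#
#     class Doubler:
#         def double(x): return 2 * x   # note: no self parameter
#     MakeStatic(Doubler)
#     assert Doubler.double(3) == 6     # no instance required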
#!/usr/bin/env python
"""Read component files of rdiff-backup, and glue them together after
removing unnecessary bits."""
import os
def mystrip(filename):
"""Open filename, read input, strip appropriately, and return contents"""
fp = open(filename, "r")
lines = fp.readlines()
fp.close()
i = 0
while(lines[i][:60] !=
"############################################################"):
i = i+1
return "".join(lines[i:]).strip() + "\n\n\n"
files = ["globals.py", "static.py", "lazy.py", "log.py", "ttime.py",
"iterfile.py", "rlist.py", "rdiff.py", "connection.py",
"rpath.py", "robust.py", "rorpiter.py",
"destructive_stepping.py", "increment.py", "restore.py",
"manage.py", "filelist.py", "highlevel.py",
"setconnections.py", "main.py"]
os.system("cp header.py rdiff-backup")
outfp = open("rdiff-backup", "a")
for file in files:
outfp.write(mystrip(file))
outfp.close()
os.system("chmod 755 rdiff-backup")
from __future__ import generators
execfile("manage.py")
#######################################################################
#
# filelist - Some routines that help with operations over files listed
# on standard input instead of over whole directories.
#
class FilelistError(Exception): pass
class Filelist:
"""Many of these methods have analogs in highlevel.py"""
def File2Iter(fp, baserp):
"""Convert file obj with one pathname per line into rpiter
Closes fp when done. Given files are added to baserp.
"""
while 1:
line = fp.readline()
if not line: break
if line[-1] == "\n": line = line[:-1] # strip trailing newline
if not line: continue # skip blank lines
elif line[0] == "/": raise FilelistError(
"Read in absolute file name %s." % line)
yield baserp.append(line)
assert not fp.close(), "Error closing filelist fp"
def Mirror(src_rpath, dest_rpath, rpiter):
"""Copy files in fileiter from src_rpath to dest_rpath"""
sigiter = dest_rpath.conn.Filelist.get_sigs(dest_rpath, rpiter)
diffiter = Filelist.get_diffs(src_rpath, sigiter)
dest_rpath.conn.Filelist.patch(dest_rpath, diffiter)
dest_rpath.setdata()
def Mirror_and_increment(src_rpath, dest_rpath, inc_rpath, rpiter):
"""Mirror + put increment in tree based at inc_rpath"""
sigiter = dest_rpath.conn.Filelist.get_sigs(dest_rpath, rpiter)
diffiter = Filelist.get_diffs(src_rpath, sigiter)
dest_rpath.conn.Filelist.patch_and_increment(dest_rpath, diffiter,
inc_rpath)
dest_rpath.setdata()
def get_sigs(dest_rpbase, rpiter):
"""Get signatures of file analogs in rpiter
This is meant to be run on the destination side. Only the
extension part of the rps in rpiter will be used; the base is
ignored.
"""
def dest_iter(src_iter):
for src_rp in src_iter: yield dest_rpbase.new_index(src_rp.index)
return RORPIter.Signatures(dest_iter())
def get_diffs(src_rpbase, sigiter):
"""Get diffs based on sigiter and files in src_rpbase
This should be run on the local side.
"""
for sig_rorp in sigiter:
new_rp = src_rpbase.new_index(sig_rorp.index)
yield RORPIter.diffonce(sig_rorp, new_rp)
def patch(dest_rpbase, diffiter):
"""Process diffs in diffiter and update files in dest_rbpase.
Run remotely.
"""
for diff_rorp in diffiter:
basisrp = dest_rpbase.new_index(diff_rorp.index)
if not basisrp.lstat(): Filelist.make_subdirs(basisrp)
Log("Processing %s" % basisrp.path, 7)
RORPIter.patchonce_action(dest_rpbase, basisrp, diff_rorp).execute()
def patch_and_increment(dest_rpbase, diffiter, inc_rpbase):
"""Apply diffs in diffiter to dest_rpbase, and increment to inc_rpbase
Also to be run remotely.
"""
for diff_rorp in diffiter:
basisrp = dest_rpbase.new_index(diff_rorp.index)
if diff_rorp.lstat(): Filelist.make_subdirs(basisrp)
Log("Processing %s" % basisrp.path, 7)
# XXX This isn't done yet...
def make_subdirs(rpath):
"""Make sure that all the directories under the rpath exist
This function doesn't try to get the permissions right on the
underlying directories, just do the minimum to make sure the
file can be created.
"""
dirname = rpath.dirsplit()[0]
if dirname == '.' or dirname == '': return
dir_rp = RPath(rpath.conn, dirname)
Filelist.make_subdirs(dir_rp)
if not dir_rp.lstat(): dir_rp.mkdir()
MakeStatic(Filelist)
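# Editor's demo (hypothetical helper, not part of the original
# checkin, and never called by rdiff-backup itself): File2Iter
# reads one relative pathname per line, skipping blanks and
# rejecting absolute paths.
def _filelist_demo():
    import StringIO
    class BaseStub:
        """Minimal stand-in for the baserp argument"""
        def append(self, filename): return filename
    fp = StringIO.StringIO("dir/file1\n\ndir/file2\n")
    # Prints dir/file1 and dir/file2; the blank line is skipped
    for path in Filelist.File2Iter(fp, BaseStub()):
        print path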