% bup-split(1) Bup 0.27
% Avery Pennarun <apenwarr@gmail.com>
% 2015-04-26
# NAME

bup-split - save individual files to bup backup sets
# SYNOPSIS

bup split [-t] [-c] [-n name] COMMON_OPTIONS

bup split -b COMMON_OPTIONS

bup split <--noop [--copy]|--copy> COMMON_OPTIONS

COMMON_OPTIONS
  ~ [-r host:path] [-v] [-q] [-d seconds-since-epoch] [--bench]
    [--max-pack-size=bytes] [-#] [--bwlimit=bytes]
    [--max-pack-objects=n] [--fanout=count]
    [--keep-boundaries] [--git-ids | filenames...]
# DESCRIPTION

`bup split` concatenates the contents of the given files (or if no
filenames are given, reads from stdin), splits the content into chunks
of around 8k using a rolling checksum algorithm, and saves the chunks
into a bup repository.  Chunks which have previously been stored are
not stored again (i.e. they are deduplicated).
Because of the way the rolling checksum works, chunks tend to be very
stable across changes to a given file, including adding, deleting, and
changing bytes.  For example, if you use `bup split` to back up an XML
dump of a database, and the XML file changes slightly from one run to
the next, nearly all the data will still be deduplicated and the size
of each backup after the first will typically be quite small.
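The stability of those boundaries can be sketched with a toy
content-defined chunker.  This is an illustration only: the window
size, boundary rule, and chunk size here are invented for the demo and
are not bup's actual `bupsplit` algorithm.

```python
import hashlib
import random

WINDOW = 16    # bytes of context that determine a boundary (demo value)
MASK = 0x3F    # cut where the window sum has these bits set (~64-byte chunks)

def chunks(data: bytes):
    """Toy content-defined chunker: a chunk ends wherever the sum of the
    last WINDOW bytes matches MASK.  Boundaries depend only on nearby
    content, so they re-synchronize shortly after a local edit."""
    out, start = [], 0
    for i in range(len(data)):
        if sum(data[max(0, i - WINDOW + 1):i + 1]) & MASK == MASK:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])
    return out

random.seed(0)
original = bytes(random.getrandbits(8) for _ in range(4096))
edited = original[:2000] + b"XYZ" + original[2000:]  # a small insertion

stored = {hashlib.sha1(c).hexdigest() for c in chunks(original)}
new = [hashlib.sha1(c).hexdigest() for c in chunks(edited)]
reused = sum(h in stored for h in new)
print(f"{reused} of {len(new)} chunks were already stored")
```

Only the few chunks overlapping the insertion change: everything
before it is byte-identical, and the boundaries after it re-align
within one window, so almost every chunk of the edited file is
deduplicated against the first backup.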
Another technique is to pipe the output of the `tar`(1) or `cpio`(1)
programs to `bup split`.  When individual files in the tarball change
slightly or are added or removed, bup still processes the remainder of
the tarball efficiently.  (Note that `bup save` is usually a more
efficient way to accomplish this, however.)

To get the data back, use `bup-join`(1).
# OPTIONS

These options select the primary behavior of the command, with -n
being the most likely choice.

-n, --name=*name*
:   after creating the dataset, insert it into the backup set with the
    given name, so that you can retrieve it later using commands such
    as `bup join`, `bup fuse`, `bup ftp`, etc.

--copy
:   like --noop, but also write the data to stdout.  This can be
    useful for benchmarking the speed of read+bupsplit+write for large
    amounts of data.  Incompatible with -n, -t, -c, and -b.

-r, --remote=*host*:*path*
:   save the backup set to the given remote server.  The connection is
    made with SSH; if you'd like to specify which port, user or
    private key to use, we recommend you use the `~/.ssh/config`
    file.  Even though the destination is remote, a local bup
    repository is still required.

--git-ids
:   stdin is a list of git object ids instead of raw data.  `bup split`
    will read the contents of each named git object (if it exists in
    the bup repository) and split it.  This might be useful for
    converting a git repository with large binary files to use
    bup-style hashsplitting instead.  This option is probably most
    useful when combined with `--keep-boundaries`.

--keep-boundaries
:   if multiple filenames (or git object ids) are given, they are
    normally concatenated and treated as one large object.  With
    --keep-boundaries, each file is split separately.  You still only
    get a single tree or commit or series of blobs, but each blob
    comes from only one of the files; the end of one of the input
    files always ends a blob.

# EXAMPLES

    $ tar -cf - /etc | bup split -r myserver: -n mybackup-tar
    tar: Removing leading `/' from member names
    Indexing objects: 100% (196/196), done.

    $ bup join -r myserver: mybackup-tar | tar -tf - | wc -l
    1961
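The --keep-boundaries behaviour described above can be sketched with a
toy splitter.  Illustration only: bup's real chunk boundaries come
from its rolling checksum, not from newlines; only the boundary *rule*
is faked here, the per-file splitting behaviour is the same idea.

```python
def chunks(data: bytes):
    """Stand-in splitter for the demo: end a chunk after every 0x0A
    byte (bup's real boundaries come from a rolling checksum)."""
    out, start = [], 0
    for i, byte in enumerate(data):
        if byte == 0x0A:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])
    return out

file_a = b"alpha\nbeta"       # no boundary at end of file
file_b = b"gamma\ndelta\n"

concatenated = chunks(file_a + file_b)      # default: inputs are fused
separate = chunks(file_a) + chunks(file_b)  # like --keep-boundaries

print(concatenated)  # one blob straddles the file boundary
print(separate)      # the end of file_a always ends a blob
```

With the inputs concatenated, the tail of the first file and the head
of the second fuse into a single blob; with per-file splitting, no
blob ever spans two input files.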
# SEE ALSO

`bup-join`(1), `bup-index`(1), `bup-save`(1), `bup-on`(1),
`ssh_config`(5)

# BUP

Part of the `bup`(1) suite.