One of my jobs today required creating a Subversion repository out of a subset of a larger repository. There are many reasons why this might be useful. In my case, it was to prepare a clean repository for conversion to another version control system. The original repository contained useful checkins in one particular directory, and useless checkins elsewhere. Since a quick Googling doesn’t bring up anything useful, I figured I’d document what I did. Most of this information can be found scattered through The Subversion Book, provided you know where to look.
We’ll suppose that ORIG is a directory containing the original repository, and NEW is the smaller repository containing only the directory DESIRED/DIR from the original repository. The goal is to have the repository NEW contain only the portion of the repository originally under (not including) DESIRED/DIR.
Step 1 is to dump the original repository, preserving only the desired commits. This results in a dump file called NEW.svnrepo:
svnadmin dump ORIG | svndumpfilter include DESIRED/DIR --drop-empty-revs --renumber-revs > NEW.svnrepo
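Before going further, it can be worth checking that the filtered dump actually contains the commits you expect. The following is a minimal sketch, not part of the original procedure: count_revisions is a hypothetical helper name, and it assumes a grep that supports the -a option (GNU and BSD grep do). It counts every revision record in the dump, including revision 0 if one is present.

```shell
# Hypothetical helper: count the revision records in a Subversion dump file.
# -a makes grep treat the dump as text even if it contains binary file data.
count_revisions() {
    LC_ALL=C grep -a -c '^Revision-number: ' "$1"
}

# Example use (assumes NEW.svnrepo exists from step 1):
#   count_revisions NEW.svnrepo
```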
Step 2 is to back up the dump file in case the editing in step 3 goes wrong:
cp NEW.svnrepo NEW.svnrepo.copy
Step 3 is to edit the NEW.svnrepo file. This is the tricky step!
You should see something like this at the start of the dumpfile:
PROPS-END
Node-path: calc
Node-action: add
Node-kind: dir
Content-length: 0
PROPS-END
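You want to delete the node record that adds the directory itself: the block that begins at its Node-path: line and runs through the Node-action:, Node-kind: and Content-length: lines.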
See also the section from the Subversion Book about repository filtering.
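To find that record without scrolling through the whole file, you can list the first few Node-path: headers together with their line numbers. This is only a sketch: list_node_paths is a hypothetical helper name, and it assumes a grep that supports -a. The optional second argument is the number of headers to show, defaulting to 10.

```shell
# Hypothetical helper: print the first few Node-path headers of a dump file,
# each prefixed with its line number, so the directory-creating record is
# easy to locate in an editor.
list_node_paths() {
    LC_ALL=C grep -a -n '^Node-path: ' "$1" | head -n "${2:-10}"
}

# Example use (assumes NEW.svnrepo exists from step 1):
#   list_node_paths NEW.svnrepo 5
```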
But you’re not done yet! You also want to remove the DESIRED/DIR/ prefix when it appears after Node-path: or Node-copyfrom-path:. What you do NOT want to do is a simple search and replace on the bare path, because this will also change any versioned files whose contents contain that path. Their contents then no longer match the lengths and checksums recorded in the dump, and corrupted files cause the repository import script to abort.
A sequence of vim substitution lines like the following should do the trick. The first two handle references to children of the desired directory. The last one handles property changes made on the desired directory itself.
:%s/Node-path: DESIRED\/DIR\//Node-path: /
:%s/Node-copyfrom-path: DESIRED\/DIR\//Node-copyfrom-path: /
:%s/Node-path: DESIRED\/DIR/Node-path: /
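If you would rather not open a large dump in an editor, the same three substitutions can be sketched with sed. This is a minimal sketch rather than a tested replacement for the vim commands: strip_dump_prefix is a hypothetical helper name, it assumes GNU sed (which copes with binary data when run under LC_ALL=C), and it assumes the prefix contains no '|' or regular-expression metacharacters. Each pattern is anchored at the start of a line, which makes it less likely, though not impossible, to touch file contents.

```shell
# Hypothetical helper: read a dump on stdin, write the edited dump to stdout.
# $1 is the directory prefix to strip (DESIRED/DIR in this post).
strip_dump_prefix() {
    prefix=$1
    # 1) children of the directory, 2) copy sources under the directory,
    # 3) records for the directory itself (e.g. property changes).
    LC_ALL=C sed \
        -e "s|^Node-path: ${prefix}/|Node-path: |" \
        -e "s|^Node-copyfrom-path: ${prefix}/|Node-copyfrom-path: |" \
        -e "s|^Node-path: ${prefix}\$|Node-path: |"
}

# Example use: read from the backup made in step 2, never from the file
# you are writing to.
#   strip_dump_prefix DESIRED/DIR < NEW.svnrepo.copy > NEW.svnrepo
```

Note that this reads the unedited backup, so the directory-creating record described above still has to be removed by hand.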
Step 4 is to create a new, empty repository:
svnadmin create NEW
Step 5 is to load the new repository from your edited dump file:
svnadmin load NEW < NEW.svnrepo
Step 6 is to import the repository from svn to bzr, using one of:
bzr svn-import NEW bzr-NEW
bzr2svn.py NEW.svnrepo bzr-NEW
Step 7 is to remove the dump files:
rm NEW.svnrepo NEW.svnrepo.copy
In my particular case, getting the new repository was a means to an end. Since I was moving to Bazaar, using specialized conversion tools was an option. My findings:
svnadmin verify on the new repository from above works.
bzr svn-import on the new repository fails on Windows with shiny new bzr (1.12) and bzr-svn (0.5.2).
bzr svn-import on the new repository works on Ubuntu with old and rusty bzr (1.6.1) and bzr-svn (0.4.13).
bzr2svn on the new repository fails on Fedora with even older bzr (1.3.1).
bzr2svn on the ORIGINAL repository works on Fedora with even older bzr (1.3.1).
There’s not much rhyme or reason to what works and what doesn’t, so far as I can tell. With 20/20 hindsight, the easiest thing to do would have been to use bzr2svn on Linux from the get-go, and avoid svn-import on Windows.
bzr2svn is especially nice because it has a --prefix option to import only a portion of a repository. That eliminates the need to muck around with svndumpfilter in the first place.
But what’s done is done, and my pain is your gain!