vcs:cvs2svnconversion

Differences

This shows you the differences between two versions of the page.

vcs:cvs2svnconversion [2008/07/26 16:24] – Page moved and renamed from svnmigration to vcs:cvs2svnconversion lsmith
vcs:cvs2svnconversion [2019/08/26 16:54] (current) gwynne
Line 1: Line 1:
 +====== Complete ======
 +
 +The SVN migration was completed in July 2009. This document has been retained for historical purposes.
 +
 +
 ====== CVS to SVN Migration Path ====== ====== CVS to SVN Migration Path ======
  
Line 32: Line 37:
 I have the CVS repository stored in a directory called "realroot" (why not?) and stored the temporary data files on a separate mount point because of drive space issues. My initial commandline, based purely on the documentation, before any kind of testing was this: I have the CVS repository stored in a directory called "realroot" (why not?) and stored the temporary data files on a separate mount point because of drive space issues. My initial commandline, based purely on the documentation, before any kind of testing was this:
 <code bash> <code bash>
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-cvs --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot+./cvs2svn --svnrepos=./svnroot --fs-type=fsfs \
 +--dry-run --no-cross-branch-commits \
 +--username=svnconvert \
 +--cvs-revnums --use-cvs \
 +--tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot
 </code> </code>
  
Line 99: Line 108:
 </code> </code>
  
-===== Pass 4 =====+===== Passes 4-8 =====
  
 Pass 4 worked on the first try: Pass 4 worked on the first try:
 +
 +Pass 5 was also a clean sweep:
 +
 +Pass 6 was just a beautiful thing, went by without a hitch. Not that I expected any of these "sort" passes to be a big deal, but you never know...
 +
 +Pass 7 worked without problems too, though it was a much slower pass than the last three. I finally started to feel like I was making progress. Pass 7 done, means pass 8 is up! Halfway there!
 +
 +Pass 8 took something along the lines of an hour to run, but it finally finished without errors... I hope the other phases aren't similarly insane with their timing. I'm considering taking the --quiet flag back out, and I think I will for pass 9. It's nice to know //something's// happening; I checked my top -u output twice during pass 8 to make sure it hadn't frozen up.
 +
 <code> <code>
 ----- pass 4 (FilterSymbolsPass) ----- ----- pass 4 (FilterSymbolsPass) -----
 Filtering out excluded symbols and summarizing items... Filtering out excluded symbols and summarizing items...
 Done Done
-</code> 
- 
-When pass 4 succeeded, my commandline was: 
-<code bash> 
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --quiet --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot 
-</code> 
- 
-===== Pass 5 ===== 
- 
-Pass 5 was also a clean sweep: 
-<code> 
 ---- pass 5 (SortRevisionSummaryPass) ----- ---- pass 5 (SortRevisionSummaryPass) -----
 Sorting CVS revision summaries... Sorting CVS revision summaries...
Line 122: Line 129:
 </code> </code>
  
-When pass 5 succeeded, my commandline was: +When passes 4-8 succeeded, my commandline was:
-<code bash> +
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --quiet --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot +
-</code> +
- +
-===== Pass 6 ===== +
- +
-Pass 6 was just a beautiful thing, went by without a hitch. Not that I expected any of these "sort" passes to be a big deal, but you never know... +
- +
-When pass 6 succeeded, my commandline was: +
-<code bash> +
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --quiet --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot +
-</code> +
- +
-===== Pass 7 ===== +
- +
-Pass 7 worked without problems too, though it was a much slower pass than the last three. I finally started to feel like I was making progress. Pass 7 done, means pass 8 is up! Halfway there! +
- +
-When pass 7 succeeded, my commandline was: +
-<code bash> +
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --quiet --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot +
-</code> +
- +
-===== Pass 8 ===== +
- +
-Pass 8 took something along the lines of an hour to run, but it finally finished without errors... I hope the other phases aren't similarly insane with their timing. I'm considering taking the --quiet flag back out, and I think I will for pass 9. It's nice to know //something's// happening; I checked my top -u output twice during pass 8 to make sure it hadn't frozen up. +
- +
-When pass 8 succeeded, my commandline was:+
 <code bash> <code bash>
 ./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --quiet --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot ./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --quiet --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot
Line 196: Line 176:
 </code> </code>
  
-===== Pass 10 =====+===== Passes 10-16 ===== 
 + 
 +10: Pass 10 was what my Warcraft friends would call "easysauce": 
 + 
 +11: Well, pass 11 was faster than 8 and 9, if not as fast as 10... 
 + 
 +12: Another one bites the dust! 
 + 
 +13: Pass 13 certainly was interesting. Generating all the SVN commits... 
 + 
 +14: Nice short easy one. 
 + 
 +15: 15 down, 1 to go! 
 + 
 +16: Pop the champagne cork!
  
-Pass 10 was what my Warcraft friends would call "easysauce": 
 <code> <code>
 ----- pass 10 (BreakSymbolChangesetCyclesPass) ----- ----- pass 10 (BreakSymbolChangesetCyclesPass) -----
Line 204: Line 197:
 Done Done
 Time for pass10 (BreakSymbolChangesetCyclesPass): 181.9 seconds. Time for pass10 (BreakSymbolChangesetCyclesPass): 181.9 seconds.
-</code> 
- 
-When pass 10 succeeded, my commandline was: 
-<code bash> 
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot 
-</code> 
- 
-===== Pass 11 ===== 
- 
-Well, pass 11 was faster than 8 and 9, if not as fast as 10... 
-<code> 
 ----- pass 11 (BreakAllChangesetCyclesPass) ----- ----- pass 11 (BreakAllChangesetCyclesPass) -----
 Breaking CVSSymbol dependency loops... Breaking CVSSymbol dependency loops...
 Done Done
 Time for pass11 (BreakAllChangesetCyclesPass): 1039 seconds. Time for pass11 (BreakAllChangesetCyclesPass): 1039 seconds.
-</code> 
- 
-When pass 11 succeeded, my commandline was: 
-<code bash> 
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot 
-</code> 
- 
-===== Pass 12 ===== 
- 
-Another one bites the dust! 
-<code> 
 ----- pass 12 (TopologicalSortPass) ----- ----- pass 12 (TopologicalSortPass) -----
 Generating CVSRevisions in commit order... Generating CVSRevisions in commit order...
 Done Done
 Time for pass12 (TopologicalSortPass): 255.5 seconds. Time for pass12 (TopologicalSortPass): 255.5 seconds.
-</code> 
- 
-When pass 12 succeeded, my commandline was: 
-<code bash> 
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot 
-</code> 
- 
-===== Pass 13 ===== 
- 
-Pass 13 certainly was interesting. Generating all the SVN commits... 
-<code> 
 ... ...
 Creating Subversion r192256 (commit) Creating Subversion r192256 (commit)
 Done Done
 Time for pass13 (CreateRevsPass): 512.2 seconds. Time for pass13 (CreateRevsPass): 512.2 seconds.
-</code> 
- 
-So 192,256 commits in the CVS repository, including tagging and branching. That's a pretty impressive total over 12 years, don't you think? 
- 
-When pass 13 succeeded, my commandline was: 
-<code bash> 
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot 
-</code> 
- 
-===== Pass 14 ===== 
- 
-Nice short easy one. 
-<code> 
 ----- pass 14 (SortSymbolsPass) ----- ----- pass 14 (SortSymbolsPass) -----
 Sorting symbolic name source revisions... Sorting symbolic name source revisions...
 Done Done
 Time for pass14 (SortSymbolsPass): 9.787 seconds. Time for pass14 (SortSymbolsPass): 9.787 seconds.
-</code> 
- 
-When pass 14 succeeded, my commandline was: 
-<code bash> 
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot 
-</code> 
- 
-===== Pass 15 ===== 
- 
-15 down, 1 to go! 
-<code> 
 ----- pass 15 (IndexSymbolsPass) ----- ----- pass 15 (IndexSymbolsPass) -----
 Determining offsets for all symbolic names... Determining offsets for all symbolic names...
 Done. Done.
 Time for pass15 (IndexSymbolsPass): 6.344 seconds. Time for pass15 (IndexSymbolsPass): 6.344 seconds.
-</code> +----- pass 16 (OutputPass) -----
- +
-When pass 15 succeeded, my commandline was: +
-<code bash> +
-./cvs2svn --svnrepos=./svnroot --fs-type=fsfs --dry-run --no-cross-branch-commits --username=svnconvert --cvs-revnums --use-internal-co --retain-conflicting-attic-files --encoding=ascii --encoding=utf_8 --encoding=utf_16 --encoding=shift_jis --encoding=mac_roman --encoding=latin_1 --encoding=euc_jp --exclude=php4/CREDITS --tempdir=/Volumes/External/private/tmp/cvs2svn-tmp ./realroot +
-</code> +
- +
-===== Pass 16 ===== +
- +
-Pop the champagne cork! +
-<code>+
 Starting Subversion r192256 / 192256 Starting Subversion r192256 / 192256
 Done. Done.
Line 463: Line 389:
 </code> </code>
  
-Whew. That was going to make for a //very// long options file where it was very easy to make copypasta errors. I needed to add a little Python code. How does one do foreach (array(//blah blah blah//) as $item) { /* etc */ } in Python? So I went to my wife Sarah, who **does** know Python.+Whew. That was going to make for a //very// long options file where it was very easy to make copypasta errors. I needed to add a little Python code. How does one do foreach (array(//blah blah blah//) as $item) { /* etc */ } in Python? So I went to a close friend who **does** know Python.
  
 We came up with this rather handy little bit of code: We came up with this rather handy little bit of code:
Line 904: Line 830:
  
 Well, that didn't work very well. A full checkout of just php-src with all its tags and branches is well past 20G in HFS+. Forget the entire repository. Some of the tags in there are completely ridiculous, and the branching, the naming of the tags is just awful... but I digress. I decided the most obvious thing to do was to work with smaller pieces of the repository. So I picked up a checkout of ZendEngine2 and Zend. Well, that didn't work very well. A full checkout of just php-src with all its tags and branches is well past 20G in HFS+. Forget the entire repository. Some of the tags in there are completely ridiculous, and the branching, the naming of the tags is just awful... but I digress. I decided the most obvious thing to do was to work with smaller pieces of the repository. So I picked up a checkout of ZendEngine2 and Zend.
 +
 +===== Moving to svn.php.net =====
 +
 +It was time to work on a system with slightly more capabilities than mine; I logged into cvs.php.net (also svn.php.net) and started work there. I modified the cvs2svn options file accordingly, set up a blank SVN repository next to the CVS repository, took a snapshot of the CVS repository, and ran cvs2svn over the snapshot. I didn't run into any unexpected issues, which was a pleasant surprise. Next step was to check out each module in the SVN repository to find any problems such as that mentioned above with Selenium. A long and annoying process, but at least it's easy.
 +
 +===== Doing some checkouts =====
 +
 +Well, what's a cheap way to check out all the SVN modules and see whether there are problems, without overrunning the limited hard drive space of my system? Answer: Shell script! I came up with this little gem:
 +
 +<code bash>
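 +#!/bin/bash
 +# $1 = directory holding the CVS modules, $2 = root URL of the SVN repository
 +
 +for path in "$1"/*; do
 +        dir=${path##*/}
 +        echo "Processing ${dir}..."
 +        if [ -d "$1/${dir}" ]; then
 +                # Try a checkout of the matching SVN module and record the outcome.
 +                if svn co "$2/${dir}" >> ./checkout.log 2>&1; then
 +                        echo "Successful on ${dir}." >> ./checkout.results
 +                else
 +                        echo "FAILED ON ${dir}!" >> ./checkout.results
 +                fi
 +                # Remove the working copy right away to save disk space.
 +                rm -Rf ./"${dir}"
 +        fi
 +done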
 +</code>
 +
 +Worked like a charm. Pointed it at the CVS and SVN repositories, and kicked it into gear. A couple hours of tail -f checkout.log scrolling later, I had the following list of failures:
 +<code>
 +FAILED ON CVSROOT!
 +FAILED ON livingtags!
 +FAILED ON pear!
 +FAILED ON pear-manual!
 +FAILED ON phpdoc-ar-only!
 +FAILED ON phpdoc-he-only!
 +FAILED ON phpdoc-ro-dir!
 +FAILED ON phpdoc-ro-only!
 +FAILED ON phpdoc-tr-dir!
 +FAILED ON zend!
 +</code>
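
A minimal sketch of how the failure list could be pulled out of checkout.results once the run finishes. It assumes the exact "FAILED ON <module>!" line format written by the script above; the function name list_failed_modules is hypothetical and was not part of the original tooling.

```shell
#!/bin/bash
# Hypothetical helper: print the names of the modules that failed to check out.
# Assumes each failure line looks exactly like "FAILED ON <module>!".
list_failed_modules() {
        local results_file=$1
        sed -n 's/^FAILED ON \(.*\)!$/\1/p' "$results_file"
}

# Only run when an existing results file is given on the command line.
if [ "$#" -ge 1 ] && [ -f "$1" ]; then
        list_failed_modules "$1"
fi
```

Printing bare module names rather than the full lines makes the output easy to feed into a second pass, for instance to retry only the failed checkouts.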
 +
 +Every single one of those //except// pear was a nonexistent module, empty in CVS and ignored entirely in the SVN conversion. That left the pear module. Sure enough, it was the expected failure in Selenium and Testing_Selenium, caused by someone who had checked .svn directories into CVS for some unknown reason. They were easily removed with a direct svn rm command:
 +
 +<code bash>
 +$ sudo -u svn \
 +svn rm -m "[SVN CONVERSION] Removing .svn directories that break SVN checkout." \
 +  $SVNROOT/pear/Selenium/branches/shin/.svn \
 +  $SVNROOT/pear/Selenium/branches/shin/tests/.svn \
 +  $SVNROOT/pear/Selenium/branches/shin/tests/events/.svn \
 +  $SVNROOT/pear/Selenium/branches/shin/tests/html/.svn \
 +  $SVNROOT/pear/Selenium/branches/shin/docs/.svn \
 +  $SVNROOT/pear/Selenium/branches/shin/examples/.svn \
 +  $SVNROOT/pear/Selenium/tags/start/tests/.svn \
 +  $SVNROOT/pear/Selenium/tags/start/tests/events/.svn \
 +  $SVNROOT/pear/Selenium/tags/start/tests/html/.svn \
 +  $SVNROOT/pear/Selenium/tags/start/docs/.svn \
 +  $SVNROOT/pear/Selenium/tags/start/examples/.svn \
 +  $SVNROOT/pear/Selenium/tags/start/.svn \
 +  $SVNROOT/pear/Testing_Selenium/branches/shin/.svn \
 +  $SVNROOT/pear/Testing_Selenium/branches/shin/tests/.svn \
 +  $SVNROOT/pear/Testing_Selenium/branches/shin/tests/events/.svn \
 +  $SVNROOT/pear/Testing_Selenium/branches/shin/tests/html/.svn \
 +  $SVNROOT/pear/Testing_Selenium/branches/shin/docs/.svn \
 +  $SVNROOT/pear/Testing_Selenium/branches/shin/examples/.svn \
 +  $SVNROOT/pear/Testing_Selenium/tags/start/.svn \
 +  $SVNROOT/pear/Testing_Selenium/tags/start/tests/.svn \
 +  $SVNROOT/pear/Testing_Selenium/tags/start/tests/events/.svn \
 +  $SVNROOT/pear/Testing_Selenium/tags/start/tests/html/.svn \
 +  $SVNROOT/pear/Testing_Selenium/tags/start/docs/.svn \
 +  $SVNROOT/pear/Testing_Selenium/tags/start/examples/.svn
 +
 +Committed revision 279477.
 +$
 +</code>
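
A sketch of how a path list like the one above could be generated rather than typed out. It assumes the listing comes from svn ls -R, which prints one path per line with a trailing slash on directories. The function name find_svn_dirs and the way the URL prefix is passed in are illustrative assumptions, not what was actually used for the conversion.

```shell
#!/bin/bash
# Hypothetical helper: read an `svn ls -R` style listing on stdin and print
# the full URL of every directory that is itself named ".svn".
# $1 is the URL prefix that the listed paths are relative to.
find_svn_dirs() {
        local prefix=$1
        local path
        while IFS= read -r path; do
                case "$path" in
                        # Match only the .svn directory itself, not its contents.
                        .svn/|*/.svn/) echo "${prefix}/${path%/}" ;;
                esac
        done
}

# Example use (assumes $SVNROOT is set and the svn client is installed):
#   svn ls -R "$SVNROOT/pear/Selenium" | find_svn_dirs "$SVNROOT/pear/Selenium"
```

The output is one URL per line, so it can be reviewed by eye before being handed to svn rm.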
 +
 +===== Meta-SVN! =====
 +
 +About this time I realized that a lot of things related to SVN would require version control //before// the repository was ready for use! Things like all the various scripts involved in the conversion itself, all the authorization data, the commit hooks, all the fun stuff. Putting these things into CVS would result in a bit of recursive failure. Putting them into the SVN repository I'd set up would interfere with the conversion, and besides, this was metadata, stuff that belongs in an equivalent of CVSROOT. Solution: A second SVN repository under much more restricted authorization control. I put in a request for a metasvn.php.net domain name and set up cvs.php.net's Apache to serve it from a separate repository. Then Wez and a couple others convinced me that was a stupid idea. There wasn't //really// any reason this stuff couldn't go into CVS, other than my ornery resistance to the older and less useful system.
 +
 +It was about this time that I had to study Git for another project and began to wonder if maybe it wasn't better than SVN, but I'm just not into the idea of learning an entirely new system and forcing everyone else to do the same. SVN maps 90% onto CVS commands... Git maps more like 40%. SVN is a good midway step to true distributed VCS, and there are plenty of Git/SVN interface tools.
 +
 +So I set up a CVS module called SVNROOT/, got karma to it, and checked in my options file along with the checkout script above. Almost immediately I got an interesting question:
 +
 +"Didn't we decide to use PHP instead of Python?"
 +
 +Yes, we did. And yes, the options file is written in Python. Unfortunately, the way cvs2svn is set up makes this necessary; it includes the options file similarly to a PHP include directive.
 +
 +===== Reorganization =====
 +
 +Next step: Decide on a repository structure. Ooops... lots of differing opinions on that.
 +
 +Well, this was getting complicated. It was time to step back and automate some of the process. So I popped open a new PHP file and came up with automation for the svnadmin create, cvs2svn, and svn rm commands already discussed. Then I went back and added some nice command-line-y-ness to it using PEAR's Console_CommandLine (a VERY nice package, kudos to its author(s)!). The script can be viewed at [[http://cvs.php.net/viewvc.cgi/SVNROOT/run-conversion.php?view=log]].
 +
 +That done, I looked back at the reorganization mess. It looked like there would in fact be a few separate repositories for things like PEAR and GTK. I needed advice on this one, so I went to the mailing list. They wanted to know, "why separate repositories?" Well, it's a matter of maintenance, really. GTK, PEAR, Zend, they all have their own little quirks in the hook scripts and really it's just simpler and more elegant for them to have their own workspaces to play in rather than all this endless special-casing in the hooks and ACLs.
 +
 +So I rewrote the conversion script completely to support this premise, and contacted various people to find out what to do with the "miscellaneous" modules scattered all over the place. Turned out most of them either belonged alongside php-src or were just plain defunct! The choice was made not to convert defunct modules, since there is a plan to leave the CVS repository available in some form.
 +
 +===== Hook scripts =====
 +
 +At a glance it might seem that would be the end of it. But unfortunately, no. There are a lot of administrative tasks done by scripts in CVSROOT, all of which need to be ported to SVN equivalents. I decided it would be astute to make a list of what needed to be ported before actually getting into it! To do that, I grabbed a copy of CVSROOT itself and had a looksee. It turned out the following things needed conversion:
 +  * Access Control Lists - replaced by the SVN authz database
 +  * commitinfo.pl - I couldn't quite figure out what this was for. It seemed to write the name of the committed directory to a file. A little more investigation showed it to be part of the loginfo.pl automation
 +  * cvswrappers - Replaced by SVN's autoprops
 +  * loginfo.pl - Sends the e-mails to various mailing lists when commits happen
 +  * modules - Replaced by svn:externals and restructuring
 +  * readers - Replaced by SVN's authz database
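
To illustrate what porting loginfo.pl involves, here is a rough sketch of a post-commit hook that builds a commit notification. It is not the script that was eventually written for svn.php.net. Subversion invokes post-commit with the repository path and the revision number as its first two arguments, and svnlook can report the author, log message, and changed paths for that revision. The function name format_commit_mail, the message layout, and the SVNLOOK override variable are assumptions made for illustration; actually delivering the message to a mailing list is left out.

```shell
#!/bin/bash
# Sketch of a post-commit hook body (hypothetical, for illustration only).
# SVNLOOK can be overridden, e.g. to point at a specific svnlook binary.
SVNLOOK=${SVNLOOK:-svnlook}

# Print a plain-text summary of revision $2 in the repository at $1.
format_commit_mail() {
        local repos=$1 rev=$2
        echo "Author: $("$SVNLOOK" author -r "$rev" "$repos")"
        echo "Revision: r${rev}"
        echo
        echo "Log:"
        "$SVNLOOK" log -r "$rev" "$repos"
        echo
        echo "Changed paths:"
        "$SVNLOOK" changed -r "$rev" "$repos"
}

# Subversion passes REPOS and REV; do nothing if they are missing.
if [ "$#" -ge 2 ]; then
        format_commit_mail "$1" "$2"
fi
```

Keeping the formatting in a function separate from delivery means the per-module routing that loginfo.pl handled could be added later without touching the message layout.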
  
 ===== Available for the curious ===== ===== Available for the curious =====
 Meanwhile, the converted PHP repository is now available via: Meanwhile, the converted PHP repository is now available via:
 <code bash> <code bash>
-$ svn co svn://phpsvn.gwynne.dyndns.org+$ svn co http://svn.php.net
 </code> </code>
  
-This will check out all the projects in the repository; it's suggested to specify a particular module like <nowiki><svn://phpsvn.gwynne.dyndns.org/php-src/trunk></nowiki>. Don't forget about svn ls!+This will check out all the projects in the repository; it's suggested to specify a particular module like [[http://svn.php.net/php-src/trunk]]. Don't forget about svn ls!
vcs/cvs2svnconversion.1217089462.txt.gz · Last modified: 2017/09/22 13:28 (external edit)