Friday, April 29, 2011

How do I convert simple non source controlled project backups into a versioned git repository?

I have been extremely naughty. I have been developing a piece of software (I'm the only developer) for a little while (O.K., it's over the course of a few years), but have not used any sort of source control.

I have resolved to use source control (git seems most likely, as the Windows tools seem to have come on a lot in the last few months) from now on. What I do have is dated backups of the entire directory of my (.NET) solution.

What I would like to do is automagically have my backups visible in the revision history. It will be messy. Projects and files will have been added/removed over the course of the solution history. I'm not bothered about such problems as what I know to be renamed files being interpreted as removal of a file and addition of a new, unrelated one.

More generally, my problem is: I have time-ordered copies of a changing directory. Importing the first into git is easy, I assume. But I then want all subsequent copies of the directory to be merged, in date order, one at a time, without me having to commit every sub-directory and file individually.

Is this possible, or is it just that I am punished for not using source control from the off?

Edit: If I go ahead with the 'commit all snapshots individually' method manually (there are fewer than 20 snapshots), is there a way (as Esko Luontola suggests I might want to) of overriding the commit dates with the dates I have for the snapshots? git commit does not appear to have a flag to allow this. Is there another way (I'm using Vista)?

Edit: In answer to my issue of using the original dates: You have to set the GIT_AUTHOR_DATE and/or GIT_COMMITTER_DATE environment variables to override the use of current dates and times when performing the commit.

The reason there are two sets of variables (there are also GIT_(AUTHOR|COMMITTER)_(NAME|DATE|EMAIL)) is to distinguish between, say, the author who emails a patch, and the maintainer who is actually doing the commits into the repo.
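As a minimal sketch of the date override described above: the script below creates a throwaway repository and makes one backdated commit. The identity values, the file name and the snapshot date are all hypothetical placeholders, not anything from my real project.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder identity (hypothetical values). The NAME and EMAIL
# variables are picked up the same way as the DATE ones.
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

echo "first snapshot" > file.txt
git add -A .

# Hypothetical snapshot date. Setting both variables keeps the
# author date and the committer date in agreement.
snapshot_date="2009-03-14 12:00:00 +0000"
GIT_AUTHOR_DATE="$snapshot_date" GIT_COMMITTER_DATE="$snapshot_date" \
    git commit -q -m "Snapshot of 2009-03-14"

# Show the author date and committer date that were recorded.
recorded=$(git log -1 --format='%ad | %cd' --date=short)
echo "$recorded"
```

Prefixing the variables to the git commit command applies them to that one commit only, so nothing needs to be unset before the next snapshot.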

Note if using git extensions on VS: If you set (export varname="value") these variables using the 'git bash' command line, and then switch back to the GUI to do a commit, it seems to ignore them. You have to stay on the command line and run 'git commit' from there.

From stackoverflow
  • There might be an already automated way of doing this, but git should be smart enough to let you git init in your oldest backup and then repeatedly copy the .git folder to incrementally newer backups and create a commit with everything for each one. Scripting this should be pretty easy.

    Even better, you could pass a --git-dir= option or $GIT_DIR environment variable to git and have it use a repository that lives outside the backup directory, saving the copying step.

    Something like this:

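    # $FINAL_DIR must be an absolute path, since we change directory below
    cd "$FINAL_DIR"
    git init
    
    # Point git at this repository no matter which directory we are in
    export GIT_DIR="$FINAL_DIR/.git"
    
    cd "$NEXT_BACKUP"
    git add -A .    # stage additions, modifications and removals
    git commit
    # rinse and repeat for each later backup, oldest first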
    
    araqnid : I'd suggest using "git add -A ." and not giving "-a" to git commit, it seems a bit clearer ... if your version of git add supports -A, that is. Alternatively, "git add ." followed by "git add -u" (to track both new and removed files).
  • I don't quite know why you don't want to just commit all snapshots individually. I mean, a shell script (or Perl, Python, Ruby, Tcl, whatever) to do that, is probably less than 5 lines of code and less than 10 minutes of work.

    Also, there is git load-dirs, which would allow you to cut that down to maybe 3 lines and 5 minutes. But you still have to load every dir individually.

    But, if you are so inclined, there is the git fast-import tool which is intended to make writing repository converters and importers easier. According to the manpage, you could write an importer in about 100 lines and a couple of hours.

    However, all this ignores the biggest problem: the value of a VCS lies not in the contents – you could just as well use regular backups for that – but in the commit messages. And no magic tool is going to help you there, you'll have to type them all in yourself … and more importantly, you'll have to remember exactly why you made every single little change over the last years.

    Jakub Narębski : There are example fast-import scripts in contrib/fast-import/, including import-zips.py (in Python) and import-tars.perl (in Perl).
    hasen j : -1 I don't think this answer is helpful, it's like "dude you're an idiot it's so easy just do it sheesh I can do it in 2 minutes with my eyes closed"
  • You can use the example git-fast-import based tools distributed in the git.git repository: import-zips.py (in Python) or import-tars.perl (in Perl), or use those scripts as the base of your own import script. You can find those scripts in the contrib/fast-import/ directory.

  • Also check out the "A Custom Importer" section in the Migrating to Git chapter, which talks about this exact issue.
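Putting the first answer's GIT_DIR approach together with the date override from the question, the whole import might be sketched roughly as follows. This is only an illustration under stated assumptions: the backups/YYYY-MM-DD directory layout, the file names, the identity values and the fixed 12:00 UTC commit time are all hypothetical, and the demo builds its own tiny snapshots in a temporary directory rather than touching real backups.

```shell
#!/bin/sh
set -e

# Demo setup: two hypothetical dated snapshots plus an empty repo dir.
work=$(mktemp -d)
mkdir -p "$work/backups/2009-01-15" "$work/backups/2010-06-02" "$work/repo"
echo "v1"  > "$work/backups/2009-01-15/main.cs"
echo "old" > "$work/backups/2009-01-15/removed.cs"
echo "v2"  > "$work/backups/2010-06-02/main.cs"
echo "new" > "$work/backups/2010-06-02/added.cs"

# Placeholder identity (hypothetical values).
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

git init -q "$work/repo"
# With GIT_DIR set, git treats the current directory as the work tree,
# so each snapshot can be committed in place without copying .git around.
export GIT_DIR="$work/repo/.git"

# Directory names in YYYY-MM-DD form sort chronologically in the glob.
for snapshot in "$work"/backups/*/; do
    stamp=$(basename "$snapshot")
    cd "$snapshot"
    git add -A .    # stages new, changed and removed files
    # Assumed time of day: the snapshot name only carries a date.
    GIT_AUTHOR_DATE="$stamp 12:00:00 +0000" \
    GIT_COMMITTER_DATE="$stamp 12:00:00 +0000" \
        git commit -q -m "Import snapshot $stamp"
done

# Inspect the resulting history and the files in the latest commit.
git log --format='%ad %s' --date=short
git ls-tree --name-only HEAD
```

Each commit message here is just the snapshot name; as the second answer points out, any description of why things changed still has to be written by hand.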
