Version control and I go back a long way.
Back in the late 1990s, I was working in the QA team at Sonic Solutions, and was asked to look into our build scripts and source code control system, to investigate what it would take to get us to a cross-platform development environment, one that didn’t suck.
At the time, we were running our own build scripts implemented in the MPW Shell (which was weird, but didn’t suck), and our version control system was Projector (which did suck). I ended up evaluating and benchmarking several systems including CVS, SourceSafe (which Microsoft had just acquired), and Perforce.
In the end we landed on Perforce because it was far and away the fastest and most flexible system, and we knew from talking to folks at big companies (Adobe in particular) that it could scale.
Recently I’ve been reading about some of the advantages and disadvantages of Git versus Mercurial, and I realized I haven’t seen any discussion about a feature we had in the Perforce world called change lists.
Atomic commits, and why they’re good
In Perforce, as in Git and Mercurial, changes are always committed atomically, meaning that for a given commit to the repository, all the changes are made at once or not at all. If anything goes wrong during the commit process, nothing happens.
For example, if there are any conflicting changes between your local working copy and the destination repository, the system will force you to resolve the conflict first, before any of the changes are allowed to go forward.
Atomic commits give you two things:
First, you’re forced to only commit changes that are compatible with the current state of the destination repo.
Second, and more important, it’s impossible (or very difficult) to accidentally put the repo into an inconsistent state by committing a partial set of changes, whether you’re stopped in the middle by a conflicting change, or by a network or power outage, etc.
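To make the all-or-nothing idea concrete, here’s a minimal Python sketch. It’s a toy in-memory model, not how Perforce, Git, or Mercurial actually store anything; ToyRepo, ConflictError, and the revision-number scheme are all names I made up for illustration. The point is just the shape: check every file first, and only then apply the whole set.

```python
class ConflictError(Exception):
    """Raised when a change was made against an out-of-date revision."""


class ToyRepo:
    """Hypothetical in-memory repository: path -> (revision, content)."""

    def __init__(self):
        self.files = {}

    def head_revision(self, path):
        # Revision 0 means the file does not exist yet.
        return self.files.get(path, (0, None))[0]

    def commit(self, changes):
        """Apply every change or none.

        `changes` maps path -> (base_revision, new_content), where
        base_revision is the revision the edit was made against.
        """
        # Phase 1: validate everything before touching the repo.
        conflicts = [
            path
            for path, (base, _content) in changes.items()
            if base != self.head_revision(path)
        ]
        if conflicts:
            raise ConflictError(f"resolve first: {sorted(conflicts)}")

        # Phase 2: build the new state on a copy, then swap it in.
        staged = dict(self.files)
        for path, (base, content) in changes.items():
            staged[path] = (base + 1, content)
        self.files = staged


repo = ToyRepo()
repo.commit({"a.c": (0, "v1"), "b.c": (0, "v1")})

# Someone else has since submitted a new revision of b.c.
repo.commit({"b.c": (1, "theirs")})

before = dict(repo.files)
try:
    # a.c is up to date, but b.c was edited against stale revision 1.
    repo.commit({"a.c": (1, "mine"), "b.c": (1, "mine")})
    rejected = False
except ConflictError:
    rejected = True
```

Even though the edit to a.c was fine on its own, the conflict on b.c rejects the whole commit, and the repo looks exactly as it did before the attempt.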
Multiple change lists?
In Git and Mercurial, as far as I can tell, there is only one set of working changes that you can potentially commit on a given working copy. (In Git this is called the index, or sometimes the staging area.)
In Perforce, however, you can have multiple sets of changes in your local working copy, and commit them one at a time. There’s a default change list that’s analogous to Git’s index, but you can create as many additional change lists as you want locally, each with its own set of files that will be committed atomically when you submit.
You can move files back and forth between change lists before you commit them. You can even author change notes as you go by updating the description of an in-progress change list, without having to commit the change set to the repository.
Having multiple change lists makes it possible, for example, to quickly fix a bug by updating one or two files locally and committing just those files, without having to do anything to isolate the quick fix from other sets of changes you may be working on at the time.
Each change list in Perforce is like its own separate staging area.
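Here’s a rough Python sketch of that bookkeeping, as I understand it. Again this is a toy model and not Perforce’s implementation: ToyWorkspace and its method names are hypothetical, and the only thing it tries to capture is that each open file sits in exactly one pending change list, and submitting one list leaves the others alone.

```python
class ToyWorkspace:
    """Hypothetical model of pending change lists in one working copy."""

    DEFAULT = "default"

    def __init__(self):
        # change list name -> {"description": str, "files": set of paths}
        self.pending = {self.DEFAULT: {"description": "", "files": set()}}
        self.submitted = []  # list of (description, sorted file list)
        self._next_number = 1

    def new_changelist(self, description=""):
        name = str(self._next_number)
        self._next_number += 1
        self.pending[name] = {"description": description, "files": set()}
        return name

    def open_file(self, path, changelist=DEFAULT):
        # A file belongs to exactly one pending change list at a time,
        # so opening it in a new list also moves it out of the old one.
        for entry in self.pending.values():
            entry["files"].discard(path)
        self.pending[changelist]["files"].add(path)

    def describe(self, changelist, description):
        # Notes can be updated long before anything is submitted.
        self.pending[changelist]["description"] = description

    def submit(self, changelist):
        # Only this change list's files go in; the others stay pending.
        entry = self.pending[changelist]
        self.submitted.append((entry["description"], sorted(entry["files"])))
        if changelist == self.DEFAULT:
            self.pending[changelist] = {"description": "", "files": set()}
        else:
            del self.pending[changelist]


ws = ToyWorkspace()
ws.open_file("feature.c")
ws.open_file("feature.h")
ws.open_file("parser.c")

# A quick bug fix comes up: give it its own change list.
fix = ws.new_changelist("Fix off-by-one in parser")
ws.open_file("parser.c", fix)  # move it out of the default list
ws.describe(fix, "Fix off-by-one in the parser's line counter")
ws.submit(fix)
```

After the submit, only parser.c has gone in, with its updated description, while feature.c and feature.h are still sitting in the default change list waiting for their own submit.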
So what’s the corresponding DVCS workflow?
While it’s possible with some hand-waving to make isolated changes using Git or Mercurial, it seems like it would be easier to commit files unintentionally, unless you create a separate local branch for each set of changes.
I understand that one of the philosophical advantages people see in distributed version control systems is that they encourage frequent local commits by decoupling version control from a central authority.
But creating lots of local branches seems like a pretty heavy operation to me, in the case where you just need to make a small, quick change, or where you have multiple change sets you’re working on concurrently, but don’t want to have to keep multiple separate local branches in sync with a remote repo.
In the former case, cloning the repo to a branch just to make a small change isn’t particularly agile, especially if the repo is large.
In the latter case, if you’re working on multiple change lists at the same time, keeping more than one local branch in sync with the remote repo creates more, and possibly redundant, work. And more work means you’re more likely to make mistakes, or to get lazy and take riskier shortcuts.
But maybe I’m missing something.
What do you do?
In this situation, what’s the recommended workflow for Git and Mercurial? Any experts care to comment?