ClearCase Globally, Git Locally
Thanks
ClearCase deserves all the credit for encouraging me to learn and love Git. Branching and merging are as fast and painless as listing changesets between arbitrary branches or points in time. Did I say fast? No longer bound by ClearCase’s dictates and laborious linear progression, one can work offline, roll back, experiment in multiple branches, travel into the past, and explore limitless parallel dimensions. While inspired by some other solutions, I believe what follows is the cleanest imposition of a Git repository upon a ClearCase snapshot view.
Summary
The following setup, a combined snapshot cloned locally, lets Git track a ClearCase snapshot view without external tools (such as rsync). It minimizes hijacks and untracked files, and it encourages fairly standard workflows with both ClearCase UCM and Git without limiting what can be done with a Git repository. This recipe assumes that the reader is proficient with the Unix/Cygwin shell, Git, and ClearCase. In short, we will:
- Initialize a Git repository upon an existing pristine ClearCase snapshot view
- Clone the snapshot as a Git repository
- Track the master branch of the clone from the snapshot
- Perform all ClearCase rebases, updates, checkins, and deliveries only in the snapshot
- Work in branches of the Git clone
- Pull between the master branches of the snapshot and clone repositories
Getting started
It is easiest to start with a fresh ClearCase snapshot view (which we’ll call snapshot). We initialize a Git repository in it (so the repository is one and the same as the ClearCase snapshot view), edit its .git/info/exclude file (see below) to hide some of the ClearCase plumbing from Git, and add all the remaining content. Then we clone it and add the clone as a remote that tracks the clone’s master branch. Now the clone (by default) and snapshot are tracking each other:
$ cd snapshot
$ git init
$ cat >> .git/info/exclude <<'EOF'
.gitignore
view.dat
lost+found/
EOF
$ git add .
$ git commit -m init
$ git clone -o snapshot . ../clone
$ git remote add -t master -m master clone ../clone
(When last tested, the final step, ‘git remote add’, did not seem to work. It’s not necessary, but it could be cool if we got it to work.)
We will want to keep the snapshot pristine (see below). It should only be used as a staging area between upstream rebases and checkins, and downstream pulls. As far as possible, all development, branches, and merging should occur in the work clone. The snapshot could just as easily be used by multiple people, say a team working on a common project. In that case, I might recommend another bare clone. But here, we’ll assume all Git repositories (including the overlapping ClearCase snapshot) are locally used by one developer.
Git ignore
There are multiple ways to hide files from Git’s tracking view including .gitignore files scattered anywhere within the directory structure. This may very well be appropriate within the clone development branches to hide build artifacts, temporary files, etc. There, in the clone, you should consider whether you want to share the .gitignore files (before committing them) or add .gitignore to the root .gitignore file, thus ignoring itself.
In the snapshot, however, I’ve chosen to use the .git/info/exclude file instead because it is applied to the entire repository and is already hidden from Git’s tree. The snapshot has very different tracking requirements. We’ll want to filter out all of the Git and ClearCase plumbing such as view.dat, .vws, and any .gitignore files that try to come upstream from the clone. However, we generally want to be aware of everything that passes through our snapshot. To that end, we want to ignore very little and consider all untracked files. A file untracked by one system indicates a new addition in the other or a deletion the other doesn’t yet know about. No files should ever be untracked by both ClearCase and Git in the snapshot.
ClearCase update and rebase
I always run “Find modified files” from the ClearCase explorer before rebasing or delivering to ClearCase. Check in, undo hijacks, etc., as appropriate to preserve a pristine snapshot. After updating or rebasing the snapshot, we need to add (or remove) the upstream changes in Git. New files must be added and deleted files removed explicitly; modified files can simply be picked up at commit time (with the -a flag).
# cd to snapshot
# rebase or update from ClearCase upstream
$ git status
$ git add (/some/files)
$ git rm (/old/files)
# repeat above until ClearCase changeset is in the index (no untracked files)
$ git commit -a -m "some comment"
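The add/remove loop above can often be collapsed into a single step. What follows is only a minimal sketch, assuming the exclude file from the setup is in place so that Git sees nothing but real content; the function name record_cc_update is hypothetical and not part of the original recipe.

```shell
#!/bin/sh
# record_cc_update (hypothetical helper): run from the root of the snapshot
# after a ClearCase update or rebase. It stages every addition, modification,
# and deletion in one go, then commits only if something actually changed.
record_cc_update() {
    msg=${1:-"ClearCase update"}
    # -A stages new, modified, and deleted paths alike
    git add -A . || return 1
    # --quiet makes git diff exit 0 when the index matches HEAD
    if git diff --cached --quiet; then
        echo "nothing to record"
    else
        git commit -q -m "$msg"
    fi
}
```

Usage might look like `record_cc_update "rebase from integration stream"`. It is still worth glancing at `git status` first, since this stages everything Git can see.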
Downstream development
Assuming we’ve been developing in one or many branches within the clone, eventually some changes will prove interesting and stable enough to share with others. We’ll use the master branch of clone to stage our merges before delivering a pretty package upstream. First, we’ll need to be in sync with the snapshot. If snapshot is not pristine, we need to get it to that state. If there are commits in snapshot not found in the clone’s master, we should pull (or rebase).
After clone’s master is equal to snapshot or contains a strict superset of its changes, we can stage our changes in the master branch of clone. How we get there from the sidestream branches is completely up to us: rebase, pull, merge, squash, octopus, rebase -i. Suppose we have committed a new feature into its own branch, ready to be merged and pulled upstream. Our workflow might look something like this:
$ cd ../clone
$ diff -r -x .git -x view.dat -x lost+found . ../snapshot
(nothing)
$ git checkout feature
$ git rebase master
# test, work, test
$ git checkout master
$ git merge feature
$ git branch -d feature
Checkin to ClearCase
In the snapshot we’ll pull from clone’s master and deliver the changes upstream. We may have to manually add, remove, and checkout/in our changes to ClearCase. To help, we can tag before pulling and display the file names as a difference along with the status (new, deleted, modified).
$ cd ../snapshot
$ git tag before
$ git pull ../clone master
$ git diff --name-status before
(The ‘git remote add’ could have been handy here)
Automation
While the difference above could help us deliver upstream manually, the output of the last command could just as well feed a script that delivers to ClearCase, though I imagine procedures differ from environment to environment. I have not automated the ClearCase delivery myself, but here is a rough sketch:
$ git diff --name-status before > diff_before
$ grep '^A' diff_before | sed "s/^../cleartool mkelem /"
$ grep '^M' diff_before | sed "s/^../cleartool co /"
$ grep '^D' diff_before | sed "s/^../cleartool rmname /"
$ grep '^[AM]' diff_before | sed "s/^../cleartool ci /"
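The grep/sed sketch above breaks on paths containing spaces and silently skips rename lines (recent Git versions may report these as R100 old new in name-status output). Here is a slightly sturdier sketch of the same idea. The function name cc_plan is hypothetical, and it only prints the cleartool commands rather than running them; I am assuming the parent directories get checked out and checked in separately.

```shell
#!/bin/sh
# cc_plan (hypothetical helper): read a file holding the output of
# `git diff --name-status` and print the cleartool commands a delivery
# would need. Nothing is executed; review the plan before running it.
# Assumption: parent directories are checked out/in separately, which
# mkelem, rmname, and mv all require.
cc_plan() {
    diff_file=$1
    tab=$(printf '\t')
    # Pass 1: element-level operations, one per changed path
    while IFS="$tab" read -r status path newpath; do
        case "$status" in
            A)  printf 'cleartool mkelem "%s"\n' "$path" ;;
            M)  printf 'cleartool co "%s"\n' "$path" ;;
            D)  printf 'cleartool rmname "%s"\n' "$path" ;;
            # Renames arrive as R<score>; content changes in a renamed
            # file would still need a separate checkout.
            R*) printf 'cleartool mv "%s" "%s"\n' "$path" "$newpath" ;;
            *)  printf '# unhandled: %s %s\n' "$status" "$path" ;;
        esac
    done < "$diff_file"
    # Pass 2: check in what was created or checked out above
    while IFS="$tab" read -r status path newpath; do
        case "$status" in
            A|M) printf 'cleartool ci "%s"\n' "$path" ;;
        esac
    done < "$diff_file"
}
```

With the diff saved as before (`git diff --name-status before > diff_before`), `cc_plan diff_before` prints the plan, which can be reviewed and then piped to `sh` once it looks right.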
Pristine
Similarly, the pristine state could be checked with ‘git status’ and the ClearCase explorer. However, I’ve found the following commands helpful:
$ find . -type f -writable | grep -v '^\./\.git/' | grep -v 'lost+found' | grep -v 'view\.dat'
$ find . -type f -name '*keep'
$ find . -type d -name '*unloaded'
$ git diff --name-only HEAD
$ git ls-files --others | grep -v '\.git' | grep -v 'lost+found' | grep -v 'view\.dat'
$ cleartool ls -recurse -view -short | grep -v 'lost+found'
$ cleartool lsco -me -recurse -short
$ cleartool ls -recurse | grep '\[hijacked\]'
Or a script which simplifies the above. This may evolve into a full git-clearcase tool if it proves useful:
$ pristine
Checking writable... OK
Checking artifacts... OK
Checking Git status... OK
Checking CC Untrack... OK
Checking CC Checkouts...OK
Checking CC Hijacks... OK
$ pristine --help
usage: pristine [-[waguch]] [ dir... ]
Flags: Check directories for...
  -w  writable files (possibly hijacked)
  -a  artifacts such as *.keep and *.unloaded
  -g  Git status including untracked files
  -u  ClearCase untracked files and directories
  -c  ClearCase checked out files and directories
  -h  ClearCase hijacked files and directories
Note: The flags above are ordered considering speed and likelihood of failure
(-w) to the slower operations (-ch). The ClearCase checks may be slower than
other checks. -w is a reasonable substitute for -h although not technically
the same (a file may be readonly and still hijacked)
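The pristine script itself is not reproduced here. As an illustration only, the two cheapest checks (-w and -a) might be sketched as below. The names check and pristine_quick are hypothetical, the output format merely imitates the transcript above, and the Git and ClearCase checks are left out.

```shell
#!/bin/sh
# check LABEL LIST (hypothetical): report OK when LIST is empty,
# otherwise print the offending paths and fail.
check() {
    if [ -z "$2" ]; then
        printf 'Checking %s... OK\n' "$1"
    else
        printf 'Checking %s... FAILED\n%s\n' "$1" "$2"
        return 1
    fi
}

# pristine_quick [DIR] (hypothetical): a reduced take on the -w and -a
# checks. It returns non-zero if either check finds something.
pristine_quick() {
    dir=${1:-.}
    rc=0
    # Owner-writable files outside .git and lost+found, ignoring view.dat;
    # in a snapshot view these are possible hijacks.
    writable=$(find "$dir" \( -name .git -o -name lost+found \) -prune -o \
        -type f -perm -200 ! -name view.dat -print)
    # Leftovers from ClearCase updates: *.keep and *.unloaded
    artifacts=$(find "$dir" -name .git -prune -o \
        \( -name '*.keep' -o -name '*.unloaded' \) -print)
    check writable "$writable" || rc=1
    check artifacts "$artifacts" || rc=1
    return "$rc"
}
```

Testing the permission bits rather than actual write access keeps the result the same when run as root; as the note above says, a read-only file can still be hijacked, so this is no substitute for the ClearCase checks.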
Happy coding.
Comments
Hi,
Being quite fond of ClearCase, even though it needs a lot of setup adaptation to work nicely, I found this article interesting since distributed development interests me.
This concept works fine if you do not care about history, traceability and similar concepts. I tried the suggested concept and found that Git is not able to distinguish between (1) a rename and (2) the deletion of a file/folder and the addition of another. This “feature”, to my regret, makes the suggested procedure a nightmare for any Configuration Manager in ClearCase, since it will sabotage the content in the ClearCase VOB.
Hi, I’m not sure I follow what you mean by lost history and traceability. Sure, if you commit a dozen times before checking in/delivering to CC (or vice versa), you lose the micro-history, but otherwise, I’m not sure I follow.
I agree, Git makes no distinction between moving and delete-and-recreating content. In both cases it is atomic. I find that “feature” (with or without quotes) excellent. How is that a problem?
Hi,
In ClearCase, every file and folder is an element. Everything that is done with that element is stored. If you move a file or folder, you can still, at any time, trace that move and see differences between branches of that element.
If you make a move using the “git method” you suggest, you change the folder element twice. cleartool rmname removes an element from the folder and cleartool mkelem creates another element, of course with its own history, not in any way connected to its origin. This is a problem e.g. in bigger projects, where newcomers usually need to be able to see what others have done before and are doing in other branches.
What you probably need to do is to find out if something has been moved (git diff-index -M --name-status --cached) and then generate cleartool mv as well as checkout and merge any changes to an element.
If I get the time to write a script for that, I’ll send it over in some way.
Hi (same person?),
I see what you mean. Though, I think ‘my method’ is more sophisticated than the method above. Above, I only show the difference between any changeset in git vs. clearcase. It’s up to the local user to add, delete, move, etc. Git generally DOES KNOW that a file has been atomically moved (rather than independently deleted and recreated with a new name/location), so it should be possible to inform clearcase, albeit a practical nuisance. I have begun writing scripts locally which align git deletions with clearcase unloads, hijacks with modifications, etc., but it’s not robust enough yet to publish. I’ll give some thought to moves as well.
Cheers,
Alex
A python script for importing/exporting to Clearcase along similar lines:
http://github.com/charleso/git-cc/tree/master
Hope it helps.
Trying to find solution for different aspects of the same problem, maybe interesting for you
Thanks for this; I have been using it for a few months now, and it works pretty well for me.
However, I am now working on the new version of my project, and we created a new stream in clearcase for this version (I have no idea if this is the standard way of using clearcase; I am told that CC has no concept of trunk/branches).
I’m wondering if it is possible to make the new CC stream feed into a branch in the same git repository. I’m sure I can create a new repository, but I think it would be extremely useful to be able to switch branches in the same workspace, in case I need to switch back to the old branch to fix a bug or something.
I’m just not sure how to add the new view (in CC) as a new branch of an existing git repository. Any ideas?
Thanks!