This article describes how to apply UCM with the SVN VCS.
Contents
- Introduction
- About VCS
- About ITS
- About UCM
- About SVN
- How to perform UCM using SVN
- Projects
- Streams
- Components
- Issue Tracking
Introduction
Proper change tracking and version control are vital for the success of any software project. This article describes the UCM approach to change control and how to implement it using SVN.
About VCS
Version Control Systems are vital for the success of any software project. Programmers must be able to experiment with new ideas without impacting other team members and general development. Changes should be thoroughly tested before they are delivered to a common code base. It must be possible to see when and by whom each change to the software was introduced.
More elaborate environments require even more. Different variants of a product must be developed independently of each other. For instance, while one subteam is preparing the current code for a release and polishing away the last known bugs, another team continues development for a new version at the same time. Changes by the development team shouldn't affect the release team.
A relatively new approach requires that different variants of a product are made up of different components, which are tracked individually by the version control system. This is component-based version control.
About ITS
Issue Tracking Systems, also known as Bugtrackers or Change Request Managers (there are slight differences between these terms which are out of scope of this document), allow developers and other project participants, like project managers, trackers, testers and quality assurance, to track the state of issues such as open bugs or requested features.
About UCM
Unified Change Management is a relatively young approach to version control. Its main attributes are:
- Stream-oriented development and integration
- Activities
- Component oriented version control
- Close coupling with an issue tracking system
UCM gained much popularity in the context of IBM Rational ClearCase UCM, which is an extension of ClearCase to support the UCM approach of version control.
About SVN
Subversion is a lightweight VCS meant to replace CVS. CVS was the most popular VCS in open source development for a very long time. CVS is very easy to handle and, for a version control system, easy to understand. CVS is also extremely fast, in LAN environments as well as in WAN environments, even over the Internet.
As a replacement for CVS, SVN is meant to provide all positive features of CVS as well as several new features. Apart from being more modern in supporting versioning of directories and file attributes, SVN provides three main advantages over CVS, which also clearly distinguish SVN from most other VCS:
- Versions are applied to the repository, not single files or directories.
- This makes it extremely easy to identify all matching files and directories for a specific version of a particular file.
- SVN has cheap copies.
- The cheap copy mechanism replaces previous mechanisms for branching, tagging or labelling. A cheap copy is a copy of a whole directory tree made inside the repository. Branches, tags and labels are implemented using these cheap copies. They are called cheap because creating one is an operation with very little overhead. A cheap copy is nothing more than a record saying "I'm a copy of path/to/foo revision n". This also means that in SVN, branches, labels and tags are merely directories. This leaves projects a lot of flexibility in how they organize version management; they can adapt it to their needs.
- svn:externals
- svn:externals is a property which allows linking to other repositories or to other locations in the same repository.
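The first two points can be observed on a scratch repository. The following is a minimal sketch, assuming the svn and svnadmin command line tools are installed; the temporary directory, the file name foo.txt and the tag name 0.1 are made up for illustration.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical location).
work=$(mktemp -d)
svnadmin create "$work/repo"
repo="file://$work/repo"

# Revision 1: create trunk/ directly in the repository.
svn mkdir -q -m "Create trunk." "$repo/trunk"

# Revision 2: add one file through a working copy.
svn checkout -q "$repo/trunk" "$work/wc"
echo "hello" > "$work/wc/foo.txt"
svn add -q "$work/wc/foo.txt"
svn commit -q -m "Add foo.txt." "$work/wc"

# Revision 3: a cheap copy of the whole tree, used here as a tag.
svn copy -q --parents -m "Tag 0.1." "$repo/trunk" "$repo/tags/0.1"

# The log records the copy only as a reference to its source path and revision.
copy_info=$(svn log -v -r 3 "$repo" | grep "from /trunk")
echo "$copy_info"
```

Every commit, including the copy, advances the revision number of the whole repository, and the tag is an ordinary directory that remembers where it was copied from.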
How UCM works in general
First of all, development is no longer divided into a main branch plus lots of changes to it; instead, changes are grouped in multiple ways. All changes that together make up a single feature, bug fix or similar unit are grouped as an "activity". An activity appears as an item in the issue tracker; that is, each activity is associated with an issue tracker item and vice versa. Activities are performed on a development stream, which is a copy of the integration stream, and then "delivered" (merged) into that integration stream.
Original UCM concepts
This subchapter describes the original UCM concepts as they were applied in ClearCase UCM. The goal is to be able to apply UCM with any version control system, particularly Subversion. For that, it is necessary to understand the intentions of UCM and its concepts, and why it is implemented in ClearCase the way it is.
ClearCase limitations
ClearCase is a very old-fashioned version control system. Basically ClearCase is RCS with improved multi-user and directory support, a network file system (as if there weren't already enough of them) and a database backend. The elementary concepts have stayed the same over time. That means:
- Revisions are done on a per-element basis.
- There is no low-level support for atomic changesets.
- Conflicts are prevented with locking (reserved checkout).
ClearCase UCM is built on top of ClearCase. ClearCase UCM has to take into account these limitations and live with them. ClearCase UCM has to hide these limitations.
ClearCase UCM glossary
- Activity
- An activity is a group of changes that belong together. Purpose:
- They link changes together, putting single checkout / checkin operations into the context of a group called Activity.
- They link changes with an issue tracker.
- Baseline
- A baseline is a consistent version state of a component. Purpose:
- Keep those revisions of files and directories together which belong together.
- Component
- A component is a group of files and directories which are manageable as a single unit to be baselined. Purpose:
- Keep those files and directories together which belong together.
- Composite baseline
- A composite baseline is the combination of multiple baselines from several components. Usually, a project has a composite baseline which sums up the baselines for all participating components. Purpose:
- Keep those revisions of components in a project together which belong together.
- Delivery
- Delivery is the operation of merging changes from a child Stream to a parent Stream (usually the Integration Stream). Purpose:
- Perform merges on groups of files and directories instead of single files and directories.
- Keep changesets together.
- Integration Stream
- An Integration Stream is a Stream with child streams. For a Project, at least one Stream exists, which is its Integration Stream. Purpose:
- Have a branch on which all changes that are mature enough are available.
- Project
- A Project is a configuration of an Integration Stream and participating components in specific baselines. A project is configured by a composite baseline. Purpose:
- Prevent conflicts between different development goals of different projects.
- Allow the selection of components from a pool.
- Rebase
- Rebase is the operation of copying changes from a parent Stream (usually the Integration Stream) to a child Stream. Purpose:
- Perform merges on groups of files and directories instead of single files and directories.
- Keep changesets together.
- Stream
- A Stream is an isolated branch of development. Unless it is an integration Stream, a Stream always has a parent. Activities will only affect the stream for which they were performed. Streams are actively synchronized with each other by explicit Delivery and Rebase operations. Purpose:
- Allow changes under version control but in a controlled manner.
- Keep changesets together.
Putting UCM on a more abstract level
Many of the concepts which ClearCase UCM puts on top of ClearCase are only explicitly necessary to cope with the limitations of ClearCase. To be able to apply UCM with any version control system, it is necessary to understand the intentions of UCM and its concepts, and why it is implemented in ClearCase the way it is.
Goals on an abstract level:
- Developers shall be able to perform changes without interfering with other developers.
- Changes shall be controlled and tracked.
- Projects shall be set up in a way that they are built on reusable components.
- The reusable components shall be set up in a way that allows changing them without interfering with other projects.
Simple mapping between ClearCase UCM and Subversion
For most of what is done with ClearCase UCM, the following simple mapping may already be sufficient.
Goal | ClearCase UCM | Subversion |
---|---|---|
Atomic changesets (small changes) | Activity | commit |
Atomic changesets (large changes) | Activity | directory copy, merge |
Consistent revisions | Baseline | implicit (path + repository revision) |
Reusable file group | Component | directory copy |
Consistent project version | Composite baseline | implicit (path + repository revision) |
Integrate developer work (small change) | Delivery | commit |
Integrate developer work (large change) | Delivery | merge --reintegrate |
Mature main branch | Integration Stream | implicit (e.g. policy for trunk) |
Consistent overall setup | Project | directory |
Up-to-date developer copy (small change) | Rebase | update |
Up-to-date developer copy (large change) | Rebase | merge |
Create place for independent developer work (small change) | create Stream | checkout |
Create place for independent developer work (large change) | create Stream | directory copy |
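The "implicit (path + repository revision)" entries in the table can be illustrated as follows. This is a minimal sketch on a scratch repository, assuming svn and svnadmin are installed; the file name README and its contents are made up. A baseline is simply addressed as a path at a revision, without any extra labelling step.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical location).
work=$(mktemp -d)
svnadmin create "$work/repo"
repo="file://$work/repo"

svn mkdir -q -m "Create trunk." "$repo/trunk"              # revision 1
svn checkout -q "$repo/trunk" "$work/wc"
echo "version one" > "$work/wc/README"
svn add -q "$work/wc/README"
svn commit -q -m "Add README." "$work/wc"                   # revision 2
echo "version two, revised" > "$work/wc/README"
svn commit -q -m "Update README." "$work/wc"                # revision 3

# "trunk at revision 2" acts as a baseline: path plus repository revision.
baseline=$(svn cat "$repo/trunk/README@2")
latest=$(svn cat "$repo/trunk/README")
echo "baseline: $baseline"
echo "latest:   $latest"

# The whole tree can be checked out in exactly that state.
svn checkout -q -r 2 "$repo/trunk" "$work/baseline-wc"
```

If a baseline should carry a name, a cheap copy into tags/ gives it one, but the pair of path and revision already identifies the state unambiguously.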
Just as it has no concept of tags, labels or branches, Subversion has no concept of projects, components, streams and activities. Actually, that's good news. The concept of Subversion is to work with cheap copies. Originally, cheap copies were designed to be the Subversion approach to tagging and branching. Interestingly, the more abstract approach of cheap copies is not only a simple yet superior way to model tags, labels and branches. It is also a simple way to model projects, components, streams and activities.
At this point, let me also have a word on whether you should have one large repository or multiple small repositories. Subversion is significantly faster than ClearCase. And because commits are path-based, it can use path-based transaction isolation. That means multiple processes can safely work on the same repository at the same time. They will only delay each other if they affect the same path.
Also, projects like KDE show that it's possible to use Subversion for very large development projects with large development teams. At the time of this writing, the KDE Subversion repository was at revision 969346.
How to perform UCM using ClearCase
As already mentioned above, there's a special version of ClearCase called ClearCase UCM. When applying UCM with ClearCase, you'll notice the following things:
- Working with ClearCase when the server is offline is near impossible.
- Dynamic views don't work with the server being offline.
- Snapshot views work with the server being offline, but this is very limited. Even restoring a file to its original version requires server access.
- VOB symbolic links don't work properly with snapshot views.
- Over WANs / Internet and VPNs, ClearCase is awfully slow.
- When you're working on your own stream anyway, you'll begin to ask yourself why this obsolete checkin / checkout locking mechanism is still required.
- ClearCase will create new file revisions of all files you've touched, even if you don't actively create new versions. A file, once touched, will grow new versions for every rebase you perform.
- Activities are somehow lost once they're integrated. They are visible on the development stream but they don't become visible on the integration stream. Instead, Deliveries and Rebases appear as separate activities. The faster the development process, the higher the percentage of these artificial pseudo-activities compared to real development activities.
How to perform UCM using SVN
Most of the UCM concepts can be provided by directories and cheap copies in SVN: projects, components, streams and, if you want to, activities. That means the implementation of UCM on top of Subversion is very flexible. It requires only a little discipline, so no additional scripts should be required. Also, Subversion is much more likely to forgive errors or mistakes in the setup. And it's easy to restructure the repository afterwards, so you can even turn a non-UCM repository into a UCM repository.
The flexibility of Subversion also means that the following is just one out of many possible ways to implement UCM on top of Subversion.
Original Subversion approach: TTB
/
trunk/
tags/
branches/
This is the original Subversion approach, also known as TTB structure. It already has a lot in common with UCM. Small development activities are separated by commits, large development activities can be performed on separate branches.
Single project / component approach
/
trunk/
tags/
branches/
streams/ (← This is new compared to the classical approach in Subversion)
This is a slightly modified approach, which I call TTBS (you guessed ;-). It separates streams / developer branches from (release) branches.
Distinct Multi-Project approach
/
- projectname
trunk/
tags/
branches/
streams/ (← This is new compared to the classical approach in Subversion)
- projectname
Often, multiple projects shall be supported in the same repository. For that, it is a good idea (and common practice) to have the projects on top level and the TTB(S) structure below that.
If your component model works on mature objects only, which is the case for many Java projects, this is enough: If one project depends on the other, it waits for a release (.jar, .war, .ear or so).
However, in some development environments, it takes days or even weeks for a project to make a release for just one minor change. In such an environment, if project A depends on project B, project A might be better off with a copy of project B. That's the next approach.
Establish Reuse
Reuse means having something available which was created elsewhere. For reuse, Subversion offers two possibilities, both of which work fine on directories (and single files, if wanted):
- cheap copies
- svn:externals
If you want to reuse something in an unchanged form, the best way of doing that is using svn:externals. If you want to change it, the best way of doing that is using cheap copies (svn cp).
Separate component / project approach
/
projects/
- projectname
trunk/
tags/
branches/
streams/
- projectname
components/
- componentname
trunk/
tags/
branches/
streams/
- componentname
A component can be made visible anywhere within a project, either as cheap copy or as external.
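Both ways of making a component visible can be sketched on a scratch repository. The example assumes svn and svnadmin are installed; the component name logging, the project names foo and bar, and the file log.txt are hypothetical, and the layout is reduced to the trunk directories for brevity.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical location).
work=$(mktemp -d)
svnadmin create "$work/repo"
repo="file://$work/repo"

# One component and two projects, trunk only.
svn mkdir -q --parents -m "Create component logging." "$repo/components/logging/trunk"
svn mkdir -q --parents -m "Create project foo." "$repo/projects/foo/trunk"
svn mkdir -q --parents -m "Create project bar." "$repo/projects/bar/trunk"

# Give the component some content.
svn checkout -q "$repo/components/logging/trunk" "$work/logging"
echo "component code" > "$work/logging/log.txt"
svn add -q "$work/logging/log.txt"
svn commit -q -m "Add component content." "$work/logging"

# Project foo reuses the component unchanged: link it with svn:externals.
# "^/" means "relative to the repository root" (Subversion 1.5 or later).
svn checkout -q "$repo/projects/foo/trunk" "$work/foo"
svn propset -q svn:externals "^/components/logging/trunk logging" "$work/foo"
svn commit -q -m "Link logging component as external." "$work/foo"
svn update -q "$work/foo"    # fetches the external into foo/logging

# Project bar wants to change the component: take a cheap copy instead.
svn copy -q -m "Copy logging component into bar." \
    "$repo/components/logging/trunk" "$repo/projects/bar/trunk/logging"
svn checkout -q "$repo/projects/bar/trunk" "$work/bar"
```

In foo, the logging directory keeps following the component's trunk on every update. In bar, the copy is an independent line of development that can later be synchronized with the component by merging.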
How to apply UCM to SVN - The Commands
Creating a Project or Component
As a project or component is just a directory, it's created like trunk: Create the directory, if you start from scratch, or import something that you already have.
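Both variants might look like the following minimal sketch, which assumes svn and svnadmin are installed and uses a scratch repository. The names projectFoo and legacy, as well as the file main.c, are made up for illustration.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical location).
work=$(mktemp -d)
svnadmin create "$work/repo"
repo="file://$work/repo"

# Variant 1: start from scratch. Build the TTBS skeleton in a working
# copy so that the whole project appears in a single commit.
svn checkout -q "$repo" "$work/wc"
cd "$work/wc"
svn mkdir -q --parents \
    projects/projectFoo/trunk \
    projects/projectFoo/tags \
    projects/projectFoo/branches \
    projects/projectFoo/streams
svn commit -q -m "Create project projectFoo with TTBS structure."

# Variant 2: import something that already exists as a new component.
mkdir -p "$work/legacy"
echo "int main(void) { return 0; }" > "$work/legacy/main.c"
svn import -q -m "Import existing code as component legacy." \
    "$work/legacy" "$repo/components/legacy/trunk"

# Show the result.
svn list "$repo/projects/projectFoo"
svn list "$repo/components/legacy/trunk"
```

The import creates missing parent directories in the repository on its own, so only the remaining TTBS directories of the component would still have to be added.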
Creating a Stream
To create a Stream, create a copy from what you want to create a stream for, and place that copy in streams/.
Synopsis:
svn cp parentPath streams/streamName
Example stream from trunk:
svn cp -m "Creating Stream barBuzz." http://svn.mynet.local/projectFoo/trunk http://svn.mynet.local/projectFoo/streams/barBuzz
Rebasing a Stream
Starting with Subversion 1.5, rebasing a stream becomes very simple. Use the svn merge command for that.
Synopsis:
svn merge parentPath
Example:
cd barBuzz; svn merge http://svn.mynet.local/projectFoo/trunk
Delivering a Stream
First you rebase your stream to make sure you deliver a consistent set. Then you go to trunk and deliver using the svn merge --reintegrate command.
Synopsis:
cd parentPath ; svn merge --reintegrate streams/streamName
Example:
cd ../../trunk ; svn merge --reintegrate http://svn.mynet.local/projectFoo/streams/barBuzz
Once the delivery merge is done, check the delivery and if everything is okay, complete it by performing a commit.
As soon as the delivery is done, the stream should be removed. It is neither possible nor desired from a Subversion point of view to reuse a stream once a delivery is done. If you want to continue working on that stream after the delivery, like some developers do in ClearCase UCM, just recreate the stream after deleting it.
Removing a Stream
To remove a stream, simply delete the stream like a normal directory. Removing a stream will not reduce the server load; it just makes the stream invisible, nothing more.
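The four stream operations can be tried end to end on a scratch repository. The following is a sketch under stated assumptions: svn and svnadmin are installed, and the repository location and the file names are made up, while the stream name barBuzz is taken from the examples above. Note that since Subversion 1.8 a plain svn merge detects the delivery case automatically and the --reintegrate option is deprecated, although it is still accepted.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (hypothetical location).
work=$(mktemp -d)
svnadmin create "$work/repo"
repo="file://$work/repo"

svn mkdir -q -m "Create trunk." "$repo/trunk"
svn mkdir -q -m "Create streams." "$repo/streams"
svn checkout -q "$repo/trunk" "$work/trunk"
echo "base" > "$work/trunk/app.txt"
svn add -q "$work/trunk/app.txt"
svn commit -q -m "Initial content." "$work/trunk"

# Create the stream as a cheap copy of trunk and check it out.
svn copy -q -m "Creating Stream barBuzz." "$repo/trunk" "$repo/streams/barBuzz"
svn checkout -q "$repo/streams/barBuzz" "$work/barBuzz"

# An activity on the stream.
echo "feature" > "$work/barBuzz/feature.txt"
svn add -q "$work/barBuzz/feature.txt"
svn commit -q -m "Activity: add feature." "$work/barBuzz"

# Meanwhile, somebody else changes trunk.
echo "other" > "$work/trunk/other.txt"
svn add -q "$work/trunk/other.txt"
svn commit -q -m "Unrelated change on trunk." "$work/trunk"

# Rebase: merge trunk into the stream, then commit the result.
cd "$work/barBuzz"
svn update -q
svn merge -q "$repo/trunk"
svn commit -q -m "Rebase barBuzz from trunk."

# Deliver: merge the stream back into trunk, then commit the result.
cd "$work/trunk"
svn update -q
svn merge -q --reintegrate "$repo/streams/barBuzz"
svn commit -q -m "Deliver barBuzz to trunk."

# Remove the delivered stream; its history stays in the repository.
svn delete -q -m "Remove delivered stream barBuzz." "$repo/streams/barBuzz"
```

The svn update before each merge brings the working copy to a single revision, which newer Subversion versions require before merging.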
Example
This chapter describes an example of how development could work, and it also shows how a Subversion repository can be transformed step by step to support the different parts of UCM as needed.
The beginning is an application called "JEduca", an application for creating and running tests based on multiple choice questions. The developers are Alice, Bob and Charly.
Alice creates a repository for them:
svnadmin create /home/alice/svnroot
The developers who begin to work on JEduca execute:
mkdir ~/svn
cd ~/svn
svn checkout file:///home/alice/svnroot JEduca
Whenever they want to access changes that the others made, they perform a rebase:
svn up
Whenever they have finished an activity, they perform a delivery:
svn commit -m "Activity description."
After a few days, Alice and Bob want to create the first release, while Charly wants to continue working on new features. They decide it's time for a branch. Prior to branching, they have to make a small change to the repository to make it ready for branches. Bob executes:
files=$(echo *)
svn mkdir trunk tags branches
svn mv $files trunk/
svn commit -m "Created TTB structure."
The existing sandboxes should now be relocated to trunk, because the content they track has effectively moved into trunk/.
Now that the repository is ready for branches, Bob creates the branch for the 0.1 release:
svn cp -m "Create release branch for 0.1." file:///home/alice/svnroot/trunk file:///home/alice/svnroot/branches/0.1
The story continues with the next update.