Version Control Management
Introduction
Software projects come about because a business needs desktop or web applications to perform its tasks. Many times these tasks are part of a larger business project. Every software project consists of some collection of textual source code or scripts. This textual instruction is either compiled into an executable binary, packaged as a compressed deployment, or placed on a web server for direct interpretation. The instructional materials (code or script) are extremely valuable and change constantly over time. As the development team grows, the source code base can grow rapidly and will never be static again. Across an enterprise, this moving mass of important documents should be managed centrally. If the enterprise consists of .NET and Java development groups, then the repositories should at least be centralized for each of these camps, i.e. the .NET and Java groups each have their own central repository. It is feasible, however, for the .NET and Java groups to use the same source code repository.
Prerequisites or Considerations
There are several prerequisites and considerations for the implementation of source code management. The prerequisites consist of server(s) to host the repository and software to manage access, delivery, and commits. The server(s) will provide hosted access for the development staff via some port listener over the network. Each version control product will have its own detailed requirements for server hosting. The overall requirement should be the implementation of an enterprise version control system.
A written process for the management of source code artifacts will be needed before anything is put in place. A clear understanding of the SCM product is not enough. This process should contain high-level instructions that show how multiple developers will work with their repository and how they will resolve conflicts. This step is critical. It is advised that you follow the successful model of other projects. You can find this information using Google searches. Do your homework; this is very important to the success of your project. Another option is the contractor vehicle. A contractor can be hired for the short-term task of setting up your development environment using various technology stacks. The setup is somewhat time-consuming for developers who are not already highly skilled in the use of these software tools, hence the contractor vehicle.
Planning and Choices
The first steps to planning an enterprise source repository would be to determine the exact number of projects within the enterprise and then estimate the disk space needed to store them all in one centralized repository. Extra space would also be required to account for growth and to host the software product. A good estimate would be 2 times the size of the entire collection of projects. Always be prepared to grow the file system or migrate to another quickly.
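The sizing rule above can be sketched in a few lines of Python. This is only an illustration, not a sizing tool: the function names and the `growth_factor` parameter are hypothetical, and the default of 2.0 simply encodes the "2 times" estimate from the text.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files found under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so the same data is not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def estimate_repository_bytes(project_roots, growth_factor=2.0):
    """Estimate disk space for a central repository.

    growth_factor is an assumption taken from the rule of thumb above:
    allot twice the current size of all projects combined.
    """
    current = sum(directory_size_bytes(root) for root in project_roots)
    return int(current * growth_factor)
```

Calling `estimate_repository_bytes` with a list of project directories returns a byte count that can be compared against the free space on the candidate server. A repository that stores full history may need more than this, so the result is a starting point only.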
Next, determine your needs and choose an enterprise version control system. Consider the versioning model first. There are really three basic models. These are the file system model, lock-modify-unlock model, and copy-modify-merge model. Every source control versioning system employs one of these models.
The file system model is really inefficient because it never records versions. Every change is a revision made directly to the file in the repository or file system. Users’ changes are not merged. Users have to communicate frequently and even pass changes between each other rather than through the so-called source repository itself. The second committer overwrites any changes that had just been checked in. This system should never be mistaken for a true version control system. That limits the decision to one of the two remaining models.
The lock-modify-unlock model is used by some source control systems to keep users from modifying each other’s work. In a perfect world there would be no need for developers to change each other’s files as long as their work was properly managed. In practice it is still almost impossible to assign developers specific files on a project. Under the lock-modify-unlock method, if someone has a file locked or “checked out”, another developer cannot check out the file to make changes. This user must wait for the “locked” user to unlock or “check in” the file before he can lock and modify the file. Visual SourceSafe employs this model.
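The overwrite problem described above can be shown with a toy shared folder. This is a minimal sketch; the `SharedFolder` class, the file name, and the two users are hypothetical and stand in for any plain file share.

```python
class SharedFolder:
    """A plain shared folder: one current copy per file, no history."""

    def __init__(self):
        self.files = {}

    def read(self, name):
        return self.files.get(name, "")

    def write(self, name, content):
        # The new content replaces whatever was there before.
        self.files[name] = content


folder = SharedFolder()
folder.write("report.txt", "line 1\n")

# Both users start from the same copy of the file.
robert_copy = folder.read("report.txt")
susan_copy = folder.read("report.txt")

# Robert saves first, then Susan saves her edit of the old copy.
folder.write("report.txt", robert_copy + "Robert's change\n")
folder.write("report.txt", susan_copy + "Susan's change\n")

final = folder.read("report.txt")
print(final)
```

After the second write, Robert's line is gone and nothing in the folder records that it ever existed, which is why this model does not count as version control.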
This model is more restrictive than the copy-modify-merge model. For example, if Robert checks out file A and forgets to check it in or, worse, loses the file and the hidden file that works with the repository, and Susan needs file A right away, an administrator is needed to unlock the file so that Susan can modify it for a Friday afternoon build. Also, does anyone know what Robert was working on, and do we care? We just overrode his checkout and allowed Susan to modify the file without talking to Robert first. It’s common for one developer to be working on one method while another developer works on a different method in the same file. That should be fine, but under this model it doesn’t work well at all.
The copy-modify-merge model is the best choice for version control of textual source code files. The lock-modify-unlock model is too restrictive in the event that a developer has a file checked out and he’s on vacation or cannot otherwise be contacted. The copy-modify-merge model works better because it operates in reverse: each developer works on a private copy, and the repository is not aware of any new or revised work until the commit is made. The main risk arises when another developer has made huge changes to a file and has not yet committed them, because the eventual merge becomes harder. It makes sense to update frequently, build successfully, and commit often. The key is to never commit work that puts the build at risk. Many development teams use a continuous build process where an isolated build is run every time anyone commits a change to the repository. This build should always be successful, but in the event of a failure, notifications are sent out via email. Communications will then occur, the code will be corrected, and the fix for the broken build committed. This is a very well managed process, and it does require good communications.
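The Robert and Susan scenario can be modeled with a small in-memory repository. This is a sketch of the lock-modify-unlock idea only, not of how Visual SourceSafe is implemented; the class and method names are hypothetical.

```python
class LockedError(Exception):
    """Raised when a file is locked by someone else."""


class LockingRepository:
    """Toy lock-modify-unlock repository kept in memory."""

    def __init__(self):
        self.contents = {}  # file name -> current content
        self.locks = {}     # file name -> user holding the lock

    def checkout(self, name, user):
        holder = self.locks.get(name)
        if holder is not None and holder != user:
            raise LockedError(f"{name} is locked by {holder}")
        self.locks[name] = user
        return self.contents.get(name, "")

    def checkin(self, name, user, content):
        if self.locks.get(name) != user:
            raise LockedError(f"{user} does not hold the lock on {name}")
        self.contents[name] = content
        del self.locks[name]

    def force_unlock(self, name):
        # Administrator action: drop the lock and return the previous
        # holder. Any uncommitted work by that user is simply ignored.
        return self.locks.pop(name, None)


repo = LockingRepository()
repo.checkout("A", "robert")        # Robert locks file A and disappears

try:
    repo.checkout("A", "susan")     # Susan has to wait
    susan_blocked = False
except LockedError:
    susan_blocked = True

previous_holder = repo.force_unlock("A")   # administrator steps in
repo.checkout("A", "susan")
repo.checkin("A", "susan", "Susan's version")
```

The forced unlock lets Susan proceed, but nothing in the repository knows what Robert had changed, which is the weakness the paragraph above points out.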
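The update, merge, and commit cycle can be sketched as follows. This is a simplified model under stated assumptions: a file is represented as a dictionary mapping method names to bodies so that a three-way merge fits in a few lines, and all class and function names are hypothetical. Real tools such as Subversion merge at the level of text lines.

```python
class OutOfDateError(Exception):
    """The working copy is older than the repository head."""


class MergeConflictError(Exception):
    """Both sides changed the same entry in different ways."""


def three_way_merge(base, mine, theirs):
    """Merge two sets of changes made against a common base."""
    merged = {}
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:
            value = m      # same result on both sides
        elif m == b:
            value = t      # only they changed it
        elif t == b:
            value = m      # only I changed it
        else:
            raise MergeConflictError(key)
        if value is not None:  # None means the entry was deleted
            merged[key] = value
    return merged


class Repository:
    def __init__(self, initial):
        self.revisions = [dict(initial)]

    @property
    def head(self):
        return len(self.revisions) - 1

    def commit(self, base_revision, content):
        if base_revision != self.head:
            raise OutOfDateError("update before committing")
        self.revisions.append(dict(content))
        return self.head


class WorkingCopy:
    def __init__(self, repo):
        self.repo = repo
        self.base_revision = repo.head
        self.base = dict(repo.revisions[repo.head])
        self.content = dict(self.base)

    def update(self):
        theirs = self.repo.revisions[self.repo.head]
        self.content = three_way_merge(self.base, self.content, theirs)
        self.base = dict(theirs)
        self.base_revision = self.repo.head

    def commit(self):
        self.base_revision = self.repo.commit(self.base_revision, self.content)
        self.base = dict(self.content)


repo = Repository({"load": "v1", "save": "v1"})
robert, susan = WorkingCopy(repo), WorkingCopy(repo)
robert.content["load"] = "v2 (Robert)"
susan.content["save"] = "v2 (Susan)"

robert.commit()
try:
    susan.commit()          # rejected: Robert committed first
    rejected = False
except OutOfDateError:
    rejected = True
susan.update()              # merges Robert's change into her copy
susan.commit()
```

Neither developer ever locked anything, and the final revision contains both changes. Had both edited the same method differently, `update` would raise `MergeConflictError`, and the developers would need to talk to each other before committing.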
Resources
Resources that are needed for the versioning control system would be for continuous use and maintenance. Be sure that adequate disk space is allotted and disk space is available for growth. Backups of the repository are critical and necessary to the security of the repository in the event of disaster. The backup process is an important resource that’s integral to the safety of the enterprise’s source code base.
Once a version control system is in place, an administrator is needed for the management, backup, and maintenance of the source repository. For single-project instances a developer may serve as the administrator; however, enterprise solutions would require a dedicated administrator. A business could train an existing employee as the repository grows in size and number of projects. That might be an economical option. Many developers have extensive knowledge of the various version control systems because they use them during their development. One could hire a developer/versioning administrator.
Specific Choices
While Git is now becoming more popular than Subversion (SVN) with independent contractors and developers, SVN still stands as the best corporate choice for an open source solution. Here’s an executive summary from a Forrester report on Version Control Systems by Carey Schwaber. The entire document is proprietary and is not available to the public (even though she’s talking about freely available open source products).
“Some development teams need only simple version control, the base functionality of software configuration management (SCM) systems, to manage their application code. In the past many such teams have used the open source version control tool Concurrent Versions System (CVS), but CVS’ file-based architecture saddles it with many serious limitations. Subversion, an open source tool designed as a replacement for CVS, has been in development since 2000 and is now ready for mainstream adoption. Shops evaluating version control tools should put Subversion on their shortlists, and shops using commercial version control systems like Microsoft Visual SourceSafe should consider whether Subversion might meet their needs just as well. The availability of an open source tool without many of CVS’ pains will accelerate commoditization of the market for commercial version control tools. But while Subversion is a better tool than CVS, it’s still just a version control tool, and it shouldn’t be used in place of a full-featured SCM system.” … Carey Schwaber for Forrester
Carey’s last statement was a marketing one. You want to read on, and you can if you pay up. By the way she ends her last sentence, she assumes that you need a full-featured SCM system. An enterprise can implement such a system and still use Subversion for its version control component. Commercial vendors are quick to assume that an enterprise needs a full-featured SCM system. This is simply not true. Vendors of these products want to convince you that their integrated systems blend requirements, issues, versioning, documentation, management, etc., making development the focus and reducing the burden of trivial administrative functions. In fact, manually developed systems work fine too. A project could be completed successfully and never use an SCM tool.
Another document, “The Forrester Wave: Process-Centric Software Configuration Management, Q4 2005”, also by Carey Schwaber, quite possibly paid for by IBM, is freely available to the public. It does not recommend any open source products. In fact, it does not consider any solution that isn’t backed by a company with a minimum of $25 million in total annual revenues. This document recommends an integrated IBM solution using all of their SCM products. Does this make IBM the better choice just because they meet the $25 million annual revenue minimum? Does the revenue minimum inherently represent worth and quality? I know this axiom not to be true.
IBM’s ClearCase is a versioning system just like Subversion; however, IBM has several other products that, when installed together with it, can be considered a full-featured SCM system. The IBM products have high licensing costs while Subversion alone is free. A full-featured SCM system is not needed for agile development methodologies. The IBM “need” is a marketing message: buy their products so that your enterprise can successfully complete a project that would have failed had you not purchased their software. This is wrong. Many successful software development projects use separate, non-integrated applications for version control, bug tracking, and documentation. My personal recommendation for both the .NET and Java groups would be to host Subversion’s version control system on Red Hat Linux server(s), together with Atlassian JIRA for enterprise issue tracking and Atlassian Confluence as the enterprise wiki.
Cost of ownership can direct choices. License costs and administrative costs don’t offset each other. Both proprietary and open source solutions require trained administrators. Proprietary solutions come with steep licensing costs. Open source solutions generally are free; however, the enterprise’s comfort with their security is sometimes lower. This is often only because the enterprise may have only recently considered open source alternatives. Open source is generally secure because it’s very dynamic and the source code is freely available to anyone, including people who hunt for vulnerabilities.
Conclusions
Every IT enterprise creates textual artifacts that range from configuration scripts to computer language code. Every enterprise should maintain versions of these files. To do this the enterprise needs a central repository and versioning software. The most important factors for the deciding enterprise should be model first and cost second. Also, the repository should be accessible to anyone who is authenticated and authorized to look at it. Developers who only maintain code locally essentially hide their work from everyone. Source code should be shared. Even if it’s not open source, it should be shared among similar-language developers within the company for re-use if licensing permits.
I have equal experience with each of the version control systems. The copy-modify-merge model is my favorite and has been the least troublesome. I’m personally a command-line person and that’s why Git is my absolute favorite, Subversion (SVN) second, CVS third, and Visual SourceSafe fourth. I used Borland’s StarTeam for 2 years with CIBER and absolutely hated it. Customizations were of no benefit to our project. Daily use of Borland’s product also required tweaking hidden files, which suggests that the product was unstable even though the license costs should represent high quality.
Copyright Cape Henry Technologies Inc. 2010, All Rights Reserved.