If you are a developer or work in testing, you will have to work with a source code repository, also referred to as a version control system (VCS). You might have used CVS, SVN, Perforce and so on. In this post, I will walk you through the foundations of version control systems, which will help us understand the key benefits and drawbacks of traditional version control systems.
How traditional VCS works
Traditional VCSs have a centralized server, which acts as the source code repository. In addition to storing the code, these tools maintain the revision history for each commit.
In the diagram below, we have three files, A, B and C, and project changes are committed in four phases: the initial commit, then the second, third and fourth commits.
When files A and C change in the second commit, only the differences, or deltas, against the initial files are stored. Similarly, when B and C change in the third commit, only the deltas for B and C are stored as part of that commit.
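To make the idea of a delta commit concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular VCS stores its data: the function name `make_delta_commit` and the file contents are made up, and the delta is simply a unified diff produced by the standard `difflib` module.

```python
import difflib


def make_delta_commit(previous, current):
    """Return {filename: unified diff lines} for the files whose content changed.

    Simplification: deleted files are ignored, and a new file is diffed
    against empty content.
    """
    delta = {}
    for name in sorted(current):
        old = previous.get(name, "")
        new = current[name]
        if old != new:
            delta[name] = list(difflib.unified_diff(
                old.splitlines(), new.splitlines(),
                fromfile=name, tofile=name, lineterm=""))
    return delta


# Hypothetical contents for the three files from the example.
initial = {"A": "a1", "B": "b1", "C": "c1"}
second = {"A": "a1\na2", "B": "b1", "C": "c2"}
third = {"A": "a1\na2", "B": "b2", "C": "c3"}

commit2 = make_delta_commit(initial, second)
commit3 = make_delta_commit(second, third)

print(sorted(commit2))  # ['A', 'C']
print(sorted(commit3))  # ['B', 'C']
```

Note that file B does not appear in the second commit at all, and file A does not appear in the third: an unchanged file leaves no trace in the commit.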
Notice that a single commit never gives us the current state of the entire code base: all it holds is the delta since the last time each file was committed.
For example, when we compare two commits, we get only the files that changed, and for each of them we can see the actual changes made, which we refer to as the delta.
You can certainly ask for a previous version of a file, but the whole workspace as it was a few commits ago is not stored anywhere as a unit. To see how the workspace looked at a given commit, you have to take the base version and patch it with all the commits made up to the desired one. The revisions that are readily available are therefore the latest revision and the base revision. In short, the VCS stores only the differences.
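The "take the base version and patch it" step can be sketched as follows. This is a simplified model under stated assumptions, not the storage format of CVS, SVN or Perforce: the helper names (`diff_lines`, `apply_delta`, `make_commit`, `checkout`) and the file contents are hypothetical, every file is assumed to exist in the base version, and deltas are computed with Python's `difflib.SequenceMatcher`.

```python
import difflib


def diff_lines(old, new):
    """Delta from old to new: (start, end, replacement lines) for each changed region."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_delta(old, delta):
    """Rebuild the new version of a file from the old version plus its delta."""
    result, pos = [], 0
    for start, end, replacement in delta:
        result.extend(old[pos:start])   # untouched lines before the change
        result.extend(replacement)      # lines introduced by the change
        pos = end                       # skip the lines that were replaced
    result.extend(old[pos:])            # untouched tail
    return result


def make_commit(previous, current):
    """A commit holds deltas only for the files that changed."""
    return {name: diff_lines(previous[name], current[name])
            for name in current if previous[name] != current[name]}


def checkout(base, commits, upto):
    """Reconstruct the workspace after the first `upto` commits by replaying deltas."""
    workspace = {name: list(lines) for name, lines in base.items()}
    for commit in commits[:upto]:
        for name, delta in commit.items():
            workspace[name] = apply_delta(workspace[name], delta)
    return workspace


# Hypothetical history: the initial commit, then the second and third commits.
snapshots = [
    {"A": ["a1"], "B": ["b1"], "C": ["c1"]},
    {"A": ["a1", "a2"], "B": ["b1"], "C": ["c2"]},
    {"A": ["a1", "a2"], "B": ["b2"], "C": ["c2", "c3"]},
]
base = snapshots[0]
commits = [make_commit(snapshots[i], snapshots[i + 1])
           for i in range(len(snapshots) - 1)]

workspace_at_third = checkout(base, commits, 2)
print(workspace_at_third == snapshots[2])  # True
```

The cost of `checkout` grows with the number of commits that have to be replayed, which is why the base and the latest revision are the cheap ones in this model.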
Advantages of Traditional VCS
- All the source code is safely stored in a secure place on a centralized server.
- If the source code on a local machine is lost due to a system or hard disk crash, it can be restored by fetching the code from the VCS.
- Authentication and authorization can be put in place on the VCS.
- The VCS manages the versioning and will not allow a commit if there is a conflict.
- It maintains a commit log recording the information developers provide with each commit.
Let's also go through some of the drawbacks of this approach.
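The conflict check and the commit log from the list above can be sketched roughly as follows. This is a toy model, assuming the server rejects any commit that was not based on the file's latest revision; the `CentralRepo` class and its methods are hypothetical and do not correspond to the API of any real tool.

```python
class ConflictError(Exception):
    """Raised when a commit is based on an out-of-date revision."""


class CentralRepo:
    def __init__(self):
        self.revision = 0   # repository-wide revision counter
        self.files = {}     # name -> (revision of last change, content)
        self.log = []       # (revision, author, file name, message)

    def checkout(self, name):
        """Return (revision, content) of the latest version of a file."""
        return self.files.get(name, (0, ""))

    def commit(self, author, name, base_revision, content, message):
        """Accept the change only if it was made on top of the latest revision."""
        current_revision, _ = self.files.get(name, (0, ""))
        if base_revision != current_revision:
            raise ConflictError(
                f"{name}: based on r{base_revision}, but server has r{current_revision}")
        self.revision += 1
        self.files[name] = (self.revision, content)
        self.log.append((self.revision, author, name, message))
        return self.revision


repo = CentralRepo()
repo.commit("alice", "A", 0, "v1", "initial commit")

# Both developers start from the same revision of file A.
alice_base, _ = repo.checkout("A")
bob_base, _ = repo.checkout("A")

repo.commit("alice", "A", alice_base, "v2 by alice", "extend A")
try:
    repo.commit("bob", "A", bob_base, "v2 by bob", "tweak A")
    bob_rejected = False
except ConflictError:
    bob_rejected = True   # Bob has to update his working copy first

print(bob_rejected)  # True
```

Both the check and the log live on the server in this sketch, which matches the centralized model described in this post.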
Disadvantages of Traditional VCS
Since the VCS resides on the server and does all the heavy lifting (hosting the code, versioning the files, maintaining the commit log), the client can only fetch the latest copy of the code, work on it, and commit the changes back to the VCS. The code on the client machine is merely a working copy, not the entire repository. Let's look at the major drawbacks of this approach with the help of some examples.
- Assume that five developers work with a VCS, all building features and committing their changes to the repository. Unfortunately, the VCS server crashes and there is no backup. We are left with restoring the VCS from the last stable commit. Since the revision history and commit log are maintained only on the VCS server, and the code base on each developer's machine is just a plain working copy, there is no way to confidently bring the VCS back to its last committed state.
- A more common scenario is the server being unreachable. Almost all the operations we perform, such as checking file diffs, committing and merging, work only when the VCS is up. If the VCS is temporarily down, for example because of a network outage or server maintenance, all users accessing it are blocked. Even simple operations such as comparing a file with its previous version cannot be done.
- Another drawback of the centralized repository has to do with branching. When we work on a project, we have a trunk or master branch, which is considered the source of truth and is integrated with build tools like Jenkins. For development, we create a branch from the trunk, work and test on the branch, and finally merge the changes back to the trunk, from where a tag is created and pushed for deployment. Notice that the branch is created on the server. Also, if two developers want to collaborate on a feature without impacting others' code, they have to resort to working on a branch, which again lives on the server. Hence, over time, the server becomes bloated with many branches.
- Another drawback is scalability. Imagine working on an open source project where thousands of developers contribute and many thousands of branches are created. Merging the changes and managing the branches becomes a nightmare and is impractical with traditional version control systems.
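The first two drawbacks in the list above come from the same design choice: the history lives only on the server. A small sketch, assuming a thin client that keeps just the latest content of each file, shows both effects. The `Server` and `WorkingCopy` classes are hypothetical and only model the behaviour described in this post.

```python
class ServerUnavailable(Exception):
    """Raised when the central server cannot be reached."""


class Server:
    def __init__(self):
        self.online = True
        self.history = {}   # name -> list of contents, oldest first

    def _require_online(self):
        if not self.online:
            raise ServerUnavailable("VCS server is unreachable")

    def commit(self, name, content):
        self._require_online()
        self.history.setdefault(name, []).append(content)

    def get(self, name, index=-1):
        """Return one stored version of a file; -1 is the latest, -2 the one before."""
        self._require_online()
        return self.history[name][index]


class WorkingCopy:
    def __init__(self, server):
        self.server = server
        self.files = {}     # latest content only, no history

    def update(self, name):
        self.files[name] = self.server.get(name)

    def differs_from_previous(self, name):
        """Even a simple comparison needs a round trip to the server."""
        previous = self.server.get(name, -2)
        return previous != self.files[name]


server = Server()
server.commit("A", "v1")
server.commit("A", "v2")

wc = WorkingCopy(server)
wc.update("A")
changed_while_online = wc.differs_from_previous("A")

server.online = False   # network outage or maintenance window
try:
    wc.differs_from_previous("A")
    blocked = False
except ServerUnavailable:
    blocked = True

print(changed_while_online, blocked)  # True True
print(wc.files)                       # {'A': 'v2'}
```

The working copy ends up holding only `v2`, so if the server's history were lost, the earlier revisions could not be rebuilt from any client.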
Centralized version control tools play an important role in software development and continue to do so. However, as a project grows, or for highly available and critical projects, they might not be the best solution, as there are other options that might be better. Understanding the drawbacks of a centralized VCS helps us identify the factors that drive us to explore other options, such as distributed version control systems like Git and Mercurial.
The above article was contributed by one of this blog's readers, Pradeep Kumar (@pradeepkumarl). He is a software developer with more than 10 years of experience and has worked with various version control tools such as SVN, Perforce, ClearCase and Git. He is passionate about technologies and loves to teach them. You can check out one of his online courses on Git – Novice to Expert.
Happy Learning!!