How Does a Version Control System Work?

If you are a developer or a tester, you have almost certainly worked with a source code repository, also referred to as a version control system (VCS). You may have used CVS, SVN, or Perforce.

In this post, I will walk you through the foundations of traditional version control systems, which will help us understand their key benefits and drawbacks.

1. How Does a Traditional VCS Work?

Traditional VCSs have a centralized server, which acts as a source code repository. In addition to storing the code, these tools maintain the revision history for each commit.

In the diagram below, we have three files, A, B, and C, and project changes are committed in four phases: the initial commit followed by the second, third, and fourth commits.

Version Control System - Delta Changes

When files A and C change in the second commit, only the differences (the deltas) against the initial versions are stored. Similarly, when files B and C change in the third commit, only the deltas for B and C are stored as part of that commit.

Notice that at any given point in time, if we want the current state of the entire code base, we cannot read it directly from a commit, since all a commit holds is the delta from the last time each file was committed.

For example, when we compare two commits, we only get the files that have changed, along with the actual changes made to those files, which we refer to as the delta.

We can certainly ask for a previous version of a file, but we cannot directly ask for the whole workspace as it was a few commits ago.

If we want to see how the workspace looked at a given commit, we have to take the base version and patch it with all the commits made up to the desired commit.

Hence, without replaying deltas, we are restricted to getting either the latest revision or the base revision, because the VCS stores only the differences.
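
The replay described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how CVS, SVN, or Perforce actually store data: the names `make_delta`, `apply_delta`, and `DeltaRepository` are hypothetical, files are modeled as lists of lines, and deltas are computed with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Return only the changed regions needed to turn `old` into `new`."""
    ops = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    # Unchanged ("equal") regions are not stored at all.
    return [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def apply_delta(old, delta):
    """Patch `old` with a delta produced by make_delta."""
    result = list(old)
    # Apply from the end of the file so earlier indices stay valid.
    for i1, i2, replacement in reversed(delta):
        result[i1:i2] = replacement
    return result


class DeltaRepository:
    """Hypothetical store: one full base version plus per-commit deltas."""

    def __init__(self, base):
        self.base = {name: list(lines) for name, lines in base.items()}
        self.commits = []  # each commit maps filename -> delta

    def commit(self, changed):
        head = self.checkout(len(self.commits))
        self.commits.append(
            {name: make_delta(head[name], lines) for name, lines in changed.items()}
        )

    def checkout(self, revision):
        """Rebuild the workspace by replaying the first `revision` deltas on the base."""
        workspace = {name: list(lines) for name, lines in self.base.items()}
        for commit in self.commits[:revision]:
            for name, delta in commit.items():
                workspace[name] = apply_delta(workspace[name], delta)
        return workspace


repo = DeltaRepository({"A": ["a1"], "B": ["b1"], "C": ["c1"]})  # initial commit
repo.commit({"A": ["a1", "a2"], "C": ["c2"]})                    # second commit: A and C
repo.commit({"B": ["b1", "b2"], "C": ["c2", "c3"]})              # third commit: B and C

print(sorted(repo.commits[1]))  # ['B', 'C'] : only changed files are stored
print(repo.checkout(1))         # {'A': ['a1', 'a2'], 'B': ['b1'], 'C': ['c2']}
```

Note that `checkout` has to walk every commit from the base up to the requested revision, which is the cost of storing only the differences.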

2. Advantages of Traditional VCS

  1. All the source code is safely stored in a secure place on a centralized server.
  2. If the source code on the local machine is lost due to a system or hard disk crash, it can be restored by checking the code out from the VCS again.
  3. Authentication and authorization can be put in place on the VCS.
  4. The VCS takes care of managing the versioning, and will not allow a commit if it conflicts with changes already committed.
  5. The VCS maintains a commit log that records the information developers supply with each commit.
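
To make the conflict check and the commit log from the list above more concrete, here is a minimal sketch. It assumes one simple rule, namely that a commit is rejected when a file has changed on the server since the revision the developer last checked out. Real tools apply more nuanced rules, and the names `RevisionServer` and `ConflictError` are hypothetical.

```python
class ConflictError(Exception):
    """Raised when a commit is based on an outdated revision of a file."""


class RevisionServer:
    """Hypothetical central server that versions files and keeps a commit log."""

    def __init__(self):
        self.revision = 0
        self.files = {}  # filename -> (content, revision in which it last changed)
        self.log = []    # (revision, author, message)

    def checkout(self):
        """Return the current revision number and the latest content of every file."""
        latest = {name: content for name, (content, _) in self.files.items()}
        return self.revision, latest

    def commit(self, author, message, base_revision, changes):
        # Reject the commit if any file changed after the developer's checkout.
        for name in changes:
            _, last_changed = self.files.get(name, (None, 0))
            if last_changed > base_revision:
                raise ConflictError(f"{name} changed in r{last_changed}; update first")
        self.revision += 1
        for name, content in changes.items():
            self.files[name] = (content, self.revision)
        self.log.append((self.revision, author, message))
        return self.revision


server = RevisionServer()
server.commit("alice", "initial commit", 0, {"A": "v1"})
base, _ = server.checkout()                        # two developers both start from r1
server.commit("bob", "edit A", base, {"A": "v2"})  # accepted as r2
try:
    server.commit("carol", "stale edit", base, {"A": "v3"})
except ConflictError as err:
    print(err)                                     # A changed in r2; update first
```

In this sketch the second developer has to update to the latest revision and resolve the difference before the server accepts the change.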

Let’s also go through some of the drawbacks of this approach.

3. Disadvantages of Traditional VCS

Since the VCS resides on the server and does all the heavy lifting (hosting the code, versioning the files, and maintaining the commit log), the client is confined to getting the latest copy of the code, working on it, and finally committing the changes back to the VCS. The code on the client machine is merely a working copy, not the entire repository. Now let’s look at the major drawbacks of this approach with the help of some examples.

  1. Assume that five developers work with a VCS, and they all work on features and commit their changes to the repository. Unfortunately, the VCS server crashes and there is no backup. We are left trying to restore the VCS to the last stable commit. But since the revision history and commit log are maintained only on the VCS server, and the codebase on each developer’s machine is just a plain working copy, there is no way to confidently bring the VCS back to the last committed state.
  2. Almost all the operations that we perform, such as checking file diffs, committing, and merging, can be performed only when the VCS is up. This is a scenario we commonly encounter: if the VCS is temporarily down for some reason, for example a network outage or server maintenance, all users accessing the VCS are blocked. Even simple operations like comparing a file with its previous versions cannot be done.
  3. Another drawback of a centralized repository has to do with branching. When we work on a project, we have a trunk or master branch, which is considered the source of truth and is integrated with build tools like Jenkins. For development, we create a branch from the trunk, work and test on that branch, and finally merge the changes back to the trunk, from where a tag is created and pushed for deployment. Notice that the branch is created on the server. Also, if two developers want to collaborate on a feature without impacting anyone else’s code, they have to work on a branch, which again lives on the server. Hence, over time, the server becomes bloated with many branches.
  4. Another drawback of a centralized VCS is scalability. Imagine working on an open-source project where thousands of developers contribute and many thousands of branches are created. Merging the changes and managing the branches becomes a nightmare, and traditional version control systems are poorly suited to it.
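
The first two drawbacks can be illustrated with a small sketch. It assumes a toy model in which the server alone holds the history and the client holds only the latest content of each file. The names `HistoryServer`, `WorkingCopy`, and `ServerUnavailable` are hypothetical and do not correspond to any real tool's API.

```python
class ServerUnavailable(Exception):
    """Raised when the central server cannot be reached."""


class HistoryServer:
    """Hypothetical central server: the only place where history lives."""

    def __init__(self):
        self.online = True
        self.history = {}  # filename -> list of committed contents, oldest first

    def _require_online(self):
        if not self.online:
            raise ServerUnavailable("central server is down")

    def commit(self, name, content):
        self._require_online()
        self.history.setdefault(name, []).append(content)

    def previous_version(self, name):
        self._require_online()
        versions = self.history[name]
        return versions[-2] if len(versions) > 1 else None


class WorkingCopy:
    """Client side: holds only the latest content of each file, no history."""

    def __init__(self, server):
        self.server = server
        self.files = {}

    def edit(self, name, content):
        self.files[name] = content  # purely local, always works

    def commit(self, name):
        self.server.commit(name, self.files[name])

    def diff_with_previous(self, name):
        # Even a simple comparison needs a round trip to the server.
        return self.server.previous_version(name), self.files[name]


server = HistoryServer()
copy = WorkingCopy(server)
copy.edit("A", "v1"); copy.commit("A")
copy.edit("A", "v2"); copy.commit("A")
print(copy.diff_with_previous("A"))  # ('v1', 'v2')

server.online = False                # network outage or maintenance
copy.edit("A", "v3")                 # local edits still work
try:
    copy.diff_with_previous("A")
except ServerUnavailable as err:
    print(err)                       # central server is down
```

If `server.history` were lost in this model, all that any client could offer back is the single latest version in `copy.files`, which is why the last committed state cannot be rebuilt with confidence.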

Centralized version control tools play an important role in software development and continue to do so. But as a project grows, or when a project is critical and must be highly available, a centralized VCS might not be the best solution, and other options might be a better fit.

Understanding the drawbacks of a centralized VCS helps us identify the factors that drive us to explore other options, such as distributed version control systems like Git and Mercurial.

Happy Learning !!
