Brewer’s CAP Theorem in Simple Words

When we start discussing the distributed architecture, there is a high possibility that we will encounter the CAP Theory (or Brewer’s theorem). CAP stands for Consistency, Availability and Partition tolerance.

CAP theorem wants system designers to make a choice between the above three competing guarantees in the final design. It’s said that achieving all 3 in the system is not possible, so we MUST choose at most two out of three guarantees in our system.

1. What is C-A-P?

The definition of CAP theorem seems simple and quick. But wait !! What do you mean by Consistency, Availability and Partition tolerance?? Let’s define these terms in a distributed computing environment.

1.1. Consistency

A service that is consistent operates fully or not at all. Consistency is ‘C’ in ACID properties in non-distributed systems, as applied to the ideal properties of database transactions. It means that data will never be persisted (rolled back) that breaks certain pre-set constraints.

Consistency, in a distributed environment, means all client programs who are reading the data from the cluster see the same data at any given point in time. Two clients fetching data from two nodes should not see different data at all.

1.2. Availability

Availability means – we should be able to retrieve the data that we have stored in the distributed system, no matter what happens inside the cluster. If we make a request, then we must get a response from the system; even if a node (or many nodes) in the cluster goes down.

1.3. Partition Tolerance

Partition Tolerance means that the cluster (as a whole) continues to function even if there is a “partition” (communications break) between two nodes (both nodes are up, but can’t communicate).

No set of failures less than total network failure is allowed to cause the system to respond incorrectly.

2. CAP Theorem in Practice

So the CAP theorem makes you choose between any two of 3 guarantees, we wish to add to our system. we cannot guarantee all 3 in a single distributed system. For example, in order to get both availability and partition tolerance, we have to give up consistency.

2.1. Usecase

We are asked to design a distributed cluster of 4 data nodes. The replication factor is 2 i.e. any data written in the cluster must be written on 2 nodes; so when one goes down – the second can serve the data.

Now try to apply CAP theorem to this requirement. In a distributed system, two things may happen at any time:

We will try to design our system based on these facts/assumptions.

2.2. CP [Consistency/Partition Tolerance] Systems

In a distributed system, at the time of reading the data, consistency is determined by a voting kind of mechanism, where all nodes who have a copy of data mutually agree that they have the “same copy” of requested data.

Now let’s assume that our requested data is present in two nodes N1 and N2. A client tries to read the data; and our CP system is partition tolerant as well, so an expected network failure occurred and N2 is detected as DOWN.

Now the system cannot determine whether N1’s data copy is the latest or not; it may be stale as well. So system decides to send an ERROR event to the client. Here system chose to prefer data consistency over data availability.

Similarly, at the time of writing the data if the replication factor is 2, then the system may reject write requests until it finds two healthy nodes to write data fully in a consistent manner.

2.3. AP [Availability/Partition Tolerance] Systems

What if in the above scenario, the system instead of sending ERROR (in case N2 is down); sends the data received from N1.

Well, the client gets the data, but was it the latest data copy stored in the system in past?? You cannot decide. You chose availability over consistency. These are AP systems.

2.4. CA [Consistency/Availability] Systems

In a distributed environment, we cannot avoid “P” of CAP. So we have to choose between CP or AP systems.

If we desire to have a consistent and available system, then we must forget about partition tolerance and it’s possible only in non-distributed systems such as Oracle and SQL servers.

CAP Theorem Example
CAP Theorem Example

In today’s world, we can achieve all 3 in a distributed system (if not fully, then partially at least). E.g. Tunable consistency in Cassandra

CAP Theorem is like the old joke about software projects: you can have it on TIME, in BUDGET, or CORRECT. Pick any two 🙂

Happy Learning!!


Comments are closed for this article!

About Us

HowToDoInJava provides tutorials and how-to guides on Java and related technologies.

It also shares the best practices, algorithms & solutions and frequently asked interview questions.