Tech Twitter: Design Theorem # CAP Theorem

Consistency, Availability, and Partition tolerance are the three main aspects of a modern distributed data system.

The CAP theorem was introduced by Eric Brewer in 2000 to describe the trade-offs involved in maintaining network-based database systems. In the era of petabyte-scale data, it has become immensely important to develop and maintain distributed data systems that can handle the load.

In this article, we will discuss the key points of the CAP theorem, how it is different from ACID and why it is important for the current technological landscape.

Key Points on CAP Theorem

1. Consistency

Consistency means that all connected nodes of the distributed system see the same value at the same time, and partial transactions are never saved.

Suppose a transaction has multiple steps and, due to some malfunction, one of the middle operations is corrupted. If some of the connected nodes then read the corrupted value, the data will be inconsistent and misleading.

So according to the CAP principle, we do not allow such a transaction. A transaction cannot be executed partially; it is always ‘all or none’.

If something goes wrong in between the execution of a transaction, the whole transaction needs to be rolled back.

2. Availability

According to this, a distributed system should remain operational all the time. Every client request should receive a response, irrespective of whether a particular node is available or not.

In a practical scenario, though, the required level of availability depends on the traffic requirements. The key point is that every functioning node must return a response for all read and write requests in a reasonable amount of time.

3. Partition tolerance

According to the partition tolerance policy, if a subpart of the network is compromised, the entire distributed system should not go down.

A system that is partition tolerant should recover quickly from a partial outage. In practical scenarios, partition tolerance cannot be an optional criterion; it must always be maintained.

So adhering to the CAP theorem always becomes a choice between high consistency and high availability.

Why is the CAP theorem important?

After the internet boom in 2005, the size of data has been growing exponentially. In the early stages, the only way to keep up with the ever-changing scale of data and plan capacity properly was to scale vertically, which means increasing the capabilities of a single machine (more CPU, memory, or storage).

But this is not always feasible or cost-effective. The newer approach is to add capacity horizontally, which means adding more machines and leveraging distributed computing. To design such a network of machines, we need to follow the principles of the CAP theorem.

We cannot maintain all three principles of the CAP theorem simultaneously. Theoretically, we can maintain only CA, CP, or AP.

Consistency and Availability: These are systems with high consistency and very little downtime, but partition tolerance is not enforced. For example, a network issue can bring down an entire distributed RDBMS.

Consistency and Partition tolerance: These systems adhere to high consistency and partition tolerance but there is a risk of some data being unavailable. Ex. MongoDB.

Availability and Partition tolerance: These systems adhere to high availability and partition tolerance but there is a risk of reading inconsistent data. Ex. Cassandra.
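To make the difference between CP and AP behavior concrete, here is a minimal in-memory sketch. The `Replica` and `Cluster` classes are hypothetical names invented for illustration; they do not model how MongoDB or Cassandra actually work. We assume only two replicas and a single stored value.

```python
class Replica:
    """One node holding a single value (a hypothetical in-memory stand-in)."""

    def __init__(self, name):
        self.name = name
        self.value = None


class Cluster:
    """Two replicas that can be cut off from each other by a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let the replicas diverge.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # AP: accept the write locally; the other replica keeps its old value.
            replica.value = value
            return
        # No partition: the write reaches both replicas.
        replica.value = value
        other.value = value

    def read(self, replica):
        # In this simplified model reads are always served locally. In CP mode
        # that is safe because no write is accepted during a partition.
        return replica.value


# CP: the write during the partition is rejected, so both replicas still agree.
cp = Cluster("CP")
cp.write(cp.a, "v1")
cp.partitioned = True
try:
    cp.write(cp.a, "v2")
    cp_write_accepted = True
except RuntimeError:
    cp_write_accepted = False

# AP: the write during the partition succeeds, but replica b now returns stale data.
ap = Cluster("AP")
ap.write(ap.a, "v1")
ap.partitioned = True
ap.write(ap.a, "v2")
```

In the CP cluster, the client gets an error but never sees conflicting values. In the AP cluster, every request gets a response, but reading from replica `b` returns the old value until the partition heals.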

How is the CAP theorem different from ACID properties?

Before we talk about the differences, let’s talk about the ACID properties in brief.

1. Atomicity

All changes to data are performed as a single operation. That is, all or none: either all of the operations are performed or none of them is performed.

For example, in an application that transfers funds from one account to another, the atomicity property ensures that if a debit is made successfully from one account, the associated credit is also made to the other account.
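The fund transfer can be sketched with Python's built-in `sqlite3` module. This is only an illustration on a single local database, not a distributed system; the `accounts` table, the account names, and the `transfer` function are assumptions made for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-none transaction."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if the block raises an exception.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            # The debit above has already run; raising here undoes it.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()

transfer(conn, "alice", "bob", 30)       # succeeds: debit and credit both apply
try:
    transfer(conn, "alice", "bob", 500)  # fails midway: the debit is rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After both calls, only the first transfer is reflected in `balances`. The failed transfer leaves no trace, even though its debit statement had already executed.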

2. Consistency

For each transaction, the system should move from one consistent state to another consistent state.

3. Isolation

All transactions should be executed in isolation from other transactions. During concurrent execution, the intermediate results of one transaction should not be visible to transactions running in parallel. The failure of one transaction should not affect another transaction.
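As a small illustration of isolation, the sketch below opens two connections to the same SQLite database file using Python's built-in `sqlite3` module. The file name, the table, and the `count_rows` helper are assumptions for the example, and it relies on the default transaction handling of the `sqlite3` module, where an `INSERT` implicitly opens a transaction.

```python
import os
import sqlite3
import tempfile

# A throwaway database file shared by two independent connections.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
writer.commit()


def count_rows(conn):
    """Return how many rows this connection can currently see."""
    return conn.execute("SELECT COUNT(*) FROM accounts").fetchall()[0][0]


# The INSERT starts a transaction on the writer that is not yet committed.
writer.execute("INSERT INTO accounts VALUES ('alice', 100)")
seen_before_commit = count_rows(reader)  # the reader cannot see the pending row

writer.commit()
seen_after_commit = count_rows(reader)   # the committed row is now visible

writer.close()
reader.close()
```

The reader only observes the new row once the writer's transaction has committed, so it never sees an intermediate result.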

4. Durability

After every successful transaction, the changes made in the database should persist. Even if the system is compromised or fails somehow, the changes from successfully committed transactions should still persist.

Now we can see that these terms technically refer to different things. They are related in that a distributed database system that guarantees ACID transactions must choose consistency over availability according to the CAP theorem (i.e., it is a CP system).

On the other hand, if a distributed database chooses availability over consistency in accordance with the CAP theorem (i.e., it is an AP system), it cannot strictly follow the ACID properties.

Conclusion

In this article, we have discussed the principles of the CAP theorem and why it is still important in the current context. We also discussed how the CAP theorem differs from, and is related to, another database design principle (ACID).

In most practical use cases, partition tolerance must always be maintained, so it becomes a choice between high availability and high consistency.


Sunday, July 17, 2022
