System Design – ACID and CAP Theorem

Q1. What is ACID in system design?

Answer: ACID is a set of four properties that a DBMS (Database Management System) guarantees for transactions so that the database stays reliable and the data keeps its integrity. A stands for Atomicity, C for Consistency, I for Isolation, and D for Durability.

Q2. What does atomicity mean in ACID compliance?

Answer: Atomicity means that every transaction is treated as a single, indivisible unit. Either all of its operations complete, or none of them take effect. No partial, in-between state is left behind.
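
The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production pattern: the accounts table, the transfer function, and the balances helper are hypothetical names made up for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if an
        # exception escapes, so the debit is never kept without the credit.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if row[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # both updates are applied together
failed = transfer(conn, 1, 2, 500)  # the debit is rolled back, nothing changes
final = balances(conn)
```

In the failed transfer the debit has already been executed when the error is raised, yet the rollback removes it, so the balances are the same as before the attempt.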

Q3. What does consistency mean in ACID compliance?

Answer: Consistency means that every transaction must obey the rules (constraints) defined on the database. A transaction that violates a rule is aborted, so the data is valid both before and after every transaction. For example, suppose there is a rule that Employee ID must never be less than 1. Any update that tries to set it to 0 or -1 causes the transaction to abort.
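
The Employee ID rule above can be expressed as a CHECK constraint. The sketch below uses Python's sqlite3 module; the employees table, its columns, and the set_employee_id helper are assumptions chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint encodes the rule "Employee ID is never less than 1".
conn.execute(
    "CREATE TABLE employees ("
    " employee_id INTEGER NOT NULL UNIQUE CHECK (employee_id >= 1),"
    " name TEXT NOT NULL)"
)
conn.execute("INSERT INTO employees VALUES (1, 'Asha')")
conn.commit()

def set_employee_id(conn, old_id, new_id):
    """Try to change an employee ID; return False if the rule rejects it."""
    try:
        with conn:  # rolls the transaction back if the constraint fails
            conn.execute(
                "UPDATE employees SET employee_id = ? WHERE employee_id = ?",
                (new_id, old_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

rejected_zero = set_employee_id(conn, 1, 0)       # violates the rule
rejected_negative = set_employee_id(conn, 1, -1)  # violates the rule
accepted = set_employee_id(conn, 1, 7)            # allowed
ids = [row[0] for row in conn.execute("SELECT employee_id FROM employees")]
```

The two invalid updates are aborted and leave the row untouched; only the valid update is committed.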

Q4. What is Isolation in ACID?

Answer: Isolation means that concurrently running transactions do not interfere with each other. While one transaction is working on data in the database, another transaction should not be able to see its intermediate, uncommitted changes or alter the values it is using.
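
One visible effect of isolation is that uncommitted changes stay hidden from other connections. The sketch below, using Python's sqlite3 module with a temporary database file, is only an illustration under the default SQLite settings; the stock table and the read_qty helper are hypothetical, and other databases offer several isolation levels with different guarantees.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file act as two concurrent clients.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(db_path)
reader = sqlite3.connect(db_path)

writer.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
writer.execute("INSERT INTO stock VALUES ('widget', 10)")
writer.commit()

def read_qty(conn):
    """Return the current quantity of 'widget' as seen by this connection."""
    return conn.execute(
        "SELECT qty FROM stock WHERE item = 'widget'"
    ).fetchall()[0][0]

# The UPDATE opens a transaction on the writer that is not yet committed.
writer.execute("UPDATE stock SET qty = 3 WHERE item = 'widget'")
seen_during = read_qty(reader)  # the reader still sees the committed value

writer.commit()
seen_after = read_qty(reader)   # the change is visible only after commit

writer.close()
reader.close()
```

The reader never observes the half-finished state of the writer's transaction: it sees the old value until the commit and the new value afterwards.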

Q5. What is Durability in ACID?

Answer: Durability means that once a transaction has committed, its changes are permanent. If a write happens and the transaction completes, the data stays in that state even if the server crashes immediately afterwards.

Q6. How is Durability achieved?

Answer: Durability is achieved through methods such as write-ahead logging (WAL) and checkpoints.

WAL means that a record of each change is written to a log in persistent storage before the change is applied to the actual database. For example, suppose a transaction has two steps: add 10, then subtract 5. Both steps are first written to the log. If a failure occurs between the steps, recovery is straightforward: the actual database has not been modified yet, so the whole transaction can be aborted and the database remains in its previous state.

The logged changes are not all applied to the database in one go at the very end. Instead, the system takes checkpoints at intervals that depend on its design, and at each checkpoint the WAL entries accumulated so far are written to the actual database.

These mechanisms support not only durability but also the consistency of the data: if a rule is broken during the second step, the entire transaction, including the first step, is aborted.
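
The log-then-apply idea and the role of checkpoints can be sketched with a toy key-value store. This is a deliberately simplified illustration, not how any particular database implements WAL: the TinyWAL class, its file names, and the crash_before_commit flag are all invented for the example, and real systems add log sequence numbers, undo records, and protection against a crash in the middle of a checkpoint.

```python
import json
import os
import tempfile

class TinyWAL:
    """Toy key-value store: every change is logged durably before it is applied."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "wal.log")
        self.data_path = os.path.join(directory, "data.json")
        self.data = {}
        self.recover()

    def _append(self, record):
        # Append one log record and force it to disk before continuing.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def run(self, txn_id, deltas, crash_before_commit=False):
        """Log each step, then a commit marker, and only then apply the steps."""
        for key, delta in deltas:
            self._append({"txn": txn_id, "key": key, "delta": delta})
        if crash_before_commit:
            return  # simulated crash: steps are logged, commit marker is not
        self._append({"txn": txn_id, "commit": True})
        for key, delta in deltas:
            self.data[key] = self.data.get(key, 0) + delta

    def checkpoint(self):
        """Write the applied state to the data file, then clear the log."""
        with open(self.data_path, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        open(self.log_path, "w").close()

    def recover(self):
        """Rebuild state from the last checkpoint plus committed log entries."""
        self.data = {}
        if os.path.exists(self.data_path):
            with open(self.data_path) as f:
                self.data = json.load(f)
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            records = [json.loads(line) for line in log if line.strip()]
        committed = {r["txn"] for r in records if r.get("commit")}
        for r in records:
            if "delta" in r and r["txn"] in committed:
                self.data[r["key"]] = self.data.get(r["key"], 0) + r["delta"]

workdir = tempfile.mkdtemp()
store = TinyWAL(workdir)
store.run("t1", [("x", 10), ("x", -5)])                 # add 10, subtract 5
store.run("t2", [("x", 10)], crash_before_commit=True)  # never committed

restarted = TinyWAL(workdir)       # simulates a restart after the crash
recovered = dict(restarted.data)   # t1 is replayed, t2 is ignored
restarted.checkpoint()
after_checkpoint = dict(TinyWAL(workdir).data)
```

After the simulated restart, the committed transaction is recovered from the log while the incomplete one is discarded, and the checkpoint moves the state into the data file so the log can be emptied.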

Q7. What is the CAP theorem?

Answer: The CAP theorem applies to distributed data storage systems, i.e., systems in which multiple servers are involved, typically for scalability. It was proposed by Eric Brewer in 2000 and proven by Seth Gilbert and Nancy Lynch in 2002. It is therefore also called Brewer’s theorem.

According to the theorem, a distributed data storage system can provide at most two of the following three guarantees at the same time; it can never promise all three. The three properties are C – Consistency, A – Availability, and P – Partition Tolerance.

Q8. What is Consistency in the CAP theorem?

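Answer: Consistency, in the CAP sense, means that every read returns the most recently written data, no matter which node in the distributed data storage system serves it. In other words, all nodes appear to hold the same, latest data. Note that this is different from the C in ACID, which is about obeying the rules defined on the database.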

Q9. What is Availability in the CAP theorem?

Answer: Availability means that the system always services requests. Even if a node fails, every request that reaches a working node should receive a non-error response. The response may not contain the latest data, but the request should never fail.

Q10. What is Partition Tolerance in the CAP theorem?

Answer: As the name suggests, the system should tolerate partitions. Because of network problems, the nodes may be split into groups (partitions) that cannot communicate with each other. Even then, the system should keep operating and respond to incoming requests.
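
The trade-off between the three properties shows up once a partition occurs. The toy model below is a sketch with two replicas and a partitioned flag; the Cluster class, its CP and AP modes, and the Unavailable exception are invented for illustration, and real CP systems typically rely on majority quorums rather than requiring every replica.

```python
class Unavailable(Exception):
    """Raised when the CP cluster refuses a request rather than risk stale data."""

class Cluster:
    """Two replicas that either stay consistent (CP) or stay available (AP)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]   # each node keeps its own copy of the data
        self.partitioned = False   # True when the nodes cannot talk to each other

    def write(self, node, key, value):
        if not self.partitioned:
            for replica in self.replicas:  # healthy network: update every copy
                replica[key] = value
            return
        if self.mode == "CP":
            raise Unavailable("cannot reach the other replica")
        self.replicas[node][key] = value   # AP: accept locally, copies diverge

    def read(self, node, key):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm this is the latest value")
        return self.replicas[node].get(key)

cp = Cluster("CP")
cp.write(0, "k", "v1")
cp.partitioned = True
try:
    cp.write(0, "k", "v2")
    cp_result = "accepted"
except Unavailable:
    cp_result = "rejected"         # consistency kept, availability lost

ap = Cluster("AP")
ap.write(0, "k", "v1")
ap.partitioned = True
ap.write(0, "k", "v2")             # still answers during the partition
ap_fresh = ap.read(0, "k")         # node 0 has the new value
ap_stale = ap.read(1, "k")         # node 1 answers with the old value
```

During the partition the CP cluster returns errors but never a stale value, while the AP cluster always answers but the two nodes disagree until they can synchronize again.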

Q11. Is the CAP theorem visualized as a triangle? Why?

Answer: The CAP theorem is visualized as a triangle whose vertices represent C, A, and P. The edges represent the pairs CA, AP, and CP. A distributed data storage system can sit on only one of these edges, i.e., it guarantees two properties at the expense of the third.

Q12. What are some example systems for each combination of two guarantees in the CAP theorem?

Answer: Below are some examples of systems and the combination of CAP guarantees each one provides.

Third-party distributed system | Combination of properties in CAP theorem
Amazon DynamoDB                | AP with eventual consistency; CP with reduced availability
Apache Cassandra               | AP
Apache ZooKeeper               | CP
Google Cloud Spanner           | CP

Q13. Which of the three properties in the CAP theorem is sacrificed most often?

Answer: Consistency is the property sacrificed most often. For many systems it is enough to be eventually consistent rather than consistent at all times. That is, when replicas temporarily disagree, many distributed systems can tolerate it as long as the replicas become consistent again within a certain time.
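
Eventual consistency can be illustrated with two replicas that diverge and later reconcile. The sketch below uses a last-write-wins merge keyed on a version number, which is only one of several possible reconciliation strategies and is an assumption of this example; the merge function and the sample data are hypothetical.

```python
def merge(local, remote):
    """Last-write-wins merge of two replicas mapping key -> (version, value)."""
    merged = dict(local)
    for key, (version, value) in remote.items():
        # Keep whichever copy carries the higher version number.
        if key not in merged or version > merged[key][0]:
            merged[key] = (version, value)
    return merged

# The replicas accepted different writes while they could not communicate.
replica_a = {"cart": (1, ["book"]), "theme": (3, "dark")}
replica_b = {"cart": (2, ["book", "pen"]), "theme": (1, "light")}

# Once they can exchange state again, each merges the other's data.
synced_a = merge(replica_a, replica_b)
synced_b = merge(replica_b, replica_a)
```

After the exchange both replicas hold the same data, regardless of the direction in which the merge is performed.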

However, depending on the use case, consistency together with partition tolerance can matter more, for example, in financial transactions, inventory management, and coordination services.

Q14. How is the CAP theorem important in System Design?

Answer: During system design, always ask which properties matter most for the use case, prioritize two of the CAP properties accordingly, and keep a plan B for the third. The third property should not be ignored entirely; a failover mechanism should be defined for it as well.