Thursday, February 26, 2015

BASE

We all know about acids and bases. They are opposite in nature...

I am not here to learn chemistry. Tell me something related to databases.

I am not talking about chemistry either, but database engineers have nicely used the term BASE in contrast to ACID. ACID is the most discussed term in the database world, but it is a concept that belongs to the world of Relational Database Management Systems. In contrast, the NoSQL movement moved the pH of database transactions towards BASE.

Database transaction pH Scale
Here, I would like to discuss the CAP theorem a bit. It was originally put forward as a conjecture by Eric Brewer in 2000, which is why it is also known as Brewer's Theorem.

CAP Theorem: This theorem deals with three desirable properties of a distributed system. These are:


  • Consistency: A read sees all previously completed writes.
  • Availability: Reads and writes always succeed.
  • Partition tolerance: Guaranteed properties are maintained even when network failures prevent some machines from communicating with others.
Now, the CAP theorem states that a distributed system can never guarantee all three of them simultaneously.

CAP Theorem

In reality, it's always a choice of 'two out of three': CP, CA or AP.
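
To make the trade-off concrete, here is a minimal Python sketch of a toy store with two replicas. Everything in it (the `TwoNodeStore` class, the `mode` flag, the `partitioned` switch) is a made-up illustration and not the API of any real database. It only shows that once the network is partitioned, a write must either be refused (CP) or be accepted while the other replica goes stale (AP).

```python
class Replica:
    """One copy of a single value (hypothetical, for illustration only)."""
    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Toy store with replicas A and B; mode is 'CP' or 'AP'."""
    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True means A and B cannot talk

    def write(self, value):
        """Client writes through replica A."""
        if self.partitioned:
            if self.mode == "CP":
                # Prefer consistency: refuse the write so replicas never diverge.
                raise RuntimeError("unavailable during partition")
            # Prefer availability: accept the write locally; B is now stale.
            self.a.value = value
            return
        # No partition: both replicas are updated together.
        self.a.value = value
        self.b.value = value

    def read_from_b(self):
        """Client reads through replica B."""
        return self.b.value


cp = TwoNodeStore("CP")
cp.write("v1")
cp.partitioned = True
try:
    cp.write("v2")
    cp_write_ok = True
except RuntimeError:
    cp_write_ok = False  # the CP store gave up availability

ap = TwoNodeStore("AP")
ap.write("v1")
ap.partitioned = True
ap.write("v2")  # succeeds, but replica B still holds "v1"

print(cp_write_ok, cp.read_from_b())  # False v1
print(ap.a.value, ap.read_from_b())   # v2 v1
```

The CP store stays consistent but rejects the second write, while the AP store accepts it and serves an old value from replica B until the partition heals.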

Two out of three
NoSQL databases are typically distributed systems, so they have this limitation too. Many of them therefore follow a pattern known as BASE.
This moves the pH of database transactions to higher values.


You have already used this term several times. Would you care to tell us what BASE is?
Well, NoSQL relies on a strategy that works within the limits of Brewer's Theorem. It consists of the following properties.

  • Basically Available: This means that the data will be available even in the presence of errors in the system.
    This is achieved by distributed computing. Instead of storing the data in a single store and trying to make that store fault tolerant, NoSQL spreads the data across multiple storage systems with a high degree of replication. This keeps the data available and makes a complete outage very unlikely.
  • Soft state: Soft state indicates that the state of the system may change over time, even without input. This is a consequence of the eventual consistency model: replicas keep exchanging updates in the background.
  • Eventual Consistency: This states that the data will become consistent over time. NoSQL systems ensure that, once writes stop, the data reaches a consistent state at some future point in time.
    BASE is optimistic and accepts that the database consistency will be in a state of flux.
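
The three properties above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular NoSQL product works: the `Replica` class, the integer version numbers and the `anti_entropy` function are all hypothetical names, and "highest version wins" is just one simple way to reconcile copies.

```python
class Replica:
    """One copy of the data; each key maps to a (version, value) pair."""
    def __init__(self, name):
        self.name = name
        self.data = {}

    def put(self, key, version, value):
        # Keep the entry only if it is newer than what this replica has.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def anti_entropy(replicas):
    """Background sync: every replica ends up with the highest version of each key."""
    for src in replicas:
        for key, (version, value) in list(src.data.items()):
            for dst in replicas:
                dst.put(key, version, value)


replicas = [Replica("r1"), Replica("r2"), Replica("r3")]

# Basically available: the write is accepted by a single replica right away.
replicas[0].put("likes", 1, 10)

# Before the sync, another replica has not seen the write yet.
stale = replicas[2].get("likes")

# Soft state: r2 and r3 change here without any client writing to them.
anti_entropy(replicas)

# Eventual consistency: after the sync, all replicas agree.
fresh = [r.get("likes") for r in replicas]

print(stale)  # None
print(fresh)  # [10, 10, 10]
```

A read that hits `r3` before the sync returns nothing, which is exactly the window of inconsistency that BASE accepts in exchange for availability.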
So, ACID is better? We have all the desired properties there.
Every technology has its own trade-offs, and so does NoSQL. It is up to the developer to design the system accordingly. As a developer, you have to analyze what your system requires, do a feasibility study and choose the technology accordingly.
For example, NoSQL is a bad choice where strong consistency is required (banking applications), while it is a good choice for systems that need to store and access big data without a strong consistency requirement (social networking applications).

We have talked a lot about CAP. There are some real-world examples of CP, CA and AP databases, which we can see in the following diagram:
Databases in CAP intersection
I have a chart as well, with some additional examples:

  • Bigtable by Google - CP
  • HBase by Apache - CP
  • DynamoDB by Amazon - AP
  • SimpleDB by Amazon - AP
  • Voldemort by LinkedIn - AP
  • Cassandra by Facebook - AP

Well, that's all for now. We'll look into more in the next articles.

If this article gave you some more knowledge, would you like to share it with your network?