Friday, March 6, 2015

NoSQL Types Deep Dive

'Every time you dive, you hope you'll see something new - some new species. Sometimes the ocean gives you a gift, sometimes it doesn't.' - James Cameron

Well, the world of knowledge works the same way, but in most cases it does give us a gift. In this post we take a deep dive into the NoSQL types, building on the previous post to learn more about this area of Big Data.

We learnt that NoSQL databases can be categorized into 4 major types, and we have some basic knowledge of each. Let's dig a little deeper and take them one by one.

  1. Key-Value pair databases - The simplest of the four categories. Conceptually we can look at them as a HashMap<Key, Value>, where the key is the primary key for the value to be stored and the value is the raw data. This value can be anything; the database stores it blindly without caring what's inside.
    Since the model resembles a HashMap<Key, Value>, the set of operations is analogous too. We can get the value of a key, put a key-value pair into the database, or delete the value associated with a key. Queries are only possible through the key itself. Mappings are usually accompanied by cache mechanisms to maximize performance.

    Key-Value
      Pros:
      Because every access goes through a single primary key, this type offers fast lookups and scales incrementally.

      Cons:

      We cannot query this database by value; all access must go through the primary key. It is up to the application to know what it originally stored and how to process the value on retrieval.
      Implementing relationships between data is not recommended with this type.
      Since the database has no columns, updating only part of the data is cumbersome: the application has to read the whole value, change it, and write it back.

      Use cases:
      Key-value databases are best utilized in the following situations:
      • Storing user session data
      • Maintaining schema-less user profiles
      • Storing user preferences
      • Storing shopping cart data
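
      To make the HashMap analogy concrete, here is a minimal in-memory sketch of the get / put / delete operations described above. The class name KeyValueStoreSketch and the key "session:42" are made up for illustration; this is not the API of any real key-value database, only a picture of the access pattern.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for a key-value database (illustration only).
final class KeyValueStoreSketch {
    // The store never looks inside the bytes; it only maps key -> opaque value.
    private final Map<String, byte[]> data = new HashMap<>();

    // Store (or overwrite) the value for a key.
    void put(String key, byte[] value) {
        data.put(key, value.clone());
    }

    // Look up a value; the only way in is through the key.
    Optional<byte[]> get(String key) {
        return Optional.ofNullable(data.get(key)).map(b -> b.clone());
    }

    // Remove a key; returns true if something was actually deleted.
    boolean delete(String key) {
        return data.remove(key) != null;
    }

    public static void main(String[] args) {
        KeyValueStoreSketch store = new KeyValueStoreSketch();
        store.put("session:42", "user=palash;theme=dark".getBytes(StandardCharsets.UTF_8));

        String raw = store.get("session:42")
                .map(b -> new String(b, StandardCharsets.UTF_8))
                .orElse("");
        System.out.println(raw); // prints "user=palash;theme=dark"

        // There are no columns to update: the application rewrites the whole value.
        String changed = raw.replace("theme=dark", "theme=light");
        store.put("session:42", changed.getBytes(StandardCharsets.UTF_8));

        System.out.println(store.delete("session:42"));          // prints "true"
        System.out.println(store.get("session:42").isPresent()); // prints "false"
    }
}
```

      Notice that the store offers no way to ask "which sessions use the dark theme?"; answering that would mean fetching every value and parsing it in the application, which is exactly the limitation listed under Cons.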
    2. Document Database - This one is my favourite data store, and we'll go through it in detail in the next sections. In fact, this type makes it comparatively easy to migrate from an RDBMS to NoSQL. It allows data to be stored in a semi-structured way. A document simply refers to a piece of data which has multiple attributes attached to it. The tricky part is that different documents can have different structures, or they may share the same structure throughout the whole application. The application has the flexibility to add or remove attributes in a document on the fly.
      This type works with XML, JSON or BSON data, which maps easily onto the in-memory representation of an object. That is really helpful for object-oriented programming languages like Java. Storage also differs from the key-value type. Document databases don't store values blindly; they know the structure of the data and store its metadata as well. So, querying on the data itself is possible with this type. Interestingly, a document store can hold one document inside another, because the underlying data representations (XML, JSON, BSON) support nesting.

       {  
            _id : 1,  
            name : 'Palash Kanti Kundu',  
            occupation : 'Software Engineer',  
            organization : [ 'HCL Technologies', 'Cognizant Technology Solutions' ],  
            address : [ {  
                 _id : 123456,  
                 type : 'Current',  
                 city : 'Kolkata',
                 zip : 700098  
            }, {  
                 _id : 156,  
                 type : 'Permanent',  
                 city : 'Barddhaman'  
            } ]  
       }  
      Document Data
      Use cases:
      Document Store databases are useful when you have to implement:
      • Content management systems
      • Blogging platforms
      • Analytics platforms
      • E-commerce platforms
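
      The point that a document store knows the structure of its data, and can therefore be queried by attribute, can be sketched with plain Java maps. Everything here (DocumentStoreSketch, findByField, findByEmbeddedField, the doc helper) is a hypothetical illustration and not MongoDB's or any other product's API; a real document database would also use indexes rather than scanning every document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory document collection (illustration only).
final class DocumentStoreSketch {
    private final List<Map<String, Object>> documents = new ArrayList<>();

    void insert(Map<String, Object> doc) {
        documents.add(doc);
    }

    // Return documents whose top-level attribute equals the expected value.
    List<Map<String, Object>> findByField(String field, Object expected) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> doc : documents) {
            if (doc.containsKey(field) && Objects.equals(doc.get(field), expected)) {
                out.add(doc);
            }
        }
        return out;
    }

    // Return documents where any embedded document in an array attribute matches.
    List<Map<String, Object>> findByEmbeddedField(String arrayField, String field, Object expected) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> doc : documents) {
            Object embedded = doc.get(arrayField);
            if (!(embedded instanceof List)) {
                continue; // this document simply lacks the attribute
            }
            for (Object item : (List<?>) embedded) {
                if (item instanceof Map && Objects.equals(((Map<?, ?>) item).get(field), expected)) {
                    out.add(doc);
                    break;
                }
            }
        }
        return out;
    }

    // Small helper to build a document from alternating keys and values.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put((String) keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        DocumentStoreSketch people = new DocumentStoreSketch();
        people.insert(doc("_id", 1, "name", "Palash Kanti Kundu",
                "address", Arrays.asList(
                        doc("type", "Current", "city", "Kolkata", "zip", 700098),
                        doc("type", "Permanent", "city", "Barddhaman"))));
        // A second document with a different set of attributes is perfectly fine.
        people.insert(doc("_id", 2, "name", "Example User", "nickname", "ex"));

        System.out.println(people.findByEmbeddedField("address", "city", "Kolkata").size()); // prints "1"
        System.out.println(people.findByField("nickname", "ex").size());                     // prints "1"
    }
}
```

      Compared with the key-value type, the difference is that the query looks inside the stored value, including a document nested within another document.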
    3. Column Family store - Column-family databases organize data into rows, each identified by a unique row id. Instead of a document or an opaque value, as in the document store and key-value store, each row holds its data in a flexible set of columns.
      The key difference between a column-family store and an SQL database is that rows in a column-family store need not all have the same columns. You can add a new column to any row without having to add it to all the other rows. Because of this similarity to SQL databases, column-family stores are generally easier to query than the previously mentioned NoSQL databases, but they are not as flexible as a document store or key-value store when it comes to storing arbitrary information.
      Column Family database
      Use cases:
      Developers mainly use column-family databases in:
      • Content management systems
      • Blogging platforms
      • Systems that maintain counters
      • Services that have expiring usage
      • Systems that require heavy write requests
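
      The idea that each row carries its own set of columns can be pictured as a map of maps. This is only a rough sketch under that assumption; ColumnFamilySketch and the row and column names are invented for illustration, and real column-family stores add things like timestamps, column families and on-disk layout that are left out here.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory picture of a column-family table (illustration only).
final class ColumnFamilySketch {
    // row id -> (column name -> value); every row owns its own column set.
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    // Add or overwrite a single column in a single row.
    void put(String rowId, String column, String value) {
        rows.computeIfAbsent(rowId, id -> new TreeMap<>()).put(column, value);
    }

    // Read one column of one row; null if the row or column is absent.
    String get(String rowId, String column) {
        Map<String, String> row = rows.get(rowId);
        return row == null ? null : row.get(column);
    }

    // List the columns that a particular row actually has.
    Set<String> columnsOf(String rowId) {
        Map<String, String> row = rows.get(rowId);
        return row == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(row.keySet());
    }

    public static void main(String[] args) {
        ColumnFamilySketch posts = new ColumnFamilySketch();
        posts.put("post:1", "title", "NoSQL Types Deep Dive");
        posts.put("post:1", "author", "Palash Kanti Kundu");
        posts.put("post:2", "title", "Another post");

        // Add a column to one row only; no other row is touched.
        posts.put("post:2", "views", "17");

        System.out.println(posts.columnsOf("post:1")); // prints "[author, title]"
        System.out.println(posts.columnsOf("post:2")); // prints "[title, views]"
        System.out.println(posts.get("post:1", "views")); // prints "null"
    }
}
```

      Unlike the key-value sketch, a single column can be updated without rewriting the rest of the row.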
    4. Graph databases - Connections are the main theme of this type. Its backbone is graph theory, with the concepts of nodes, edges and properties. Traversal algorithms such as BFS and DFS are used to explore connections; BFS, for example, finds the shortest path between two nodes when all edges count equally. This type is extremely useful for highly connected data.
      This type provides great flexibility when querying relationships between data, and it also supports index-free searches, since connected nodes can be reached by following edges directly.
      Graph databases
      Use cases:
      Graph-based databases are enormously useful in applications that have connected data, such as social networks, routing infocenters, recommendation engines, spatial data and mapping applications, and other applications requiring unique key relations.
      They are also extremely useful in analytic applications, especially those that involve predictions, recommendations, and consequence-analysis engines.
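
      As a small illustration of nodes, edges and BFS, the sketch below finds the shortest chain of connections between two people in an undirected, unweighted graph. GraphSketch, connect and shortestPath are hypothetical names chosen for this example; a real graph database would expose this through its own query language rather than hand-written traversal code.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory graph with undirected, unweighted edges (illustration only).
final class GraphSketch {
    // node -> directly connected nodes (adjacency list)
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    // Add an edge in both directions.
    void connect(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    // Breadth-first search: returns the shortest path by edge count, or an empty list.
    List<String> shortestPath(String start, String goal) {
        if (!adjacency.containsKey(start) || !adjacency.containsKey(goal)) {
            return Collections.emptyList();
        }
        Map<String, String> parent = new HashMap<>(); // also serves as the visited set
        Deque<String> queue = new ArrayDeque<>();
        parent.put(start, null);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(goal)) {
                // Walk the parent links back from goal to start.
                LinkedList<String> path = new LinkedList<>();
                for (String at = goal; at != null; at = parent.get(at)) {
                    path.addFirst(at);
                }
                return path;
            }
            for (String next : adjacency.get(node)) {
                if (!parent.containsKey(next)) {
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        GraphSketch friends = new GraphSketch();
        friends.connect("alice", "bob");
        friends.connect("bob", "carol");
        friends.connect("alice", "dave");
        friends.connect("dave", "erin");
        friends.connect("erin", "carol");

        System.out.println(friends.shortestPath("alice", "carol")); // prints "[alice, bob, carol]"
    }
}
```

      BFS is used here because it visits nodes in order of distance from the start; DFS would also find a path, but not necessarily the shortest one.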
    So, we now have a basic idea of the four NoSQL types: key-value, document, column family and graph.
    In the next sections, we'll look into document data stores and one of the popular implementations of this type, MongoDB.

    Palash Kanti Kundu
