Thursday, November 16, 2017

Couchbase: Bucket (Database) Architecture

Introduction

In Couchbase, a bucket is similar to a database in other DB systems. Application inserts data directly into a bucket(s) and there is no any other schema or objects inside it. A bucket contains documents.

Wednesday, November 8, 2017

Rack awareness feature in Couchbase

In distributed database systems, data is distributed across many nodes. If you consider Cassandra, it is not uncommon to have a cluster (in Cassandra terms, a ring) with 1000 nodes or even more. These nodes are then grouped into different racks, in cloud terms, different availability zones. The reason is, in the event that a whole rack (availability zone) goes down since the replica partitions are on separate racks, data will remain available. 

When it comes to Couchbase, the so-called rack awareness feature is controlled by using Groups. You can assign Couchbase servers into different Groups to achieve the rack awareness capability. 

If you are provisioning a Couchbase cluster on AWS, you can create the server Groups analogous to the availability zones on AWS. This logical grouping in Couchbase allows administrators to specify that active and replica partitions be created on servers that are part of a separate rack zone. 

See below figure-1 and notice the Couchbase cluster deployed on AWS has two server groups similar to the availability zone. 

Figure-1 - Multi-dimensional Cochbase cluster deployed on AWS

This cluster has two nodes for each service offering for data, index, and query. The servers are logically grouped into two groups, rack-1a and rack-1b which is similar to availability zone 1a and 1b on AWS respectively. As a result, servers are physically arranged in two racks. 

It is recommended to have the number of servers same between the server groups. If there is an unequal number of servers in one server group, the rebalance operation performs the best effort to evenly distribute replica vBuckets across the cluster. 

The rack awareness feature is available only in Enterprise Edition of Couchbase. 

Cheers!

Friday, October 20, 2017

Couchbase - the engagement database system

Today, I just wrapped up the four-day Couchbase administration course from Couchbase. Since I'm still new to this NoSQL world, it is tons of new stuff. Couchbase refers themselves as beyond just a NoSQL database, they call it engagement database system. 

Couchbase also has SQL style query language which they call it as N1QL. Using N1QL you can query the JSON documents stored in Couchbase buckets. 

What is so special about Couchbase with opposed to the other NoSQL databases like Cassandra and MongoDB is, they provide multi-dimensional scalability which is essentially, different database components like data service, query service, index service and full-text services can be scaled in/out independently. This concept makes distributed system even more complex in my opinion. The main advantage of multi-dimensional scaling is, you can select different hardware resources for each service to best suit its workload which is a very good thing. 

Couchbase provides very good free online courses if you want to get familiar with the database system.


So next few months I'll be working on Couchbase closely with automation on AWS platform. 

Cheers!

Thursday, September 21, 2017

Creating test data for Apache Cassandra cluster

In many cases you are required to generate test data for Cassandra to do various type of testing. Read the rest of the blogpost if this is something you need and at the end, you should be able to create some test data very quickly.

I use the Ubuntu 16.04.2 LTS and Apache Cassandra 3.0 for this testing. The Cassandra cluster that I'm using here is deployed on AWS but this should not be a consideration factor for test data creation.

Use the steps mentioned below to create the test data.

    1. Download the csv file

curl -O http://samplecsvs.s3.amazonaws.com/Sacramentorealestatetransactions.csv

Note. You can download this file directly on to the EC2 but due to some formatting issue it did not work correctly for me. I downloaded the file to my local machine (Mac) first and then opened the file using TextEditor then copied the contents to the EC2.

    2. Create the csv file on EC2

I just used the vi editor to open a new file called, realstatesdata.csv and then pasted the file contents (including headers) that I copied in step #1. Save the file. (You should be familiar with the VI editor to perform this step.)

After completing this step, you now have "realstatesdata.csv" on EC2.

    3. Connect to the Cassandra cluster using cqlsh



    4. Create a Keyspace

You need to create a Keyspace (in general terms, a database) which is a high level hierarchical object to contain Cassandra tables. The keyspace name I've chosen is "kp_realstate" but feel free to have any name which you want. 



    5. Create a table

You also need to have a table which is the actual object which contains your real data. Unlike MongoDB, you need to have a the table schema created before you insert any data. I already analyzed the data set in csv file we just downloaded and decided the columns based on that. The table name is "realestate_data" again its your choice.



Note that, the order of the columns in CREATE TABLE statement should be the same as the order it appears in the csv file.

    6.  Load the data

You use the COPY command to load data from a file to Cassandra.

COPY kp_realstate.realestate_data (street,city,zip,state,beds,baths,sq__ft,type,sale_date,price,latitude,longitude) FROM 'realstatesdata.csv' WITH HEADER = TRUE;

It has to be executed at CQL prompt.

If the import is successful you will see the messages like below.

Processed: 985 rows; Rate:    1151 rows/s; Avg. rate:    1865 rows/s
985 rows imported from 1 files in 0.528 seconds (0 skipped).

   7. Make sure the data is imported successfully

At CQL prompt, you can execute any of the statements below.

select count(*) from kp_realstate.realestate_data;
select * from kp_realstate.realestate_data limit 30;

At the end, you've full data set for your testing.

Cheers!

Wednesday, September 13, 2017

JEMalloc and Cassandra

Memory management in Cassandra

Cassandra depends on JVM-Java Virtual Machine, to accomplish Cassandra's memory management requirement. The JVM mainly divided into two areas as follows;
  1. Heap - data area which contains the runtime structures. 
  2. Internal data structures - Java methods, thread stack and native methods. 

Cassandra uses its memory in four ways which are mentioned below. This includes OS memory too.
  1. Java heap
  2. Offheap memory (OS memory that is not managed by JVM G.C-Garbage Collector)
  3. OS page cache
  4. OS TCP/IP stack I/O cache

Since Cassandra uses JVM for its memory management, tuning of JVM is necessary to get optimal performance in Cassandra. The tuning of JVM includes the changing the settings in cassandra-env.sh as mentioned below;
  • MAX_HEAP_SIZE
  • HEAP_NEWSIZE

What is JEMalloc?

JEMalloc is an enhanced memory allocator in Linux based platforms. With JEMalloc, the memory allocation for multithreaded applications scales well as the no.of processors' increases. The previously used memory allocator, malloc(3) suffered scalability bottleneck for some multithreaded applications that caused JEMalloc to emerged. 

Use of JEMalloc has been introduced in Cassandra after 2.0. 

Ensure JNA-Java Native Access and JEMalloc are installed on Linux AMI. If you're creating an Amazon AMI for Cassandra, then you want to install both of these. 

yum install -y jna
yum install -y jemalloc

Cassandra.yaml configuration requires the change mentioned below in order to use JEMalloc. 

memtable_allocation_type: offheap_objects

Note. The above setting is set to "heap_buffers" by default. 

What is the benefit of using JEMalloc in Cassandra

By enabling JEMalloc in Cassandra, it reduces the amount of Java heap space that Cassandra uses. Data written to Cassandra is first stored in memtables in heap memory. Memtables are then flushed to SStables on disk when they get full. The garbage collection process of JVM is used to clear the heap memory. Sometimes, this garbage collection process causes issues in Cassandra due to garbage collection pause. 

The benefit of JEMalloc is, it reduces the pressure of garbage collection because Cassandra uses off-heap memory allocation with JEMalloc. 

References


Wednesday, April 26, 2017

Cloud Spanner - Google's mission critical and massively scale RDBMS with NoSQL features

Isn't this amazing?
A database system which blends most important relational components and highly scalable NoSQL capabilities in one system. Google call it, Cloud Spanner

As per Google, it can scale massively to hundreds of data centers around the globe. Most importantly it can speak SQL which is the strongest database language so far. 

In a nutshell, Cloud Spanner has the following features. 
  • Data consistency
  • Scales horizontally across data centers
  • Speaks SQL 
  • Strong consistency and availability at global scale
  • Massively distributed
  • ACID compliance
  • Automatic sharding and synchronous replication with low latency
  • Schema updates without downtime
  • Auto-scales up and down as needed
  • Simple pricing model  
This proves the point that eventually both relational and non-relational databases will merge!

Cheers!

Friday, October 14, 2016

Vertical scaling vs Horizontal scaling: Which one is more challenging?

I think this topic is more appropriate when it comes to deciding the database system for your application. Vertical scaling refers to scale the database system by providing more hardware resources to the host machine. In the other hand, horizontal scaling refers to scale out the database systems by adding more nodes. Either way the objective is to have better performing database system so that it can serve users fast. 

I opened up a discussion with my fellow DBAs and DB enthusiasts and below are some of the views they expressed about the topic. 

"I think Horizontal scaling (by adding node/machines) is less challenging. Any way in vertical scaling you will hit a ceiling with CPU memory etc depending on the capacity of the machines used." [Bhanu]

"Vertical scaling is less challenging to implement as it is just to increase the resources of a given machine.in a virtual environment VS becomes more and more easier to implement. However, as Bhanu pointed out, there is a limit where u can reach with VS.
Horizontal Scaling will require more expertise (hence costly), additional software and in some cases, even additional hardware features.  Therefore, HS will be more challenging to implement" [Shamil]

"Horizontal scaling is much more align with NoSQL Technologies where as in Relational aren't.
Relational does this through partitioning and there should be some routing mechanism implemented in some where to cater.
My guess is, it is based on the technologies we are trying to implement." [Abi]

"I would say, irrespective of the technology still Vertical Scaling is less challenging to implement. for example, adding more RAM and processor power to a mongoDB box is much easier than implementing mongo sharding.

In the other hand, even relational DB technologies support horizontal scaling. SQL server has AlwaysOn and Oracle has Real Application Clusters. both can do Horizontal Scaling." [Shamil]

My view

Regardless of the DB system, the core concept is to process small chunks of data to achieve better performance. If you consider SQL Server we always try to deal with small chunks of data, when scanning, deleting, updating, inserting, etc. That is because no matter you increase the processing power, the underneath algorithms are not salable. As a result you're forced to deal with small chunks of data to achieve better performance. By adding multi-cores does not resolve this mathematical limitation of algorithms. Then there is additional overhead when it comes to threads management. 

In relational db systems, they are designed to handle large monolith databases in a single server. That is the reason we've many TBs of databases in relational systems and we're facing many limitations and challenges with those databases. 

When it comes to SQL Server, it has some horizontal scaling solutions like table partitioning and replicas. If you consider table partitioning, it is still within a single machine and you can scale out IO operations within a single machine. (There is an exception to this with the newly released version of SQL Server 2016. A feature called stretched database allows you to partition a table to keep frequently working data set in on-prem and less frequently used data set in cloud.) You can not distribute partitions among many nodes. So still it has a limitation. 

If you consider replicas in SQL Server, those can be distributed in many nodes. However replicas are read only so that you can scale out applications reads but still writes are not scalable. In AlwaysOn, writes can be accepted by the primary node only. Consequently SQL Server does not provides true horizontal scaling capability just because of the limitations that I've mentioned above. 

In true horizontal scaling, data should be able to partitions and assign to different nodes/machines. This technique is called sharding. Database systems like, MongoDB, Cassandra and DynamoDB all provides this capability. In these database systems, the data can be partitioned and also these partitions can be placed in many nodes which gives the horizontal scaling capability backed by additional CPU and memory power. In other words, CPU and memory can be scale out to other nodes.

The specialty of Cassandra and DynamoDB is, they have masterless architecture which means any nodes of the cluster can accepts reads as well as writes. So there is no single hot spot when it comes to writes as in SQL Server. However, in this architecture has numerous challenges since data is fully distributed among many nodes by using some kind of hashing technique. You can imagine how complex is to handle the concurrency related issues when the your data is distributed and the complexity becomes increase when any nodes can accept the write. These challenges are simply because those DB systems are designed for horizontal scaling. 

So in my opinion horizontal scaling is more challenging than vertical scaling. Actually in vertical scaling there is no distinctive challenge.

Hope this is interesting discussion. 

Cheers!