How to design table schema in HBase for various applications

04 Sep

This is the problem I am working on during my Master's studies.

My supervisor and I published a paper on how to store large-scale time-series data in HBase. It proposes using the third dimension in HBase, “Timestamps” (the version dimension, which HBase uses for concurrency control), to store one dimension of the dataset. We ran experiments with different schemas, which demonstrated that the choice of data schema affects performance (query execution time). The paper also discusses how to design the row key, column, and version dimensions.
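To make the idea concrete, here is a minimal sketch in plain Python rather than the HBase client API. It models HBase's logical layout (row key, column, version) in memory and puts the time-of-day of each reading in the version dimension. The `ToyTable` class, the `sensor#day` row key, the `d:temp` column, and the sample readings are all assumptions made up for illustration; they are not the schema from the paper. In a real table the column family would also need to be configured to retain enough versions.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for HBase's logical model:
    row key -> column -> {version: value}. Not the HBase API."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, version, value):
        # In HBase the version is a long, normally the write time;
        # here it carries a dimension of the data instead.
        self._rows[row][column][version] = value

    def get_versions(self, row, column, min_version, max_version):
        """Return (version, value) pairs with
        min_version <= version < max_version, newest first."""
        cells = self._rows.get(row, {}).get(column, {})
        hits = [(v, val) for v, val in cells.items()
                if min_version <= v < max_version]
        return sorted(hits, reverse=True)


def row_key(sensor_id, day):
    # Hypothetical row key: one row per sensor per day.
    return f"{sensor_id}#{day}"


table = ToyTable()

# Hypothetical readings: (sensor, day, seconds since midnight, temperature).
readings = [
    ("s1", "2012-09-04", 0, 18.5),
    ("s1", "2012-09-04", 3600, 18.1),
    ("s1", "2012-09-04", 7200, 17.9),
    ("s2", "2012-09-04", 0, 21.0),
]
for sensor, day, second, temp in readings:
    # Seconds since midnight goes into the version dimension.
    table.put(row_key(sensor, day), "d:temp", second, temp)

# A time-range query becomes a version-range read on a single row.
first_two_hours = table.get_versions(row_key("s1", "2012-09-04"), "d:temp", 0, 7200)
print(first_two_hours)
```

With this layout, a query over part of a day touches one row and one column and filters on versions, whereas a schema that puts the time in the row key or column qualifier would answer the same query with a scan or a column filter. Which layout is faster depends on the workload, which is the kind of trade-off the experiments compare.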

Interestingly, today I found a slide deck from Facebook that also mentions different uses of the third dimension of HBase. Check it out if you are interested: http://qconlondon.com/dl/qcon-london-2011/slides/KannanMuthukkaruppan_HBaseFacebook.pdf

I am now moving on to location/spatial data, investigating how to store large-scale spatial data in HBase and how to design HBase data schemas for spatio-temporal datasets and applications.

I will post the results here soon.

Posted on September 4, 2012 in Cloud Computing, HBase

 
