How to design table schema in HBase for various applications

04 Sep

This is the problem I am working on during my Master's studies.

My supervisor and I have published a paper on how to store large-scale time-series data in HBase. It proposes using HBase's third dimension, the timestamp (version) dimension that normally serves concurrency control, to store one dimension of the dataset. We ran experiments with different schemas, and the results show that the choice of schema affects performance (query execution time). The paper also discusses how to design the row key, column, and version dimensions.
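To make the idea concrete, here is a minimal sketch in Python. It is not HBase client code and not the schema from the paper: `ToyTable`, the sensor `s1`, the column `d:v`, and the sample readings are all made up for illustration. The class only mimics the HBase data model, where a cell is addressed by row key, column, and timestamp. The sketch compares a "tall" layout, which puts the reading time in the row key, with a layout that puts it in the version dimension.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for an HBase table: row -> column -> {timestamp: value}."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        # Like an HBase Put with an explicit timestamp: one cell version per ts.
        self.rows[row][column][ts] = value

    def get_versions(self, row, column, ts_min, ts_max):
        """Return (ts, value) pairs with ts_min <= ts < ts_max, newest first."""
        cells = self.rows.get(row, {}).get(column, {})
        hits = [(ts, v) for ts, v in cells.items() if ts_min <= ts < ts_max]
        return sorted(hits, reverse=True)

    def scan(self, start_row, stop_row):
        """Yield (row, columns) for start_row <= row < stop_row, in row-key order."""
        for row in sorted(self.rows):
            if start_row <= row < stop_row:
                yield row, self.rows[row]


# Hypothetical readings for one sensor: (time, value).
readings = [(1000, 20.5), (2000, 21.0), (3000, 21.7)]

# Schema A ("tall"): reading time is part of the row key, one row per reading.
tall = ToyTable()
for ts, value in readings:
    tall.put(f"s1#{ts:010d}", "d:v", value, ts)

# Schema B ("versioned"): one row per sensor, reading time is the cell timestamp.
versioned = ToyTable()
for ts, value in readings:
    versioned.put("s1", "d:v", value, ts)

# The same query (readings with 1000 <= time < 3000) under both schemas.
tall_result = [
    (row, list(cols["d:v"].values())[0])
    for row, cols in tall.scan("s1#0000001000", "s1#0000003000")
]
versioned_result = versioned.get_versions("s1", "d:v", 1000, 3000)

print(tall_result)
print(versioned_result)
```

Under schema A the time-range query becomes a row-key range scan, while under schema B it becomes a single-row read restricted to a timestamp range. In a real HBase table, the versioned layout would also need the column family configured to keep enough versions, since only a small number are retained by default.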

Interestingly, I came across a slide deck from Facebook today that also describes a different use of HBase's third dimension. Check it out if you are interested.

I am now moving on to location (spatial) data: investigating how to store large-scale spatial data in HBase, and how to design the HBase schema for spatio-temporal datasets and applications.

I will post the results here soon.

Posted on September 4, 2012 in Cloud Computing, HBase


