This is exactly the problem I am working on in my Master's studies right now.
My supervisor and I published a paper on how to store large-scale time-series data in HBase. It proposes using HBase's third dimension, the "timestamp" (version) dimension, which is normally used for concurrency control, to store one dimension of the dataset. We ran experiments with different schemas, and they showed that the choice of data schema affects performance (query execution time). The paper also discusses how to design the row key, column, and version dimensions.
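To make the idea concrete, here is a minimal sketch in plain Python. It is not the schema from the paper and it does not use the HBase client API. `ToyTable`, the `sensor-001` row keys, the `d:temp` column and the readings are all made up for illustration. It only models the row -> column -> timestamp -> value layout and shows the time axis of a series being stored in the version dimension instead of in the row key or column name.

```python
class ToyTable:
    """Toy in-memory model of an HBase table (hypothetical, not the HBase API).

    Layout: row key -> column -> {timestamp: value}
    """

    def __init__(self):
        self._rows = {}

    def put(self, row, column, timestamp, value):
        # Corresponds to a write with an explicit, client-supplied timestamp,
        # so the measurement time becomes the cell version.
        cell = self._rows.setdefault(row, {}).setdefault(column, {})
        cell[timestamp] = value

    def get_versions(self, row, column, t_min, t_max):
        # Return versions with t_min <= timestamp < t_max, newest first,
        # which is also the order HBase returns cell versions in.
        cell = self._rows.get(row, {}).get(column, {})
        return [(ts, cell[ts])
                for ts in sorted(cell, reverse=True)
                if t_min <= ts < t_max]


table = ToyTable()

# Made-up readings: (row key, column, epoch seconds, value)
readings = [
    ("sensor-001", "d:temp", 1300000000, 21.5),
    ("sensor-001", "d:temp", 1300000060, 21.7),
    ("sensor-001", "d:temp", 1300000120, 22.0),
    ("sensor-002", "d:temp", 1300000000, 18.2),
]
for row, column, ts, value in readings:
    table.put(row, column, ts, value)

# A time-window query touches one row and one column, and filters on versions.
window = table.get_versions("sensor-001", "d:temp", 1300000000, 1300000120)
print(window)
```

In a real HBase table the column family would also have to be configured to keep enough versions, otherwise older readings are discarded.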
Interestingly, just today I found a slide deck from Facebook that also mentions a different use of HBase's third dimension. Check it out if you are interested: http://qconlondon.com/dl/qcon-london-2011/slides/KannanMuthukkaruppan_HBaseFacebook.pdf
I am now moving on to location/spatial data: I am investigating how to store large-scale spatial data in HBase, and how to design the HBase data schema for spatio-temporal datasets and applications.
I will post the results here soon.