Tuesday, September 2, 2014

HBase: 'Put' Performance Tuning


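When you have millions of rows to insert into HBase, send the Puts in batches instead of one at a time, which greatly increases Put speed.
To do that, turn off AutoFlush by setting it to false, so that Puts are buffered on the client instead of being sent to the server one by one.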


HTable tableTermMatrix = new HTable(conf,TABLE_Matrix);
tableTermMatrix.setAutoFlush(false);

Then inside your data insertion loop:

tableTermMatrix.put(put);
cnt++;
if (cnt >= 5000)
{
    cnt = 0;
    tableTermMatrix.flushCommits();
}

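This flushes the buffered Puts to the RegionServer once every 5000 entries. After the loop ends, call flushCommits() once more (or close the table) so that any Puts still sitting in the buffer are written.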
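
To make the counting logic easier to see, here is a minimal sketch of the same pattern that runs without an HBase cluster. It does not use the HBase client at all: BufferedPutSketch, its sink callback and getFlushCount() are hypothetical stand-ins I made up for the table's client-side buffer and the trip to the RegionServer. Only the shape of the loop (buffer, flush every N entries, one final flush) mirrors the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a table with AutoFlush turned off (NOT HBase API).
class BufferedPutSketch<T> {
    private final int batchSize;
    // Assumption: the sink represents one round trip to the RegionServer.
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();
    private int flushCount = 0;

    BufferedPutSketch(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Buffer the entry; send the whole batch once batchSize entries are waiting.
    void put(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flushCommits();
        }
    }

    // Send whatever is buffered. Does nothing if the buffer is empty.
    void flushCommits() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        flushCount++;
    }

    int getFlushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        BufferedPutSketch<String> table =
                new BufferedPutSketch<>(5000, batch -> batchSizes.add(batch.size()));

        for (int i = 0; i < 12000; i++) {
            table.put("row-" + i);
        }
        // Final flush: without it the last 2000 entries would never be sent.
        table.flushCommits();

        System.out.println(batchSizes); // prints [5000, 5000, 2000]
    }
}
```

With 12000 entries and a batch size of 5000, the sink is called three times instead of 12000 times. The last call only happens because of the flush after the loop.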


For me, after switching to this batched approach, the total insertion time into HBase dropped from
60 minutes to 2.5 minutes.


