Tuesday, September 2, 2014

HBase: 'Put' Performance Tuning


When you have millions of rows to insert into HBase, batch your Puts instead of sending them one at a time; this speeds up inserts considerably.
To do that, set the table's AutoFlush option to false so that Puts are buffered on the client:


HTable tableTermMatrix = new HTable(conf,TABLE_Matrix);
tableTermMatrix.setAutoFlush(false);

Then inside your data insertion loop:

tableTermMatrix.put(put);   // buffered on the client while AutoFlush is off
cnt++;
if (cnt >= 5000)
{
    cnt = 0;
    tableTermMatrix.flushCommits();   // send the buffered Puts in one batch
}

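This flushes the buffered data to the RegionServer once every 5000 entries instead of once per Put.
After the loop ends, call flushCommits() one more time (or close the table) so that the last partial batch is also written.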
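
To see why batching helps, here is a minimal sketch that only counts round trips. It does not use the HBase client at all: the class BatchedPutSketch, its put/flushCommits methods and the simulate helper are hypothetical stand-ins written for illustration, and the 12000-row figure is just an example value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a client-side write buffer. This is NOT the HBase API;
// it only models "buffer locally, send in batches" so the round trips can be counted.
class BatchedPutSketch {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private int roundTrips = 0;
    private int rowsSent = 0;

    BatchedPutSketch(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // Buffer one row and flush once the buffer holds batchSize rows.
    void put(String row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flushCommits();
        }
    }

    // Pretend to send everything in the buffer in a single round trip.
    void flushCommits() {
        if (buffer.isEmpty()) {
            return; // nothing pending, so no round trip is made
        }
        rowsSent += buffer.size();
        buffer.clear();
        roundTrips++;
    }

    // Insert totalRows rows and return {roundTrips, rowsSent}.
    static int[] simulate(int totalRows, int batchSize) {
        BatchedPutSketch table = new BatchedPutSketch(batchSize);
        for (int i = 0; i < totalRows; i++) {
            table.put("row-" + i);
        }
        table.flushCommits(); // final flush for the last partial batch
        return new int[] { table.roundTrips, table.rowsSent };
    }

    public static void main(String[] args) {
        int[] batched = simulate(12000, 5000);
        int[] unbatched = simulate(12000, 1);
        System.out.println("round trips with batches of 5000: " + batched[0]);
        System.out.println("round trips with one row per flush: " + unbatched[0]);
    }
}
```

In this model the final flushCommits() after the loop is what delivers the last 2000 rows; without it they would stay in the buffer.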


For me, after switching to batched Puts, the total insertion time into HBase dropped from
60 minutes to 2.5 minutes, roughly a 24x speedup.