Slide 26
HBase Deployment
● All access is via a service that provides a restricted API
● Ensure no long-running queries, deal with timeouts everywhere, ...
● Tune settings to work with a lot of data per node
● Set block size and compression for each Column Family
● Do not use the block cache for large scans (Scan.setCacheBlocks(false)) and ‘batch’ (Scan.setBatch) every time you touch fat columns
● Scripts to manage regions (balancing, merging, bulk delete)
● We host on dedicated servers
● Data replicated to backup clusters, where we run analytics
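
The first two bullets (a restricted service API, no long-running queries, timeouts everywhere) could be sketched roughly as below. This is a minimal illustration using only the Java standard library, not the deployment's actual service code: the class name `BoundedQueryService`, the 200 ms deadline, and the lambdas standing in for HBase reads are all assumptions made for the example.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical wrapper: every storage call made through it is bounded by a deadline. */
class BoundedQueryService implements AutoCloseable {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final long timeoutMillis;

    BoundedQueryService(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Runs the query on a worker thread and waits at most timeoutMillis.
     * Returns Optional.empty() on timeout (a null result also maps to empty).
     */
    <T> Optional<T> run(Callable<T> query) throws ExecutionException, InterruptedException {
        Future<T> future = pool.submit(query);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so the query does not keep running
            return Optional.empty();
        }
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        try (BoundedQueryService service = new BoundedQueryService(200)) {
            // A fast lookup completes well inside the deadline.
            Optional<String> fast = service.run(() -> "row-1");
            // A sleep stands in for a runaway scan; it is cut off at the deadline.
            Optional<String> slow = service.run(() -> {
                Thread.sleep(5_000);
                return "never returned";
            });
            System.out.println("fast query returned: " + fast.isPresent());
            System.out.println("slow query returned: " + slow.isPresent());
        }
    }
}
```

Routing every read through one wrapper like this keeps the timeout policy in a single place, so callers cannot issue an unbounded query by accident. A real service would also want to set the HBase client's own RPC and scanner timeouts, since cancelling the waiting thread does not by itself stop work on the region server.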