An Introduction to Apache HBase

Apache Hadoop HBase
• What is it?
• Why use it?
• Architecture
• Storage
• Related Projects

HBase – What is it?
• A Hadoop data store
• A NoSQL store for big data
• Open source, written in Java
• A distributed database
• Automatic sharding: table data is spread over the cluster
• Automatic region server failover
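
The automatic sharding bullet can be pictured with a toy model: a table is split into regions, each covering a contiguous range of row keys, and each region is hosted by one region server. The sketch below is plain Python, not the HBase client API. The region start keys and server names are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical region layout: each region starts at a row key and runs
# up to (but not including) the next region's start key.
REGION_START_KEYS = ["", "g", "n", "t"]          # "" marks the start of the table
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs1"]    # made-up server names


def locate_region(row_key: str) -> int:
    """Return the index of the region whose key range contains row_key."""
    # bisect_right finds the first start key greater than row_key;
    # the region just before that position owns the key.
    return bisect_right(REGION_START_KEYS, row_key) - 1


def locate_server(row_key: str) -> str:
    """Return the (hypothetical) region server hosting row_key."""
    return REGION_SERVERS[locate_region(row_key)]
```

Because rows are assigned by sorted key range, neighbouring row keys land in the same region, which is why row key design matters for spreading load.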

HBase – Why / When use it?
• Data in billions of rows
• Complex data
• High volume of I/O
• A large number of data nodes (5 or more)
• No need for extra RDBMS functions, e.g. transactions

HBase – Architecture
Where does HBase sit in relation to Hadoop?

HBase – Architecture
• HBase is a data store
• Uses Hadoop for distributed storage
• Data is stored across region servers
• Region server data is spread across HDFS data nodes
• A write-ahead log (WAL) is used to record changes

HBase – Storage
• What is the architecture?

HBase – Storage
• Client makes a call, e.g. put
• Request is sent by RPC as a key value to the region server
• Key value is routed to the region for the row
• Data is written to the WAL
• Data is written to the region MemStore
• If the region server crashes, the WAL can be used to recover data
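
The write path above can be sketched as a small in-memory model. This is a simplified illustration in plain Python, not HBase code: the `Region` class, its methods, and the sample row and column names are all hypothetical, and the WAL is modelled as a list rather than a file on HDFS.

```python
class Region:
    """Toy model of one region's write path: WAL first, then memstore."""

    def __init__(self):
        self.wal = []        # append-only log of every change, in order
        self.memstore = {}   # in-memory view of the latest values

    def put(self, row, column, value):
        # 1. Record the change in the write-ahead log.
        self.wal.append((row, column, value))
        # 2. Apply the change to the in-memory store.
        self.memstore[(row, column)] = value

    def get(self, row, column):
        return self.memstore.get((row, column))

    @classmethod
    def recover(cls, wal):
        """Rebuild a region's memstore by replaying a surviving WAL in order."""
        region = cls()
        for row, column, value in wal:
            region.put(row, column, value)
        return region


# Usage: write two versions of a cell, then simulate a crash in which the
# memstore is lost but the WAL survives.
region = Region()
region.put("row1", "cf:name", "alice")
region.put("row1", "cf:name", "bob")
surviving_wal = list(region.wal)

recovered = Region.recover(surviving_wal)
latest = recovered.get("row1", "cf:name")
```

Writing to the log before the in-memory store is what makes recovery possible: any change that reached the memstore is guaranteed to be in the log as well.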

HBase – Related Projects
• Apache Flume – move large data sets to Hadoop
• Apache Sqoop – command line tool, move RDBMS data to Hadoop
• Apache HBase – non-relational database
• Apache Pig – analyse large data sets
• Apache Oozie – workflow scheduler
• Apache Mahout – machine learning and data mining
• Hue – Hadoop user interface
• Apache ZooKeeper – distributed configuration and coordination service

Contact Us
• Feel free to contact us at
  – www.semtech-solutions.co.nz
  – [email protected]
• We offer IT project consultancy
• We are happy to hear about your problems
• You pay only for the hours you need to solve your problems

Mike Frampton
