An introduction to Cloudera Impala

An introduction to Cloudera Impala: what is it and how does it work?
How can it bring real-time performance gains to Apache Hadoop?

Mike Frampton

August 14, 2013

Transcript

  1. Impala
    • What is it?
    • How does it work?
    • Performance
    • Formats
    • Architecture
    www.semtech-solutions.co.nz [email protected]
  2. Impala – What is it?
    • Ad hoc, real-time query for Hadoop
    • Open source
    • Developed by Cloudera
    • Based on Google's 2010 Dremel paper
    • Direct data access via the Impala engine
    • A future Hadoop Parquet update will
      – Add columnar binary storage to Hadoop
      – Improve Impala performance
  3. Impala – How does it work?
    • Direct data access
    • Query planning / coordination on data nodes
    • Node-based query engine
    • Low latency
    • Performance improvement
    • Query data on HDFS or HBase
    • Uses the same HiveQL syntax (SQL-like)
    • Has the Hue GUI
    • Allows table joins and aggregation
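To make "table joins and aggregation" in a SQL-like syntax concrete, here is a minimal sketch of such a query. It is not taken from the deck: the `customers` and `orders` tables, their columns, and the rows are hypothetical, and the query runs against Python's built-in `sqlite3` module purely as a local stand-in, not against Impala or Hive. The statement uses only a plain `JOIN ... ON`, `GROUP BY`, and `ORDER BY`, but it has not been tested on an Impala cluster.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
# SQLite is a local stand-in here; this does not connect to Impala or Hive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, region TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'north'), (2, 'south');
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

# A join between two tables followed by an aggregation per region.
QUERY = """
    SELECT c.region, COUNT(*) AS order_count, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    GROUP BY c.region
    ORDER BY c.region
"""

rows = conn.execute(QUERY).fetchall()
for region, order_count, total in rows:
    print(region, order_count, total)
```

The join matches each order to its customer, and the `GROUP BY` then collapses the matched rows into one count and one sum per region.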
  4. Impala – Performance
    Impala delivers performance gains
    • I/O-bound queries
      – Limited by hardware
      – Minimum 3 times faster
    • Complex queries
      – Multiple MapReduce stages
      – Minimum 7 times faster
    • Cached queries
      – Minimum 20 times faster
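As a rough illustration of what these minimum factors would mean in practice, the sketch below divides a baseline runtime by each factor from the slide. The slide does not name the baseline system or give any measured times, so the 420-second figure and the names `MIN_SPEEDUP` and `max_expected_runtime` are assumptions made up for this example.

```python
# Minimum speedup factors as quoted on the slide.
MIN_SPEEDUP = {
    "io_bound": 3,
    "complex_multi_stage": 7,
    "cached": 20,
}


def max_expected_runtime(baseline_seconds: float, query_type: str) -> float:
    """Upper bound on runtime if the quoted minimum speedup holds."""
    return baseline_seconds / MIN_SPEEDUP[query_type]


# Hypothetical baseline runtime in seconds, not a measured value.
BASELINE_SECONDS = 420.0

for query_type in MIN_SPEEDUP:
    print(query_type, max_expected_runtime(BASELINE_SECONDS, query_type))
```

Because the slide quotes minimum gains, each result is an upper bound on the runtime under that claim, not a prediction.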
  5. Impala – Formats
    Supported formats
      – Text & Sequence Files, which can be compressed as
        • Snappy
        • GZIP
        • BZIP
      – Future support for
        • Avro
        • RCFile
        • LZO text file
        • Parquet
  6. Impala – Requirements
    What does Impala need to run?
      – CentOS 6.2
      – or RHEL (Red Hat Enterprise Linux)
      – CDH 4.1 (Cloudera Hadoop Distribution)
      – Cloudera Manager (advised)
  7. Contact Us
    • Feel free to contact us at
      – www.semtech-solutions.co.nz
      – [email protected]
    • We offer IT project consultancy
    • We are happy to hear about your problems
    • You pay only for the hours you need to solve your problems