An introduction to Apache Sqoop

An introduction to Apache Sqoop: what is it, and how does it assist in
large-volume data transfer between Hadoop and external sources?

Mike Frampton

August 21, 2013

Transcript

  1. Apache Sqoop • What is it? • How does it work? • Interfaces • Example • Architecture www.semtech-solutions.co.nz [email protected]
  2. Sqoop – What is it? • A command line interface (plus a web interface in Sqoop2) • For data import / export to and from Hadoop • Uses map jobs from MapReduce • Supports incremental loads • Written in Java • Licensed by Apache • Uses plugins for new types of data source
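The "incremental loads" bullet can be illustrated with a small sketch. It only prints the command instead of running it, so it works without a Sqoop installation. The flags `--incremental`, `--check-column` and `--last-value` are standard Sqoop 1 import options; the JDBC URL, user, table and column names below are placeholders assumed for illustration and do not come from the slides.

```shell
# Assumed placeholder values, not taken from the deck.
JDBC_URL="jdbc:mysql://dbhost:3306/db3"
DB_USER="etl_user"
TABLE="orders"
CHECK_COLUMN="order_id"

# Print an incremental-append import command.
# $1 is the highest CHECK_COLUMN value already imported; Sqoop would
# fetch only rows whose CHECK_COLUMN is greater than this value.
build_incremental_import() {
  printf 'sqoop import --connect %s --username %s -P --table %s --incremental append --check-column %s --last-value %s\n' \
    "$JDBC_URL" "$DB_USER" "$TABLE" "$CHECK_COLUMN" "$1"
}

build_incremental_import 1000
```

In append mode only new rows are picked up; rows that are updated in place would need the `lastmodified` mode with a timestamp column instead. The `-P` flag makes Sqoop prompt for the password rather than reading it from the command line.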
  3. Sqoop – How does it work? • Data sliced into partitions • Mappers transfer data • Data types determined via metadata • Many data transfer formats supported, e.g. CSV, Avro • Can import into – Hive (use --hive-import flag) – HBase (use --hbase-* flags)
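To make "data sliced into partitions" concrete, here is a simplified sketch of how a numeric key range might be divided among mappers. It is not Sqoop's actual splitter code, only an illustration of the idea: in Sqoop 1 the column is chosen with `--split-by` and the number of mappers with `--num-mappers`, and each mapper then reads one slice of the range. The function name `split_ranges` and the column name `id` are hypothetical.

```shell
# split_ranges MIN MAX MAPPERS
# Print one WHERE-style range per mapper, covering MIN..MAX inclusive.
split_ranges() {
  min=$1
  max=$2
  mappers=$3
  span=$(( max - min + 1 ))
  lo=$min
  i=0
  while [ "$i" -lt "$mappers" ]; do
    size=$(( span / mappers ))
    # Spread any remainder over the first few slices.
    if [ "$i" -lt $(( span % mappers )) ]; then
      size=$(( size + 1 ))
    fi
    hi=$(( lo + size - 1 ))
    echo "mapper $i: id >= $lo AND id <= $hi"
    lo=$(( hi + 1 ))
    i=$(( i + 1 ))
  done
}

split_ranges 1 100 4
```

Even slices like these only balance the work when the key values are spread evenly; a skewed key leaves some mappers with far more rows than others, which is why the choice of split column matters.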
  4. Sqoop – Interfaces • Get data from – Relational databases – Data warehouses – NoSQL databases • Load to Hive and HBase • Integrates with Oozie for scheduling
  5. Sqoop – Example • An example Sqoop command to load data from MySQL into Hive:

    bin/sqoop-import --connect jdbc:mysql://<mysql host>:<mysql port>/db3 \
      --username <username> \
      --password <password> \
      --table <tableName> \
      --hive-table <Hive tableName> \
      --create-hive-table \
      --hive-import \
      --hive-home <hive path>
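The deck shows only the import direction. As a rough counterpart for the "export" half of "import / export", the sketch below prints (but does not run) a Sqoop 1 export command that would push files from an HDFS directory back into a database table. `sqoop export`, `--table` and `--export-dir` are standard Sqoop options; the JDBC URL, user, table name and warehouse path are assumptions made up for illustration.

```shell
# Assumed placeholder values, not taken from the deck.
JDBC_URL="jdbc:mysql://dbhost:3306/db3"
DB_USER="etl_user"

# build_export TABLE EXPORT_DIR
# Print an export command for the given target table and HDFS directory.
build_export() {
  printf 'sqoop export --connect %s --username %s -P --table %s --export-dir %s\n' \
    "$JDBC_URL" "$DB_USER" "$1" "$2"
}

build_export orders_summary /user/hive/warehouse/orders_summary
```

The target table must already exist in the database. If the files were written by Hive with its default field delimiter, the export would probably also need `--input-fields-terminated-by` set to match.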
  6. Sqoop – Architecture • Sqoop has moved from Sqoop1 to Sqoop2 • Changed from client to server install • Now has web and command line access • Server now accesses Hive & HBase • Oozie uses REST API
  7. Contact Us • Feel free to contact us at – www.semtech-solutions.co.nz – [email protected] • We offer IT project consultancy • We are happy to hear about your problems • You can pay for just the hours you need to solve your problems