An introduction to the Apache Hadoop command set

An introduction to the Apache Hadoop command set.
What commands are available and what do they do? A
brief introduction to each command without in-depth
detail.

Mike Frampton

August 17, 2013
Transcript

  1. Apache Hadoop Command Set
     • What types?
     • What are they?
     • What do they do?
     • Environment
     • Configuration
     www.semtech-solutions.co.nz [email protected]
  2. Hadoop commands – What types?
     • User commands
     • Administration commands
     • Generic options for all commands
     • Configuration options
     • Environment
       – Variables, e.g. HADOOP_PREFIX
       – Aliases, e.g. hls = hadoop fs -ls
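The "generic options" bullet can be illustrated with a small sketch. It assumes a Hadoop 1.x client on the PATH; the function names, the NameNode address and the file paths are hypothetical. Generic options such as -fs and -D go after the command name and before the command's own options.

```shell
#!/bin/sh
# List a path on an explicitly named NameNode instead of the configured default.
# Example address (hypothetical): hdfs://nn.example.com:8020
ls_on() {
    namenode="$1"
    path="$2"
    hadoop fs -fs "$namenode" -ls "$path"
}

# Upload a file with a per-command replication factor,
# overriding dfs.replication for this one invocation only.
put_with_rep() {
    rep="$1"
    src="$2"
    dest="$3"
    hadoop fs -D dfs.replication="$rep" -put "$src" "$dest"
}
```

For example, `put_with_rep 1 /tmp/a.txt /user/hadoop/a.txt` would upload the file while asking for a single replica.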
  3. Hadoop commands – What are they? User Commands
     • archive – save files to a har archive
     • distcp – copy files or directories recursively
     • fs – file system commands
       – cat – copy file contents to stdout
       – chgrp – change the group associated with a file
       – chmod – change file permissions
       – chown – change file ownership
       – copyFromLocal – copy from a local file reference
       – copyToLocal – copy to a local file reference
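As a rough illustration of how several of these fs subcommands might be chained, the sketch below uploads a local file, sets its ownership and permissions, and prints it back. The function name, the owner and group (hadoop:analysts) and the mode are assumptions chosen for the example, not part of the original deck.

```shell
#!/bin/sh
# Upload a local file to HDFS and restrict access to it.
# Owner, group and mode are hypothetical example values.
publish_file() {
    src="$1"    # local file, e.g. /tmp/sales.csv
    dest="$2"   # HDFS destination, e.g. /user/hadoop/sales.csv
    hadoop fs -copyFromLocal "$src" "$dest" &&
    hadoop fs -chown hadoop:analysts "$dest" &&
    hadoop fs -chmod 640 "$dest" &&
    hadoop fs -cat "$dest"
}
```

Each step runs only if the previous one succeeded, so a failed upload does not lead to chown or chmod being attempted on a missing file.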
  4. Hadoop commands – What are they? User Commands
     • fs – file system commands
       – count – count of directories, files and bytes
       – cp – copy files
       – du – size of files and directories
       – dus – display a summary of file lengths
       – expunge – empty trash
       – get – copy files to the local file system
       – getmerge – like get, but merges the source files into one local file
       – ls – file listing
       – lsr – recursive ls
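The listing and sizing commands are often used together. The following is a minimal sketch, assuming a Hadoop 1.x client on the PATH; both function names and all paths are made up for illustration.

```shell
#!/bin/sh
# Summarise an HDFS directory in three views.
summarise_dir() {
    dir="$1"
    hadoop fs -ls "$dir"      # one level of file listing
    hadoop fs -count "$dir"   # directory count, file count, bytes
    hadoop fs -dus "$dir"     # one summary size for the whole tree
}

# Concatenate every file under an HDFS directory into one local file.
fetch_merged() {
    src_dir="$1"
    local_file="$2"
    hadoop fs -getmerge "$src_dir" "$local_file"
}
```

A typical use would be `fetch_merged /user/hadoop/job-output /tmp/result.txt` to collect the part files of a MapReduce job into a single local file.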
  5. Hadoop commands – What are they? User Commands
     • fs – file system commands
       – mkdir – make directory
       – moveFromLocal – like put, but deletes the local original
       – mv – move from source to destination
       – put – copy from the local file system to the destination file system
       – rm – remove a file
       – rmr – recursive delete
       – setrep – change file replication factor
       – stat – return file stat information
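A possible staging workflow built from these commands is sketched below: create a directory, put a file into it, adjust its replication factor and confirm the result with stat. The function names and arguments are hypothetical.

```shell
#!/bin/sh
# Stage a local file into an HDFS directory with a chosen replication factor.
stage_file() {
    src="$1"    # local file
    dir="$2"    # HDFS directory to create
    rep="$3"    # desired replication factor
    target="$dir/$(basename "$src")"
    hadoop fs -mkdir "$dir" &&
    hadoop fs -put "$src" "$dir/" &&
    hadoop fs -setrep "$rep" "$target" &&
    hadoop fs -stat "$target"
}

# Remove a staged directory and everything under it.
unstage_dir() {
    hadoop fs -rmr "$1"
}
```

Note that rmr deletes recursively, so `unstage_dir` should only ever be pointed at a directory that is safe to lose.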
  6. Hadoop commands – What are they? User Commands
     • fs – file system commands
       – tail – display end of file
       – test – check file existence / type
       – text – output file as text
       – touchz – create zero length file
     • fsck – HDFS file system check
     • fetchdt – get delegation token from name node
     • jar – run jar file
     • job – manage MapReduce jobs
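Because `hadoop fs -test -e` reports existence through its exit status, it can drive a shell conditional. The sketch below uses that to create an empty marker file only once, and adds a thin wrapper around fsck. The function names and the marker path are illustrative assumptions.

```shell
#!/bin/sh
# Create an empty marker file in HDFS unless it already exists.
mark_done() {
    marker="$1"   # e.g. /user/hadoop/output/_DONE (hypothetical)
    if hadoop fs -test -e "$marker"; then
        echo "marker already present: $marker"
    else
        hadoop fs -touchz "$marker"
    fi
}

# Check the health of an HDFS path, listing its files and blocks.
check_health() {
    hadoop fsck "$1" -files -blocks
}
```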
  7. Hadoop commands – What are they? User Commands
     • pipes – run a pipes job
     • queue – interact with and view job queues
     • version – get Hadoop version
     • CLASSNAME – run class named CLASSNAME
     • classpath – print the class path
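To make the CLASSNAME entry concrete: in Hadoop 1.x the version command is implemented by the class org.apache.hadoop.util.VersionInfo, so naming that class directly should print the same information. The sketch below assumes a Hadoop 1.x client on the PATH; the function names are hypothetical.

```shell
#!/bin/sh
# Print basic facts about the local Hadoop client.
client_info() {
    hadoop version                              # release and build details
    hadoop org.apache.hadoop.util.VersionInfo   # same class, run via CLASSNAME
    hadoop classpath                            # jars and directories in use
}

# Show the job queues known to the JobTracker.
show_queues() {
    hadoop queue -list
}
```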
  8. Hadoop commands – What are they? Administration Commands
     • balancer – run cluster balancing
     • daemonlog – get/set daemon log level
     • datanode – run HDFS data node
     • dfsadmin – run dfsadmin client
     • mradmin – run MapReduce admin client
     • jobtracker – run MapReduce jobtracker node
     • namenode – run the name node
     • secondarynamenode – run secondary name node
     • tasktracker – run task tracker node
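A few of the administration commands lend themselves to short wrappers. The sketch below is one possible arrangement, assuming Hadoop 1.x defaults: a balancer threshold of 10 percent and a NameNode web port of 50070. The function names and the host argument are hypothetical.

```shell
#!/bin/sh
# Report cluster capacity and whether the NameNode is in safe mode.
cluster_status() {
    hadoop dfsadmin -report
    hadoop dfsadmin -safemode get
}

# Rebalance block placement; the threshold is a percentage (default 10).
rebalance() {
    hadoop balancer -threshold "${1:-10}"
}

# Read the current log level of the NameNode class from its HTTP port.
# Port 50070 is assumed; adjust if the cluster uses a different one.
namenode_log_level() {
    host="$1"
    hadoop daemonlog -getlevel "$host:50070" \
        org.apache.hadoop.hdfs.server.namenode.NameNode
}
```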
  9. Hadoop Environment
     See the .bashrc for environment set up:

       ##export HADOOP_HOME=/usr/local/hadoop ## deprecated
       export HADOOP_PREFIX=/usr/local/hadoop
       export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386

       # shortcuts for the file system shell
       unalias hfs &> /dev/null
       alias hfs="hadoop fs"
       unalias hls &> /dev/null ; alias hls="hfs -ls"

       # start HDFS, then MapReduce
       unalias hup1 &> /dev/null ; alias hup1="cd $HADOOP_PREFIX/bin ; ./start-dfs.sh"
       unalias hup2 &> /dev/null ; alias hup2="cd $HADOOP_PREFIX/bin ; ./start-mapred.sh"

       # stop MapReduce, then HDFS
       unalias hdwn1 &> /dev/null ; alias hdwn1="cd $HADOOP_PREFIX/bin ; ./stop-mapred.sh"
       unalias hdwn2 &> /dev/null ; alias hdwn2="cd $HADOOP_PREFIX/bin ; ./stop-dfs.sh"

       # if using LZO compression then add entry here for viewing
       # LZO compressed files

       ##PATH=$PATH:$HADOOP_HOME/bin ## deprecated
       PATH=$PATH:$HADOOP_PREFIX/bin
       PATH=$PATH:$JAVA_HOME/bin
       export PATH
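Since the aliases and PATH entries above depend on HADOOP_PREFIX and JAVA_HOME, a quick sanity check can catch a misconfigured shell early. This is a sketch only; the function name and its messages are invented for illustration.

```shell
#!/bin/sh
# Verify that HADOOP_PREFIX and JAVA_HOME are set and each has a bin directory.
# Prints one line per problem; returns 0 only when no problem was found.
check_hadoop_env() {
    rc=0
    for name in HADOOP_PREFIX JAVA_HOME; do
        eval "value=\${$name:-}"    # look up the variable named by $name
        if [ -z "$value" ]; then
            echo "$name is not set"
            rc=1
        elif [ ! -d "$value/bin" ]; then
            echo "$name has no bin directory: $value/bin"
            rc=1
        fi
    done
    return $rc
}
```

Calling `check_hadoop_env` at the end of the .bashrc, or before running the start scripts, would flag a missing variable before any alias fails.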
  10. Hadoop Configuration
      • Configuration files under $HADOOP_PREFIX/conf
      • Initial set up in
        – core-site.xml
        – hdfs-site.xml
        – mapred-site.xml
      • Example from core-site.xml

        <property>
          <name>hadoop.tmp.dir</name>
          <value>/app/hadoop/tmp</value>
          <description>A base for other temporary directories.</description>
        </property>
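In a complete configuration file, property elements like the one above sit inside a single configuration root element. The sketch below generates such a file from the shell; the function name is hypothetical, and it overwrites any existing core-site.xml in the target directory, so it is best tried against a scratch directory first.

```shell
#!/bin/sh
# Write a minimal core-site.xml that sets hadoop.tmp.dir.
# WARNING: replaces any existing core-site.xml in the given directory.
write_core_site() {
    conf_dir="$1"   # e.g. $HADOOP_PREFIX/conf
    tmp_dir="$2"    # e.g. /app/hadoop/tmp
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
EOF
}
```

For instance, `write_core_site /tmp/conf-test /app/hadoop/tmp` would produce a file that can be inspected before anything is copied into a live conf directory.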
  11. Contact Us
      • Feel free to contact us at
        – www.semtech-solutions.co.nz
        – [email protected]
      • We offer IT project consultancy
      • We are happy to hear about your problems
      • You can just pay for those hours that you need to solve your problems