
An example Apache Hadoop Yarn upgrade


This is a simple example of how Hadoop on
Ubuntu Linux can be upgraded from V1 to Yarn.
It shows the steps, the configuration, a
mapreduce check and the errors encountered.

Mike Frampton

August 31, 2013


Transcript

  1. Apache Yarn Upgrade
     • Example upgrade
     • From V1 -> Yarn
     • Environment
     • Approach
     • Install steps
     • Install check
     www.semtech-solutions.co.nz [email protected]
  2. Yarn Upgrade Environment
     • Java OpenJDK 1.6.0_27
     • Ubuntu 12.04
     • Maven 3.0.4
     • Hadoop 1.2.0
     • Mahout 0.9
     • Hadoop to install – 2.0.6-alpha
     Full details are available from our web site under the guides folder.
  3. Yarn Upgrade Approach
     • Install alongside existing Hadoop on all nodes
     • Use existing hdfs
     • Change cfg files on all nodes
     • Set up as single nodes and test via mapreduce
     • Create cluster and test via mapreduce
     • Check web GUI access
  4. Yarn Upgrade Install
     • Build with Maven into a distribution directory
       mvn clean package -Pdist -Dtar -DskipTests -Pnative
       release created under ./hadoop-dist/target/hadoop-2.0.6-alpha
     • Only skip tests after the first build, to speed things up
     • Configure $HOME/.bashrc
       – HADOOP_COMMON_HOME
       – HADOOP_HDFS_HOME
       – HADOOP_MAPRED_HOME
       – HADOOP_YARN_HOME
       – HADOOP_CONF_DIR
       – YARN_CONF_DIR
       – MAPRED_CONF_DIR
       – HADOOP_PREFIX
       – PATH
       – YARN_CLASSPATH
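The $HOME/.bashrc additions might look like the following sketch. The install path is a hypothetical example; point it at wherever the Maven-built release was unpacked.

```shell
# Hypothetical install location -- adjust to where hadoop-2.0.6-alpha was unpacked
export HADOOP_PREFIX=/usr/local/hadoop-2.0.6-alpha

# In this layout all components live under the one prefix
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX

# All daemons read their config from the same directory
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export YARN_CONF_DIR=$HADOOP_CONF_DIR
export MAPRED_CONF_DIR=$HADOOP_CONF_DIR

# Put the bin and sbin scripts on the PATH
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
```

Keeping every *_HOME variable pointing at one prefix keeps a side-by-side install self-contained, so the existing V1 installation is untouched.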
  5. Yarn Upgrade Install
     • Set up core-site.xml
       cd $HADOOP_COMMON_HOME/etc/hadoop
     • Alter values for
       – fs.default.name
       – hadoop.tmp.dir
       – fs.checkpoint.dir
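A minimal core-site.xml covering the three properties above might look like this; the hostname "master" and the /app/hadoop paths are hypothetical placeholders.

```xml
<?xml version="1.0"?>
<!-- core-site.xml sketch: hostname and paths are hypothetical examples -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/app/hadoop/checkpoint</value>
  </property>
</configuration>
```

Because the approach reuses the existing hdfs, fs.default.name should match the filesystem URI the V1 installation was serving.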
  6. Yarn Upgrade Install
     • Set up hdfs-site.xml
       cd $HADOOP_HDFS_HOME/etc/hadoop
     • Alter values for
       – dfs.name.dir
       – dfs.data.dir
       – dfs.http.address
       – dfs.secondary.http.address
       – dfs.https.address
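An hdfs-site.xml sketch for the properties above; again the hostname, ports, and directories are illustrative, and dfs.name.dir / dfs.data.dir should point at the existing V1 hdfs directories if the data is being reused.

```xml
<?xml version="1.0"?>
<!-- hdfs-site.xml sketch: hostname, ports, and paths are hypothetical -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/app/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/app/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>master:50090</value>
  </property>
  <property>
    <name>dfs.https.address</name>
    <value>master:50470</value>
  </property>
</configuration>
```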
  7. Yarn Upgrade Install
     • Set up yarn-site.xml
       cd $YARN_CONF_DIR
     • Alter values for
       – yarn.resourcemanager.resource-tracker.address
       – yarn.resourcemanager.scheduler.address
       – yarn.resourcemanager.scheduler.class
       – yarn.resourcemanager.address
       – yarn.nodemanager.local-dirs
       – yarn.nodemanager.address
       – yarn.nodemanager.resource.memory-mb
       – yarn.nodemanager.remote-app-log-dir
       – yarn.nodemanager.log-dirs
       – yarn.nodemanager.aux-services
       – yarn.web-proxy.address
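A yarn-site.xml sketch covering a subset of the properties above. The hostname "master" and the port numbers are hypothetical; note that the 2.0.x line uses the aux-service name mapreduce.shuffle (later releases renamed it to mapreduce_shuffle).

```xml
<?xml version="1.0"?>
<!-- yarn-site.xml sketch: hostname and ports are hypothetical examples -->
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
  <property>
    <!-- enables the shuffle service that mapreduce jobs need -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
</configuration>
```

If yarn.nodemanager.aux-services is missed, mapreduce jobs submit but fail during the shuffle phase, so it is worth checking first when the test job below does not complete.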
  8. Yarn Upgrade Install
     • Set up mapred-site.xml
       cd $MAPRED_CONF_DIR
     • Alter values for
       – mapreduce.cluster.temp.dir
       – mapreduce.cluster.local.dir
       – mapreduce.jobhistory.address
       – mapreduce.jobhistory.webapp.address
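A mapred-site.xml sketch for the four properties above; the hostname and /app/hadoop directories are hypothetical, and the ports shown are the conventional job history defaults.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml sketch: hostname and paths are hypothetical -->
<configuration>
  <property>
    <name>mapreduce.cluster.temp.dir</name>
    <value>/app/hadoop/mapred/temp</value>
  </property>
  <property>
    <name>mapreduce.cluster.local.dir</name>
    <value>/app/hadoop/mapred/local</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
```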
  9. Yarn Upgrade Install
     • Set up capacity-scheduler.xml
       cd $HADOOP_YARN_HOME/etc/hadoop
     • Alter values for
       – yarn.scheduler.capacity.maximum-applications
       – yarn.scheduler.capacity.maximum-am-resource-percent
       – yarn.scheduler.capacity.resource-calculator
       – yarn.scheduler.capacity.root.queues
       – yarn.scheduler.capacity.child.queues
       – yarn.scheduler.capacity.child.unfunded.capacity
       – yarn.scheduler.capacity.child.default.capacity
       – yarn.scheduler.capacity.root.capacity
       – yarn.scheduler.capacity.root.unfunded.capacity
       – yarn.scheduler.capacity.root.default.capacity
       – yarn.scheduler.capacity.root.default.user-limit-factor
       – yarn.scheduler.capacity.root.default.maximum-capacity
       – yarn.scheduler.capacity.root.default.state
       – yarn.scheduler.capacity.root.default.acl_submit_applications
       – yarn.scheduler.capacity.root.default.acl_administer_queue
       – yarn.scheduler.capacity.node-locality-delay
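For a single-queue test setup, only a few of the properties above need non-trivial values; the sketch below shows one root queue named "default" taking 100% of capacity, with illustrative limits.

```xml
<?xml version="1.0"?>
<!-- capacity-scheduler.xml sketch: a single "default" queue, values illustrative -->
<configuration>
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
  </property>
  <property>
    <!-- "*" allows any user to submit to the queue -->
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>*</value>
  </property>
</configuration>
```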
  10. Yarn Upgrade Install
     • Start Resource Manager
       cd $HADOOP_YARN_HOME
       sbin/yarn-daemon.sh start resourcemanager
     • Start Node Manager
       cd $HADOOP_YARN_HOME
       sbin/yarn-daemon.sh start nodemanager
     • Test via mapreduce job
       cd $HADOOP_MAPRED_HOME/share/hadoop/mapreduce
       $HADOOP_COMMON_HOME/bin/hadoop jar \
         hadoop-mapreduce-examples-2.0.6-alpha.jar randomwriter out
  11. Yarn Upgrade Install
     • Mapreduce job should end with
       BYTES_WRITTEN=1073750341
       RECORDS_WRITTEN=102099
       File Input Format Counters
         Bytes Read=0
       File Output Format Counters
         Bytes Written=1085699265
       Job ended: Sun Aug 25 12:45:35 NZST 2013
       The job took 89 seconds.
     • Run this test on each node being upgraded
  12. Yarn Upgrade Install
     • Stop the servers
       cd $HADOOP_YARN_HOME
       sbin/yarn-daemon.sh stop resourcemanager
         stopping resourcemanager
       sbin/yarn-daemon.sh stop nodemanager
         stopping nodemanager
     • Alter Hadoop env
       cd $HADOOP_CONF_DIR
       vi hadoop-env.sh
       Add a JAVA_HOME definition at the end, i.e.
       export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
  13. Yarn Upgrade Install
     • Alter the $HADOOP_CONF_DIR/slaves file
       – Add details ( one per line ) for slave nodes
     • Format the cluster
       – DON'T have the cluster running or you will lose data
       – hdfs namenode -format
     • Now proceed to start the cluster
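The two steps above might look like the following sketch; the hostnames slave1 and slave2 are hypothetical and should be replaced with the actual slave node names.

```shell
# Write one hypothetical slave hostname per line into the slaves file
cat > $HADOOP_CONF_DIR/slaves <<EOF
slave1
slave2
EOF

# Format the namenode metadata -- run ONLY while the cluster is stopped,
# otherwise existing hdfs data will be lost
hdfs namenode -format
```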
  14. Yarn Upgrade Install
     cd $HADOOP_COMMON_HOME
     sbin/hadoop-daemon.sh --config $HADOOP_COMMON_HOME/etc/hadoop --script hdfs start namenode
     sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
     cd $HADOOP_YARN_HOME
     sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
     sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
     bin/yarn start proxyserver --config $HADOOP_CONF_DIR
     cd $HADOOP_MAPRED_HOME
     sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
  15. Yarn Upgrade Install
     • Use jps to check the servers are running
       jps
       5856 DataNode
       6434 Jps
       5776 NameNode
       6181 NodeManager
       6255 WebAppProxyServer
       5927 ResourceManager
       6352 JobHistoryServer
     • Then run the same mapreduce job on the cluster
  16. Contact Us
     • Feel free to contact us at
       – www.semtech-solutions.co.nz
       – [email protected]
     • We offer IT project consultancy
     • We are happy to hear about your problems
     • You pay only for the hours you need to solve them