
Automating the Hadoop Stack with Chef


San Diego DevOps

April 17, 2013

Transcript

  1. Agenda
     Shopzilla - who are we?
     Hadoop Primer
     Hadoop at Shopzilla
     What challenges do we face?
     What are we doing to solve them?
     Details, details, details

  2. Shopzilla, Inc. is a leading source for connecting buyers and sellers online.
     Global audience of over 40 million shoppers each month.
     100 million products from tens of thousands of retailers.

  3. Hadoop
     The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.
     Modules: Common, HDFS, YARN, MapReduce
     Cluster roles: NameNode, Secondary NameNode, JobTracker, DataNodes with TaskTrackers

  4. Add-ons
     Ambari™: A web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters.
     Avro™: A data serialization system.
     Cassandra™: A scalable multi-master database with no single points of failure.
     Chukwa™: A data collection system for managing large distributed systems.
     HBase™: A scalable, distributed database that supports structured data storage for large tables.
     Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
     Mahout™: A scalable machine learning and data mining library.
     Pig™: A high-level data-flow language and execution framework for parallel computation.
     ZooKeeper™: A high-performance coordination service for distributed applications.

  5. Cloudera Distribution
     5 clusters: 2 production, 2 staging/QA, 1 POC
     Clusters range from 129 nodes down to 16 nodes.

  6. Hostnames
     Format: hadoop + cluster name + node role + environment + node #
     hadoop poc nn      dev 001
     hadoop poc dn      dev 021
     hadoop poc jt      dev 001
     hadoop poc bastion dev 001
     e.g. hadooppocdndev021 = cluster "poc", role "dn" (data node), environment "dev", node 021
     (a small parsing sketch follows; the Chef recipe in item 18 does the real parsing)

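     To make the convention concrete, here is a small Python sketch of the decomposition. It is illustrative only and not from the deck; the regex and field names are assumptions.

         import re

         # Hypothetical illustration of the naming scheme: hadoop<cluster><role><env><num>
         HOST_RE = re.compile(
             r"^hadoop(?P<cluster>.+?)(?P<role>nn|dn|jt|bastion)"
             r"(?P<env>prod|stage|qa|dev)(?P<num>[0-9]{3})$")

         def parse(hostname):
             m = HOST_RE.match(hostname)
             return m.groupdict() if m else None

         print(parse("hadooppocdndev021"))
         # {'cluster': 'poc', 'role': 'dn', 'env': 'dev', 'num': '021'}
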
  7. Build Flow - Preflight  (PXE, dhcp, Asset Tracker)
     Does 00:1B:21:2C:C0:E8 belong to a known asset?
     Is "kickstart" enabled?
     What is the hostname of the asset?
     What OS and version is the host set to use?
     Get the IP address via DNS.
     Get the asset ID.

  8. Build Flow - PXE  (PXE, dhcp, Asset Tracker ✓)
     PXE boot options:
     default Ubuntu12.04-x86_64.vmlinuz
     append console-setup/layout=us preseed/url=http://ks.shopzilla.laxhq/ubgen.cgi?id=28353&type=hw console=ttyS1,19200n8 \
       locale=en_US text hostname=x netcfg/dhcp_timeout=120 initrd=Ubuntu12.04-x86_64.initrd.img BOOTIF=01-84-8F-69-FE-F6-5C \
       auto=true interface=auto serial nofb

  9. Build Flow - PXE  (continued)
     The same PXE boot options as above, now matched to MAC address 00:1B:21:2C:C0:E8 (asset and kickstart checks ✓ ✓).

  10. Build Flow - Preseed
      00:1B:21:2C:C0:E8
      From the PXE boot options above, the key parameter is the preseed URL:
      preseed/url=http://ks.shopzilla.laxhq/ubgen.cgi?id=28353&type=hw

  11. Build Flow - Preseed
      00:1B:21:2C:C0:E8 → preseed/url=http://ks.shopzilla.laxhq/ubgen.cgi?id=28353&type=hw → Asset Tracker
      ubgen.cgi:
      ‣ Perl + Template module to build the preseed file
      ‣ Takes the asset ID as the argument
      ‣ Collects further data from the asset tracker
      (a rough Python illustration of the idea follows)

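      The real ubgen.cgi is Perl using the Template module; purely as an illustration of the idea, here is a Python sketch. The asset record, field names and template fragment are hypothetical.

          from string import Template

          # Hypothetical asset record as it might come back from the asset tracker.
          asset = {"id": "28353", "hostname": "hadooppocdndev021",
                   "os_name": "Ubuntu", "os_version": "12.04", "model": "PowerEdge C6220"}

          # A tiny preseed fragment; the real templates also pick partitioning
          # recipes based on model and hostname pattern (see the next slides).
          PRESEED = Template("""\
          d-i debian-installer/locale string en_US
          d-i netcfg/get_hostname string $hostname
          d-i partman-auto/method string regular
          """)

          def build_preseed(asset):
              # Render the template with the values collected for this asset ID.
              return PRESEED.substitute(hostname=asset["hostname"])

          print(build_preseed(asset))
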
  12. Build Flow - Preseed
      00:1B:21:2C:C0:E8 → preseed/url=http://ks.shopzilla.laxhq/ubgen.cgi?id=28353&type=hw → Asset Tracker
      Partitioning recipe for name node / job tracker hosts:

      [% IF KickstartEnable AND KickstartEnable != 'Diskless'; %]
      [% IF OSName AND OSName == 'Ubuntu' AND OSVersion AND (matches = OSVersion.match('12.04')); %]
      [% IF Model AND Model == 'PowerEdge C6220' AND Name AND (matches = Name.match("hadoop.*(nn|jt).*.sea$")); %]
      d-i partman-auto/disk string /dev/sda
      d-i partman-auto/method string regular
      d-i partman-auto/purge_lvm_from_device boolean true
      d-i partman-auto/expert_recipe string \
        boot-root :: \
          1 1 1 free \
            method{ biosgrub } \
          . \
          40 50 100 ext3 \
            $primary{ } $bootable{ } \
            method{ format } format{ } \
            use_filesystem{ } \
            filesystem{ ext3 } \
            mountpoint{ /boot } \
          . \
          8000 70 8000 linux-swap \
            method{ swap } format{ } \
          . \
          500 10000 1000000000 ext4 \
            method{ format } format{ } \
            use_filesystem{ } \
            filesystem{ ext4 } \
            mountpoint{ / } \
          .

  13. Build Flow - Preseed (continued)
      00:1B:21:2C:C0:E8 → preseed/url=http://ks.shopzilla.laxhq/ubgen.cgi?id=28353&type=hw → Asset Tracker
      Partitioning recipe for data node hosts:

      [% ELSIF Model AND Model == 'PowerEdge C6220' AND Name AND ((matches = Name.match("hadoop.*dn.*.sea$"))); %]
      d-i partman-auto/disk string /dev/sda
      d-i partman-auto/method string regular
      d-i partman-auto/purge_lvm_from_device boolean true
      d-i partman-auto/expert_recipe string \
        boot-root :: \
          1 1 1 free \
            method{ biosgrub } \
          . \
          40 50 100 ext3 \
            $primary{ } $bootable{ } \
            method{ format } \
            format{ } \
            use_filesystem{ } \
            filesystem{ ext3 } \
            mountpoint{ /boot } \
          . \
          25600 60 51200 ext4 \
            method{ format } \
            format{ } \
            use_filesystem{ } \
            filesystem{ ext4 } \
            mountpoint{ / } \
          . \
          8000 70 8000 linux-swap \
            method{ swap } format{ } \
          . \
          500 10000 1000000000 ext4 \
            method{ format } format{ } \
            use_filesystem{ } \
            filesystem{ ext4 } \
            mountpoint{ /data/1 } \
            options/noatime{ noatime } \
          . \

  14. Build Flow - Preseed
      00:1B:21:2C:C0:E8 → preseed/url=http://ks.shopzilla.laxhq/ubgen.cgi?id=28353&type=hw → Asset Tracker
      Limitation - preseed can only manage one volume. The remaining data disks are formatted post-install:

      #!/usr/bin/env bash
      # Yes this is really dumb and sucks, but you can't do multiple disks that aren't in an lvm according to all the docs we could find.
      HOSTNAME=`hostname --fqdn`
      if echo $HOSTNAME | egrep 'hadoop.*dn.*.sea$' || echo $HOSTNAME | egrep 'hadoop.*dev.*.sea'; then
        USEDDEV=`mount | grep '/boot' | awk '{gsub('/[0-9]/',"");print $1}'`
        ALLDEV=`ls /dev/sd* | grep "sd[a-z]$"`
        DEVCNT=`echo "$ALLDEV" | wc -l`
        DATACNT=2
        for sd in `echo "$ALLDEV" | grep -v $USEDDEV`; do
          echo ';' | sfdisk "$sd"
          sleep 2
          mkfs.ext4 -m1 -O dir_index,extent,sparse_super "$sd"1
          # mount each /dev/sd* to /data/#
          echo "${sd}1 /data/$DATACNT ext4 noatime 0 2" >> /etc/fstab
          mkdir -p /data/$DATACNT
          DATACNT=`expr $DATACNT + 1`
        done
      fi

  15. Build Flow - Preseed
      00:1B:21:2C:C0:E8 → preseed/url=http://ks.shopzilla.laxhq/ubgen.cgi?id=28353&type=hw → Asset Tracker

      chef chef/chef_server_url string http://chef.shopzilla.com:4000/
      d-i preseed/late_command string in-target wget -q http://ks.shopzilla.laxhq/ksdone.cgi?id=[% id %];\
        in-target sh -c '/usr/bin/curl http://ks.shopzilla.laxhq/chef/chef-configure.sh | sh';\
        in-target sh -c '/usr/bin/curl http://ks.shopzilla.laxhq/chef/hadoop-disks.sh | sh'

      (hadoop-disks.sh is the disk-formatting script from the previous slide.)

  16. Build Flow - Preseed
      00:1B:21:2C:C0:E8 → '/usr/bin/curl http://ks.shopzilla.laxhq/chef/chef-configure.sh | sh'
      Install certificates and configure the client:

      #!/bin/sh
      /etc/init.d/chef-client stop
      echo > /var/log/chef/client.log
      set -e
      mkdir -p /etc/chef /root/.chef
      cat > /etc/chef/validation.pem <<EOF
      -----BEGIN RSA PRIVATE KEY-----
      ...
      -----END RSA PRIVATE KEY-----
      EOF
      cat > /root/.chef/kickstart.pem <<EOF
      -----BEGIN RSA PRIVATE KEY-----
      ...
      -----END RSA PRIVATE KEY-----
      EOF
      cat > /root/.chef/knife.rb <<EOF
      log_level               :info
      log_location            STDOUT
      node_name               'kickstart'
      client_key              '/root/.chef/kickstart.pem'
      validation_client_name  'chef-validator'
      validation_key          '/etc/chef/validation.pem'
      chef_server_url         'http://chef.shopzilla.com:4000/'
      cache_type              'BasicFile'
      cache_options( :path => '/root/.chef/checksums' )
      EOF

  17. Build Flow - Preseed
      00:1B:21:2C:C0:E8 → '/usr/bin/curl http://ks.shopzilla.laxhq/chef/chef-configure.sh | sh'
      Set the node's chef environment:

      ENV=prod
      if echo $HOSTNAME | grep -Eq 'stage[0-9]{3}'
      then
        ENV=stage
      fi
      echo "node_name '$HOSTNAME'" >> /etc/chef/client.rb
      cat > /etc/chef/node.json <<EOF
      {
        "name": "$HOSTNAME",
        "chef_environment": "$ENV",
        "json_class": "Chef::Node",
        "automatic": { },
        "normal": { },
        "chef_type": "node",
        "default": { },
        "override": { },
        "run_list": [ "role[base]" ]
      }
      EOF
      knife client delete $HOSTNAME -y --config /root/.chef/knife.rb --key /root/.chef/kickstart.pem || true
      knife node delete $HOSTNAME -y --config /root/.chef/knife.rb --key /root/.chef/kickstart.pem || true
      knife node from file /etc/chef/node.json --config /root/.chef/knife.rb --key /root/.chef/kickstart.pem
      exit 0

  18. Build Flow - Chef
      hadooppocdndev021 - parse the hostname:

      if node.name.match(/^hadoop.*(dn|nn|jt|bastion).*[0-9]../)
        hostptrn   = node.hostname.gsub(/^hadoop/,"")
        hostenv    = hostptrn.match(/prod[0-9]..|dev[0-9]..|stage[0-9]..|qa[0-9]../)[0].gsub(/[0-9]../,"")
        hostrole   = hostptrn.gsub(/#{hostenv}[0-9]../,"").match(/(jt|nn|dn|bastion)$/)[0]
        hostnum    = hostptrn.match(/[0-9]../)[0]
        cluster    = hostptrn.gsub(/#{hostenv}[0-9]../,"").gsub(/#{hostrole}$/,"")
        primarynn  = node.name.gsub(/(jt|nn|dn|bastion)(#{hostenv}[0-9]..)/,"nn#{hostenv}001")
        jobtracker = node.name.gsub(/(jt|nn|dn|bastion)(#{hostenv}[0-9]..)/,"jt#{hostenv}001")
        pribastion = node.name.gsub(/(jt|nn|dn|bastion)(#{hostenv}[0-9]..)/,"bastion#{hostenv}001")

      Result for hadooppocdndev021: hostenv = dev, hostrole = dn, hostnum = 021, cluster = poc

  19. Build Flow - Chef (continued)
      The same hostname parsing also derives the cluster's well-known hosts.
      Result for hadooppocdndev021:
        hostenv = dev, hostrole = dn, hostnum = 021, cluster = poc
        primarynn  = hadooppocnndev001
        jobtracker = hadooppocjtdev001
        pribastion = hadooppocbastiondev001

  20. Build Flow - Chef
      hadooppocdndev021 - create the hadoop users:

      group "hdfs" do
        gid 50101
      end
      group "mapred" do
        gid 50102
      end
      user "hdfs" do
        uid 50101
        gid "hdfs"
        shell "/bin/bash"
        home "/var/lib/hdfs"
        comment "Hadoop HDFS"
      end
      user "mapred" do
        uid 50102
        gid "mapred"
        shell "/bin/bash"
        home "/var/lib/hadoop-mapreduce"
        comment "Hadoop MapReduce"
      end
      group "hadoop" do
        gid 50100
        members ['hdfs','mapred']
      end

  21. Build Flow - Chef
      hadooppocdndev021 - install the Hadoop packages:

      common_packages = [
        'oracle-j2sdk1.6','cloudera-manager-agent','bigtop-utils','bigtop-jsvc','bigtop-tomcat',
        'hadoop','hadoop-hdfs','hadoop-httpfs','hadoop-mapreduce','hadoop-client','hadoop-hdfs-fuse',
        'hbase','hive','oozie','pig','hue-plugins','hue-common','hue-shell','hue','sqoop','flume-ng',
        'hadoop-lzo','lzop','liblzo2-dev','bc'
      ]
      nn_packages      = [ 'cloudera-manager-server','cloudera-manager-server-db' ]
      dn_packages      = [ ]
      bastion_packages = [ 'elephantbird','hue-server','hivelibs' ]

  22. Build Flow - Chef (continued)
      Install the common packages on every node, plus the bastion packages on bastion hosts:

      ########## Packages #############
      common_packages.each do |pkg|
        package pkg do
          action :install
        end
      end
      ########## bastion packages
      if hostrole == "bastion"
        bastion_packages.each do |pkg|
          package pkg do
            action :install
          end
        end
      end

  23. Build Flow - Chef (continued)
      On the name node, also install the Cloudera Manager server packages and MySQL:

      ########## Name node packages
      if hostrole == "nn"
        nn_packages.each do |pkg|
          package pkg do
            action :install
          end
        end
        include_recipe "mysql::server"
        include_recipe "database::mysql"
      end

  24. Build Flow - Chef - Configurations
      hadooppocdndev021
      Primary name node only: create the SCM manager DB if it doesn't exist, then enable & start the DB service and the SCM manager.

      ########## Primary name node only
      if hostrole == "nn" && hostnum == "001"
        if Dir['/var/lib/cloudera-scm-server-db/data/*'].empty?
          execute "scm_db_init" do
            command "/etc/init.d/cloudera-scm-server-db initdb"
          end
        end
        service "cloudera-scm-server-db" do
          supports :status => true, :restart => true
          action [ :enable, :start ]
        end
        service "cloudera-scm-server" do
          supports :status => true, :restart => true
          action [ :enable, :start ]
        end
      end

  25. Build Flow - Chef - Configurations
      hadooppocdndev021
      Update the Cloudera SCM agent config to point to the primary name node, then enable & start the SCM agent service.

      ########## DNs & NNs - Config & start SCM agent & point it to primary name node
      if hostrole == "dn" || hostrole == "nn" || hostrole == "jt"
        ruby_block "Update SCM config.ini" do
          block do
            cluster_scm = primarynn
            rc = Chef::Util::FileEdit.new("/etc/cloudera-scm-agent/config.ini")
            rc.search_file_replace_line(/^server_host=localhost/, "server_host=#{primarynn}")
            rc.write_file
          end
        end
        service "cloudera-scm-agent" do
          supports :status => true, :restart => true
          action [ :enable, :start ]
        end

  26. Build Flow - Chef - MySQL Configurations
      hadooppocdndev021

      node['mysql']['server_debian_password'] = "..."
      node['mysql']['server_repl_password']   = "..."
      node['mysql']['server_root_password']   = "..."
      ### Cloudera SCM / Hive server settings
      node['mysql']['tunable']['key_buffer']                   = "16M"
      node['mysql']['tunable']['key_buffer_size']              = "32M"
      node['mysql']['tunable']['max_allowed_packet']           = "16M"
      node['mysql']['tunable']['thread_stack']                 = "128K"
      node['mysql']['tunable']['thread_cache_size']            = "64"
      node['mysql']['tunable']['query_cache_limit']            = "8M"
      node['mysql']['tunable']['query_cache_size']             = "64M"
      node['mysql']['tunable']['query_cache_type']             = "1"
      node['mysql']['tunable']['max_connections']              = "600"
      node['mysql']['tunable']['read_buffer_size']             = "2M"
      node['mysql']['tunable']['read_rnd_buffer_size']         = "16M"
      node['mysql']['tunable']['sort_buffer_size']             = "8M"
      node['mysql']['tunable']['join_buffer_size']             = "8M"
      node['mysql']['tunable']['innodb_file_per_table']        = "1"
      node['mysql']['tunable']['innodb_flush_log_at_trx_commit'] = "2"
      node['mysql']['tunable']['innodb_log_buffer_size']       = "64M"
      node['mysql']['tunable']['innodb_buffer_pool_size']      = "2048M"
      node['mysql']['tunable']['innodb_thread_concurrency']    = "8"
      node['mysql']['tunable']['innodb_flush_method']          = "O_DIRECT"
      node['mysql']['tunable']['character-set-server']         = "latin1"
      node['mysql']['tunable']['collation-server']             = "latin1_swedish_ci"

      dbconn = {:host => "localhost", :username => 'root', :password => node['mysql']['server_root_password']}

  27. Build Flow - Chef - MySQL Configurations
      hadooppocdndev021

      if hostrole == "bastion"
        if hostnum == "001"
          hivepwd = "..."
          mysql_database 'hive' do
            connection dbconn
            action :create
          end
          mysql_database_user 'hive' do
            connection dbconn
            password hivepwd
            host '%'
            database_name 'hive'
            action :grant
          end
        end
        template "/etc/hive/conf/hive-site.xml" do
          source "hadoop/hive-site.xml.erb"
          variables({ :pribastion => pribastion, :namenode => primarynn })
        end

  28. Build Flow - Chef - Hive Configurations
      hadooppocdndev021

      template "/etc/hive/conf/hive-site.xml" do
        source "hadoop/hive-site.xml.erb"
        variables({ :pribastion => pribastion, :namenode => primarynn })

      <?xml version="1.0"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      <configuration>
        <property>
          <name>javax.jdo.option.ConnectionURL</name>
          <value>jdbc:mysql://<%= @pribastion %>:3306/hive?createDatabaseIfNotExist=true</value>
          <description>JDBC connect string for a JDBC metastore</description>
        </property>
        ...

  29. Build flow so far
      Preseed - install base OS ✓
      Chef    - install packages ✓
      API     - configure the cluster

  30. Build Flow - API
      hadooppocdndev021
      • REST API
      • Basic HTTP authentication
      • Takes & returns JSON
      • HTTP verbs (a minimal request sketch follows):
        POST   - Create entries
        GET    - Read entries
        PUT    - Update entries
        DELETE - Delete entries

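      The deck drives this API from a small Python front end (hadoopprov.py, next slides). As an illustration of the call pattern only - the requests library, credentials and helper name below are assumptions, not from the deck:

          import requests  # assumed HTTP client; any client with basic auth works

          CM = "http://hadooppocnndev001.shopzilla.sea:7180/api/v2"
          AUTH = ("admin", "admin")  # placeholder credentials

          def api_get(path):
              # Every Cloudera Manager API call is plain HTTP + basic auth + JSON.
              r = requests.get(CM + path, auth=AUTH)
              r.raise_for_status()
              return r.json()

          host = api_get("/hosts/hadooppocdndev021.shopzilla.sea")
          print(host.get("roleRefs", []))  # an empty list means no roles assigned yet
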
  31. Build Flow - API
      hadooppocdndev021
      (hostenv = dev, hostrole = dn, hostnum = 021, cluster = poc,
       primarynn = hadooppocnndev001, jobtracker = hadooppocjtdev001, pribastion = hadooppocbastiondev001)

      ruby_block "Hadoop Provision" do
        block do
          # Get list of /data mounts
          hostvolumes=`mount | grep '/data/[0-9].' | awk '{print $3}' | sort -V`.split(/\n/).join(',')
          result = Net::HTTP.get(URI.parse("http://ks.shopzilla.laxhq/hadoop/hadoopprov.py?\
            hostname=#{node.name}&\
            cluster=#{cluster}&\
            hostnum=#{hostnum}&\
            role=#{hostrole}&\
            env=#{hostenv}&\
            primarynn=#{primarynn}&\
            volumes=#{hostvolumes}&\
            memory=#{node.memory.total}"))
        end
      end

  32. Build Flow - API
      http://ks.shopzilla.laxhq/hadoop/hadoopprov.py - Python interface to the Cloudera API
      Parameters sent for hadooppocdndev021:
      • hostname=hadooppocdndev021.shopzilla.sea
      • cluster=poc
      • hostnum=021
      • role=dn
      • env=dev
      • primarynn=hadooppocnndev001.shopzilla.sea
      • volumes=/data/1,/data/2,/data/3,/data/4
      • memory=67108864kB

  33. Build Flow - API
      http://ks.shopzilla.laxhq/hadoop/hadoopprov.py - Python interface to the Cloudera API
      Parameters sent for hadooppocnndev001:
      • hostname=hadooppocnndev001.shopzilla.sea
      • cluster=poc
      • hostnum=001
      • role=nn
      • env=dev
      • primarynn=hadooppocnndev001.shopzilla.sea
      • volumes=/data/1
      • memory=67108864kB

  34. Build Flow - API
      http://ks.shopzilla.laxhq/hadoop/hadoopprov.py - what it does (a sketch follows this list):
      1. Assemble a cluster name - "<cluster> <env> Cluster" = "POC DEV Cluster"
      2. Check if the host is checked into SCM.
      3. Check if the host already has roles assigned. If it does, abort.
      4. Get a list of configured clusters from SCM. Is "POC DEV Cluster" one of them?
      5. Get a list of configured services from SCM.
      6. Pull down our configuration templates from Git.
      7. NN - If there is no cluster "POC DEV Cluster", create it.
         DN - 1. If there is a "POC DEV Cluster", add the host to it.
              2. Assign the node to the running services.
              3. Calculate map/reduce slots based on the host's RAM & data volumes.

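      A rough Python sketch of that decision flow (not the actual hadoopprov.py; the URL paths mirror the API calls on the following slides, while the helper name, credentials and error handling are illustrative):

          import requests  # assumed HTTP client

          CM = "http://hadooppocnndev001.shopzilla.sea:7180/api/v2"
          AUTH = ("admin", "admin")  # placeholder credentials

          def provision(hostname, cluster, env, role, hostnum):
              # 1. Assemble the cluster name, e.g. "POC DEV Cluster".
              cluster_name = "%s %s Cluster" % (cluster.upper(), env.upper())

              # 2./3. The host must be checked into SCM and must not already have roles.
              host = requests.get("%s/hosts/%s" % (CM, hostname), auth=AUTH).json()
              if host.get("roleRefs"):
                  raise SystemExit("host already has roles assigned - aborting")

              # 4. Is the cluster already configured?
              clusters = [c["name"] for c in
                          requests.get(CM + "/clusters", auth=AUTH).json()["items"]]

              if role == "nn" and cluster_name not in clusters:
                  # 7 (NN). Create the cluster; services and configs come from the Git templates.
                  body = {"items": [{"name": cluster_name, "version": "CDH4"}]}
                  requests.post(CM + "/clusters", json=body, auth=AUTH)
              elif cluster_name in clusters:
                  # 7 (DN). Add this host to the cluster's running HDFS service.
                  body = {"items": [{"type": "DATANODE",
                                     "name": "%s-hdfs-%s" % (cluster, hostnum),
                                     "hostRef": {"hostId": hostname}}]}
                  requests.post("%s/clusters/%s/services/%s-hdfs/roles"
                                % (CM, cluster_name, cluster), json=body, auth=AUTH)

          provision("hadooppocdndev021.shopzilla.sea", "poc", "dev", "dn", "021")
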
  35. Build Flow - API
      2. Check if the host is checked into SCM.
      3. Check if the host already has roles assigned. If it does, abort ("roleRefs" should be empty).

      GET - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/hosts/hadooppocdndev021.shopzilla.sea
      {
        ...
        "hostname" : "hadooppocdndev021.shopzilla.sea",
        "ipAddress" : "10.101.173.35",
        "lastHeartbeat" : "2013-04-04T12:52:45.764Z",
        ...
        "roleRefs" : [ {
          "roleName" : "poc-hdfs-021",
          "serviceName" : "poc-hdfs",
          "clusterName" : "POC DEV Cluster"
        }, {
          "roleName" : "poc-mapred-021",
          "serviceName" : "poc-mapred",
          "clusterName" : "POC DEV Cluster"
        } ]
      }

      OR

      { "message" : "Host 'hadooppocdndev021.shopzilla.sea' not found." }

  36. Build Flow - API
      4. Get a list of configured clusters from SCM.
      GET - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters
      { "items" : [ {
          "name" : "POC DEV Cluster",
          "version" : "CDH4",
          "maintenanceMode" : false,
          "maintenanceOwners" : [ ]
      } ] }

      5. Get a list of configured services from SCM.
      GET - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services
      { "items" : [ {
          "name" : "poc-hdfs",
          "type" : "HDFS",
          "displayName" : "poc-hdfs",
          ...
          "serviceState" : "STARTED",
          ...
        }, {
          "name" : "poc-mapred",
          "type" : "MAPREDUCE",
          ...
      } ] }

  37. Build Flow - API
      6. Pull down our config templates from Git: default_hadoop.json, poc_hadoop.json, ods_hadoop.json
      Will load poc_hadoop.json, or default_hadoop.json if POC does not have its own (a small loader sketch follows).

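      A sketch of that fallback, assuming the templates sit in a local checkout; only the file names come from the slide, the loader itself is illustrative:

          import json, os

          def load_template(cluster, template_dir="hadoop-templates"):
              # Prefer the cluster-specific template, fall back to the default one.
              for name in ("%s_hadoop.json" % cluster, "default_hadoop.json"):
                  path = os.path.join(template_dir, name)
                  if os.path.exists(path):
                      with open(path) as f:
                          return json.load(f)
              raise IOError("no configuration template found")

          cfg = load_template("poc")   # poc_hadoop.json if present, else default_hadoop.json
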
  38. Build Flow - API
      6. Pull down our config templates from Git - HDFS service wide configurations:

      { "hdfs_cfg": {
          "roleTypeConfigs": [
              {
                  "roleType": "DATANODE",
                  "items": [
                      { "name": "dfs_balance_bandwidthPerSec", "value": 52428800 },
                      { "name": "dfs_datanode_du_reserved", "value": 10737418240 },
                      { "name": "dfs_datanode_max_xcievers", "value": 8192 },
                      { "name": "dfs_data_dir_list", "value": "/data/1/dfs/dn" },
                      { "name": "datanode_java_heapsize", "value": 2147483648 }
                  ]
              },
              {
                  "roleType": "NAMENODE",
                  "items": [
                      { "name": "fs_trash_interval", "value": 1440 },
                      { "name": "dfs_name_dir_list", "value": "/data/1/dfs/nn" }
                  ]
              }
          ],
          "items": [
              { "name": "dfs_block_size", "value": 134217728 },
              { "name": "dfs_client_use_datanode_hostname", "value": "true" },
              { "name": "dfs_permissions_supergroup", "value": "supergroup" }
          ]
      },

  39. Build Flow - API
      6. Pull down our config templates from Git - Map/Reduce service wide configurations:

      "mapred_cfg": {
          "roleTypeConfigs": [
              {
                  "roleType": "GATEWAY",
                  "items": [
                      { "name": "mapred_child_java_opts_max_heap", "value": 2147483648 },
                      { "name": "io_sort_mb", "value": 256 },
                      { "name": "mapred_map_tasks_speculative_execution", "value": "false" },
                      { "name": "mapred_reduce_tasks_speculative_execution", "value": "false" }
                  ]},
              {
                  "roleType": "TASKTRACKER",
                  "items": [
                      { "name": "task_tracker_java_heapsize", "value": 2147483648 }
                  ]},
              {
                  "roleType": "JOBTRACKER",
                  "items": [
                      { "name": "webinterface_private_actions", "value": "false" },
                      { "name": "mapred_jobtracker_taskScheduler", "value": "org.apache.hadoop.mapred.FairScheduler" },
                      { "name": "jobtracker_mapred_local_dir_list", "value": "/data/1/mapred/jt" },
                      { "name": "mapred_job_tracker_handler_count", "value": "48" }
                  ]}
          ],
          "items": [
              { "name": "hdfs_service", "value": "ds-hdfs" },
              { "name": "io_file_buffer_size", "value": 65536 }
          ]
      },

  40. Build Flow - API
      6. Pull down our config templates from Git - in-house preferences:

      "sz_prefs": {
          "prefs": {
              "mapred_ratio": "2:1",
              "ram_reserve_gb": 6,
              "def_services": ["HDFS","MAPREDUCE","ZOOKEEPER"],
              "dfs_fault_tolerance_perc": 50
          }
      }

  41. Build Flow - API
      7. NN - If there is no cluster "POC DEV Cluster", create it using configs from the templates
         (load poc_hadoop.json, or default_hadoop.json if POC does not have its own).

      "sz_prefs": {
          "def_services": ["HDFS","MAPREDUCE","ZOOKEEPER"]
          ...

      POST - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters
      {
        "items" :[{
          "name": "POC DEV Cluster",
          "version": "CDH4",
          "services": [
            {
              "name": "poc-mapred",
              "type": "MAPREDUCE",
              "clusterRef": {"clusterName": "POC DEV Cluster"}
            },{
              "name": "poc-hdfs",
              "type": "HDFS",
              "clusterRef": {"clusterName": "POC DEV Cluster"}
            },
            ...
          ]
        }]
      }

  42. Build Flow - API
      7. NN - If there is no cluster "POC DEV Cluster", create it using configs from the templates.
      Push the service-wide configs, then repeat for any other services:

      PUT - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-hdfs
      "hdfs_cfg": {
        "roleTypeConfigs": [ {
          "roleType": "DATANODE",
          "items" :[
            { "name": "dfs_balance_bandwidthPerSec", "value": 52428800 },
          ]
          ...
      }

      PUT - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-mapred
      "mapred_cfg": {
        "roleTypeConfigs": [ {
          "roleType": "GATEWAY",
          "items": [
            { "name": "mapred_child_java_opts_max_heap", "value": 2147483648 },
          ]
          ...
        },
      },

  43. Build Flow - API
      7. NN - If there is no cluster "POC DEV Cluster", create it using configs from the templates
         (here: assign the NAMENODE role to hadooppocnndev001).

      POST - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-hdfs/roles
      {
        "items":[{
          "type": "NAMENODE",
          "name": "poc-hdfs-NN-001",
          "hostRef": {"hostId": "hadooppocnndev001.shopzilla.sea" }
        }]
      }

  44. Build Flow - API
      7. DN - If there is a "POC DEV Cluster", add the host to it.

      GET - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/
      { "items" : [ {
          "name" : "POC DEV Cluster",
          "version" : "CDH4",
          "maintenanceMode" : false,
          "maintenanceOwners" : [ ]
      } ] }

      GET - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster
      { "items" : [ {
          "name" : "poc-hdfs",
          "type" : "HDFS",
          "displayName" : "poc-hdfs",
          ...
          "serviceState" : "STARTED",
          ...
        }, {
          "name" : "poc-mapred",
          "type" : "MAPREDUCE",
          ...
      } ] }

  45. Build Flow - API
      7. DN - If there is a "POC DEV Cluster", add the host to it - HDFS.

      POST - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-hdfs/roles
      {
        "items": [{
          "type": "DATANODE",
          "name": "poc-hdfs-021",
          "hostRef": {"hostId": "hadooppocdndev021.shopzilla.sea"}
        }]
      }

      # hostvolumes = /data/1,/data/2,/data/3,/data/4
      # hdfsvolumes = /data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn
      PUT - http://hadooppocn..:7180/api/v2/clusters/POC DEV Cluster/services/poc-hdfs/roles/poc-hdfs-021/config
      {
        "items": [
          { "name": "dfs_data_dir_list", "value": hdfsvolumes }
        ]
      }

  46. Build Flow - API
      7. DN - If there is a "POC DEV Cluster", add the host to it - MAPREDUCE.

      POST - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-mapred/roles
      {
        "items": [{
          "type": "TASKTRACKER",
          "name": "poc-mapred-021",
          "hostRef": {"hostId": "hadooppocdndev021.shopzilla.sea"}
        }]
      }

      # volumes = /data/1,/data/2,/data/3,/data/4
      # mapredvolumes = /data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local,/data/4/mapred/local
      # memory = 67108864kB
      PUT - http://hado......:7180/api/v2/clusters/POC DEV Cluster/services/poc-mapred/roles/poc-mapred-021/config
      {
        "items": [
          { "name": "tasktracker_mapred_local_dir_list", "value": mapredvolumes },
          { "name": "mapred_tasktracker_map_tasks_maximum", "value": mapslots },
          { "name": "mapred_tasktracker_reduce_tasks_maximum", "value": redslots }
        ]
      }

  47. Build Flow - API
      7. DN - If there is a "POC DEV Cluster", add the host to it - MAPREDUCE (continued).
      mapslots and redslots in the PUT shown on the previous slide are calculated from the available RAM, using the ratio provided in the config (a sketch of one possible calculation follows):

      "sz_prefs": {
        "prefs": {
          "mapred_ratio": "2:1",
          "ram_reserve_gb": 6,
        }
      }

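      The deck does not show the exact formula; a plausible Python sketch, assuming one slot per 2 GB (the child-task heap used in the mapred template) after subtracting the reserved RAM:

          def calc_slots(memory_kb, ram_reserve_gb=6, mapred_ratio="2:1", slot_gb=2):
              # Hypothetical reconstruction: the deck only says slots are derived
              # from RAM and the configured ratio. slot_gb=2 assumes one slot per
              # 2 GB child-task heap (the value used in the mapred template).
              usable_gb = memory_kb / 1024.0 / 1024.0 - ram_reserve_gb
              total_slots = max(int(usable_gb // slot_gb), 1)
              m, r = (int(x) for x in mapred_ratio.split(":"))
              mapslots = total_slots * m // (m + r)
              redslots = total_slots - mapslots
              return mapslots, redslots

          print(calc_slots(67108864))   # 64 GB host -> (19, 10) with a 2:1 ratio
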
  48. Build Flow - API
      hadooppocbastiondev001 - client configuration.
      Loop over the services & generate configuration links (a sketch follows):

      GET - http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services
      { "items" : [ {
          "name" : "poc-hdfs",
          "type" : "HDFS",
          "displayName" : "poc-hdfs",
          ...
          "serviceState" : "STARTED",
          ...
        }, {
          "name" : "poc-mapred",
          "type" : "MAPREDUCE",
          ...
      } ] }

      {
        "HDFS" : "http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-hdfs/clientConfig",
        "MAPREDUCE" : "http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-mapred/clientConfig"
      }

  49. Build Flow - API
      hadooppocbastiondev001 - client configuration.
      Back in Chef - download the generated client configs and install them:

      directory "/tmp/hadoopconf" do
        action :create
      end
      if clientconfigs.key?('HDFS')
        remote_file "/tmp/hadoopconf/hdfs-config.zip" do
          source URI.escape(clientconfigs['HDFS'])
        end
      end
      if clientconfigs.key?('MAPREDUCE')
        remote_file "/tmp/hadoopconf/mapred-config.zip" do
          source URI.escape(clientconfigs['MAPREDUCE'])
        end
      end
      execute "Copy hadoop configs" do
        command "unzip -o '/tmp/hadoopconf/*.zip' -d /tmp/hadoopconf/ && mv /tmp/hadoopconf/hadoop-conf/* /etc/hadoop/conf/"
        action :run
      end

  50. Build flow complete
      Preseed - install base OS ✓
      Chef    - install packages ✓
      API     - configure the cluster ✓

  51. API - General
      http://cloudera.github.io/cm_api/
      • Cloudera's Python library
      • Complete documentation

      Configuration samples from your own cluster (a cm_api sketch follows):
      http://hadooppocnndev001.shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-hdfs/config
      • JSON dump of all the configurations you set (non-defaults).
      http://hadooppo....shopzilla.sea:7180/api/v2/clusters/POC DEV Cluster/services/poc-hdfs/config?view=full
      • Complete JSON dump of all possible configurations.

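      For comparison, the same kind of lookup through the Cloudera Python client referenced above - a minimal sketch; host, credentials and API version are placeholders:

          from cm_api.api_client import ApiResource  # pip install cm-api

          # Placeholder host and credentials.
          api = ApiResource("hadooppocnndev001.shopzilla.sea",
                            username="admin", password="admin", version=2)

          # Walk every cluster and print its services and their state.
          for cluster in api.get_all_clusters():
              for service in cluster.get_all_services():
                  print(cluster.name, service.name, service.type, service.serviceState)
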