Distributed Data Processing Platforms.

1 Distributed Data Processing Workshop
Shahid Beheshti campus, Faculty of Computer Science and Engineering. Course: Distributed Databases. Instructor: Dr. Hadi Tabatabaei. Presenter: Abolfazl Sedighi. Aban 1393 (October/November 2014).

Distributed Data Processing
School of Computer Science and Engineering
A. Sedighi, @amirsedighi, Hexican.com

3 Every Game Needs Its Playing Yard

4 Every Game Needs Its Playing Yard

5 What can I do on a Single Machine?
• MVC Programming
• Regular Biz Apps
• 100 GB of Data
• Web Surfing
• ...

6 Linux Cluster

9 Introduction
This is a four-session, hands-on, step-by-step tutorial on setting up a Linux cluster on your own machine (notebook or PC) to try out a few big-data processing frameworks and tools.

10 What are we going to do?
• Your notebook or PC is enough to get started.
  – Setting up your Linux cluster.
• Distributed Log Management and Realtime Search Engines
  – What is Elasticsearch?
  – Elasticsearch on the cluster.
  – Monitoring and usage.
• The most popular Distributed Data Processing Framework.
  – What is Apache Hadoop?
  – Apache Hadoop on the cluster.
  – Usage scenarios.

11 What will we learn?
• Building on our knowledge of Big Data.
• Getting familiar with distributed data processing.
• Maximizing availability and reliability.
• Increasing data storage capacity.
• Improving data processing performance.
• Data locality is a silver bullet.
• Increasing cluster utilization.
• Taming giants by giving them a try.

12 Preparing the Linux Cluster - VirtualBox

13 Preparing the Cluster - Hosting
• VirtualBox
  – Memory size, disk capacity and CPU cores.
  – Network interfaces.
    • NAT, provides Internet access.
    • Host-Only, provides cluster communication.
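The same two-adapter layout can also be set from the command line instead of the VirtualBox dialogs. The sketch below is only an illustration: the VM name node1, the memory and CPU values, and the host-only network name vboxnet0 (the usual name of the first host-only network on Linux and macOS hosts) are assumptions, not values from the slides. It prints the commands by default and only runs them when RUN=1 is set.

```shell
#!/bin/sh
# Dry-run wrapper: prints each VBoxManage command; set RUN=1 to execute it.
vbox() {
    if [ "${RUN:-0}" = 1 ]; then
        VBoxManage "$@"
    else
        echo "VBoxManage $*"
    fi
}

# Hypothetical sizing and names: 1024 MB, 1 CPU, host-only network vboxnet0.
configure_node() {
    vm=$1
    vbox modifyvm "$vm" --memory 1024 --cpus 1
    # Adapter 1: NAT, gives the node Internet access.
    vbox modifyvm "$vm" --nic1 nat
    # Adapter 2: host-only, carries the traffic between cluster nodes.
    vbox modifyvm "$vm" --nic2 hostonly --hostonlyadapter2 vboxnet0
}

configure_node node1
```

The VM has to be powered off while modifyvm changes its settings.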

14 Preparing the Cluster – Adding a Host-Only Network

15 Preparing the Cluster – Adding a NAT Interface

16 Preparing the Cluster – Adding a Host-Only Interface

17 Preparing the Cluster – First Node
• Creating a Linux machine inside VirtualBox.
• Installing Linux. (I've used Ubuntu 12.04)
  – Check Samba
  – Check OpenSSH
• Put everything on the first node, since the other nodes will be cloned from it.
  – Keep an “install” folder on it.
  – Install prerequisites such as Java on it.
• Shutting down the first node.

18 Preparing the Cluster – Cloning, the VirtualBox Side
• Cloning the first node. (tutorial)
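Cloning can also be scripted with VBoxManage rather than done through the GUI. This is a minimal sketch, assuming the first node is a powered-off VM named node1 and that the clones should be called node2 and node3; all three names are hypothetical. As written it only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Dry-run wrapper: prints each VBoxManage command; set RUN=1 to execute it.
vbox() {
    if [ "${RUN:-0}" = 1 ]; then
        VBoxManage "$@"
    else
        echo "VBoxManage $*"
    fi
}

# Clone a source VM under a new name and register the copy with VirtualBox.
clone_node() {
    vbox clonevm "$1" --name "$2" --register
}

# Hypothetical names: node1 is the prepared first node.
for n in 2 3; do
    clone_node node1 "node$n"
done
```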

19 Preparing the Cluster – Cloning, the Linux Side
• Turning the new node on.
• Network configuration
  – sudo nano /etc/hosts
  – sudo nano /etc/hostname
  – sudo nano /etc/network/interfaces
  – sudo rm /etc/udev/rules.d/70-persistent-net.rules
• sudo reboot
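What goes into those files depends on your addressing plan, which the slides do not spell out. As one possible sketch, the helper functions below print /etc/hosts lines and a static /etc/network/interfaces stanza. The naming scheme (node1, node2, ...), the 192.168.56.x addresses (VirtualBox's usual default host-only subnet) and eth1 as the host-only adapter are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical scheme: nodeN gets the address 192.168.56.(100 + N).

# Print one /etc/hosts line per node, for nodes 1..count.
hosts_entries() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf '192.168.56.%d\tnode%d\n' "$((100 + i))" "$i"
        i=$((i + 1))
    done
}

# Print a static ifupdown stanza for node N.
# eth1 is assumed to be the host-only adapter; the NAT adapter stays on DHCP.
interfaces_stanza() {
    printf 'auto eth1\n'
    printf 'iface eth1 inet static\n'
    printf '    address 192.168.56.%d\n' "$((100 + $1))"
    printf '    netmask 255.255.255.0\n'
}

hosts_entries 3
interfaces_stanza 2
```

The output is meant to be reviewed and pasted into the files by hand; on node 2, /etc/hostname would then contain just node2.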

20 Preparing the Cluster – No Password Login
• Do this:
  – ssh-keygen
  – ssh-copy-id -i ~/.ssh/id_rsa.pub user@host
• Or this:
  – ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  – scp ~/.ssh/id_rsa.pub user@host:~/master_key
  – ssh user@host
  – cat master_key >> ~/.ssh/authorized_keys
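The manual variant appends the key every time it is run and leaves file permissions untouched. A small helper can make that last step repeatable. The function below is a hypothetical sketch, not part of OpenSSH: it appends a public key only if it is not already present and tightens permissions, which matters because sshd with its default StrictModes setting refuses key files with loose permissions.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already there.
# Usage: authorize_key <public-key-file> [authorized_keys-file]
# The second argument defaults to ~/.ssh/authorized_keys.
authorize_key() {
    pub=$1
    auth=${2:-$HOME/.ssh/authorized_keys}
    dir=$(dirname "$auth")
    # The directory is expected to be a dedicated .ssh directory.
    mkdir -p "$dir" && chmod 700 "$dir"
    touch "$auth" && chmod 600 "$auth"
    # -x: whole-line match, -F: fixed string, -f: take the pattern from the key file
    grep -qxFf "$pub" "$auth" || cat "$pub" >> "$auth"
}
```

On the target node, after the scp step, this would be called as authorize_key ~/master_key. Running it twice leaves a single copy of the key.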

21 Preparing the Cluster – Distributed Shell
• Do it like a Commander
  – Installing DSH (Optional)
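Since DSH is optional, the same idea can be approximated with a plain loop over ssh once passwordless login works. The sketch below is not DSH itself, just a stand-in: run_on_all and the REMOTE_SHELL variable are hypothetical names, and the machine list is assumed to be a text file with one host per line.

```shell
#!/bin/sh
# Run one command on every host listed in a file (one host per line).
# Blank lines and lines starting with # are skipped.
# REMOTE_SHELL defaults to "ssh -n"; override it (for example with echo)
# to see what would be run without contacting any node.
run_on_all() {
    list=$1
    shift
    while IFS= read -r host; do
        case $host in
            ''|'#'*) continue ;;
        esac
        # -n keeps ssh from reading the rest of the host list from stdin.
        ${REMOTE_SHELL:-ssh -n} "$host" "$@" | sed "s/^/$host: /"
    done < "$list"
}
```

For example, run_on_all machines.list uptime would print each node's uptime prefixed with its host name, where machines.list is a hypothetical file listing node1, node2 and so on.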

22 Preparing the Cluster – Enjoy it
• To scale your cluster, just repeat the cloning step.

23 Next?
• An introduction to distributed log management and analytical search engines.
  – How does Elasticsearch work?
  – Workshop.
• An introduction to Apache Hadoop.
  – How does Apache Hadoop work?
  – Workshop.


Amir Sedighi
