Slide 1

Slide 1 text

OpenStack@IIIT-H
Dharmesh Kakadia (@dharmeshkakadia)
Shashank Sahni (@shredder12)

Slide 2

Slide 2 text

What we do
● Run an Indian Languages Search Engine
● Research
  ○ Information Extraction
  ○ Information Retrieval
  ○ Information Access
  ○ Virtualization and Cloud
● Users of
  ○ OpenStack
  ○ Hadoop
  ○ and a lot of other FOSS

Slide 3

Slide 3 text

Before OpenStack...

Slide 4

Slide 4 text

Before OpenStack
Source: http://www.codeproject.com/KB/threads/hxgrid/image4.jpg

Slide 5

Slide 5 text

Problems
● Provisioning
  ○ Ad hoc
  ○ Time-consuming
  ○ Unmanaged
● User Management
  ○ No resource accounting
  ○ Access control
  ○ Usage restriction
● Storage
  ○ Data reliability
  ○ Duplication

Slide 6

Slide 6 text

More Problems...
● Cluster
  ○ Terrible resource utilization
  ○ New deployment => too much time
  ○ Data redundancy
  ○ Non-optimal deployments
● Academic
  ○ No cloud platform for experimentation
  ○ Large-scale sandboxed resource provisioning for students

Slide 7

Slide 7 text

After OpenStack

Slide 8

Slide 8 text

OpenStack (KVM)
● 7 compute nodes (8 GB, quad-core)
● 1 nova-volume (2 TB, RAID-1)
Swift
● 3 storage nodes (2 TB each)
OpenStack (LXC)
● 16 compute nodes (6 GB, dual-core)
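As a rough tally of the hardware listed on this slide, the aggregate compute capacity can be worked out directly. The sketch below is illustrative only: node counts and sizes are taken from the slide, while the `Pool` class, the `swift_usable_tb` helper, and the Swift replica count of 3 are assumptions (3 is a commonly used Swift setting, but the deck does not state what was configured here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    """One homogeneous group of compute nodes (hypothetical helper type)."""
    name: str
    nodes: int
    ram_gb: int   # RAM per node
    cores: int    # cores per node

    @property
    def total_ram_gb(self) -> int:
        return self.nodes * self.ram_gb

    @property
    def total_cores(self) -> int:
        return self.nodes * self.cores


# Figures from the slide.
KVM = Pool("OpenStack (KVM)", nodes=7, ram_gb=8, cores=4)
LXC = Pool("OpenStack (LXC)", nodes=16, ram_gb=6, cores=2)

SWIFT_NODES = 3
SWIFT_TB_PER_NODE = 2
ASSUMED_REPLICAS = 3  # assumption: not stated in the slides


def swift_usable_tb(nodes: int, tb_per_node: float, replicas: int) -> float:
    """Raw capacity divided by the number of copies kept of each object."""
    return nodes * tb_per_node / replicas


if __name__ == "__main__":
    for pool in (KVM, LXC):
        print(f"{pool.name}: {pool.total_ram_gb} GB RAM, {pool.total_cores} cores")
    usable = swift_usable_tb(SWIFT_NODES, SWIFT_TB_PER_NODE, ASSUMED_REPLICAS)
    print(f"Swift: {SWIFT_NODES * SWIFT_TB_PER_NODE} TB raw, "
          f"{usable:.1f} TB usable at {ASSUMED_REPLICAS} replicas (assumed)")
```

If the deployment used a different replica count, only `ASSUMED_REPLICAS` needs to change.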

Slide 9

Slide 9 text

Provisioning
● Pre-configured images to get started quickly
● VMs of any capacity available at any time (e.g. 2 a.m. on a Sunday morning)
● Snapshots

Slide 10

Slide 10 text

User Management
● Resource restrictions using quotas
● Project-based collaboration and private resources
● Usage monitoring
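The idea behind quota-based restriction can be sketched as a simple admission check: a launch request is accepted only if the project's current usage plus the request stays within its limits. This is a conceptual illustration, not OpenStack's own code or API; `Quota`, `Usage`, `QuotaExceeded`, `can_launch`, and `launch` are hypothetical names, and the limits shown are made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    """Per-project limits (hypothetical values, not from the slides)."""
    instances: int
    cores: int
    ram_mb: int


@dataclass
class Usage:
    """Resources a project currently holds."""
    instances: int = 0
    cores: int = 0
    ram_mb: int = 0


class QuotaExceeded(Exception):
    """Raised when a request would push a project past its quota."""


def can_launch(quota: Quota, usage: Usage, cores: int, ram_mb: int) -> bool:
    """Return True if one more VM of the given size fits within the quota."""
    return (
        usage.instances + 1 <= quota.instances
        and usage.cores + cores <= quota.cores
        and usage.ram_mb + ram_mb <= quota.ram_mb
    )


def launch(quota: Quota, usage: Usage, cores: int, ram_mb: int) -> None:
    """Record a new VM against the project's usage, or refuse it."""
    if not can_launch(quota, usage, cores, ram_mb):
        raise QuotaExceeded(f"request for {cores} cores / {ram_mb} MB denied")
    usage.instances += 1
    usage.cores += cores
    usage.ram_mb += ram_mb


if __name__ == "__main__":
    course_quota = Quota(instances=2, cores=4, ram_mb=4096)
    course_usage = Usage()
    launch(course_quota, course_usage, cores=2, ram_mb=2048)
    launch(course_quota, course_usage, cores=2, ram_mb=2048)
    print(can_launch(course_quota, course_usage, cores=1, ram_mb=512))
```

Keeping the check (`can_launch`) separate from the bookkeeping (`launch`) also makes the same numbers available for usage monitoring.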

Slide 11

Slide 11 text

Storage
This wasn't easy. We experimented with:
● nova-volume
● Swift (Diablo)
● GlusterFS
● Swift (Folsom) (current)

Slide 12

Slide 12 text

Storage
● Hadoop-compatible distributed storage
● Glance image store
● Desktop backup utility using CloudFuse
● Data reliability
● No more data fragmentation

Slide 13

Slide 13 text

OpenStack in Academia
● Research
  ○ Inter-cloud migration
  ○ Inter-cloud scheduling
  ○ Performance evaluation
● Resource provisioning for course assignments and projects
  ○ 3 courses
  ○ 350+ students
  ○ 20+ projects

Slide 14

Slide 14 text

HadoopStack
● Big Data processing on demand
● Entire ecosystem for Big Data: Hadoop family, Spark, Mahout, R
● Multi-cloud: OpenStack and AWS

Slide 15

Slide 15 text

HadoopStack

Slide 16

Slide 16 text

Conclusion
● Using OpenStack
● Working with and around OpenStack
● OpenStack is awesome!

Slide 17

Slide 17 text

Questions/Feedback?