What we do
● Run an Indian Languages Search Engine
● Research
○ Information Extraction
○ Information Retrieval
○ Information Access
○ Virtualization and Cloud
● Users of
○ OpenStack
○ Hadoop
○ and a lot of other FOSS
Before OpenStack
source: http://www.codeproject.com/KB/threads/hxgrid/image4.jpg
Problems
● Provisioning
○ Ad hoc
○ Time consuming
○ Unmanaged
● User Management
○ No resource accounting
○ Access Control
○ Usage Restriction
● Storage
○ Data reliability
○ Duplication
More Problems...
● Cluster
○ Terrible Resource Utilization
○ New deployment => Too much time
○ Data Redundancy
○ Non-optimal deployments
● Academic
○ No cloud platform for experimentation
○ No large-scale sandboxed resource provisioning for students
Provisioning
● Pre-configured images to quickly get started.
● VMs of any capacity available at any time (even 2 a.m. on a Sunday morning)
● Snapshots
User Management
● Resource restrictions using Quota
● Project-based collaboration and private resources
● Usage monitoring
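To make the quota idea concrete, here is a minimal sketch of per-project resource limits with usage tracking. It is a toy model, not OpenStack code: the names Quota, Project, can_launch and launch, and the three tracked resources, are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Quota:
    """Upper limits for one project (hypothetical fields, not the OpenStack schema)."""
    instances: int
    vcpus: int
    ram_mb: int


@dataclass
class Project:
    """A project with a quota and running totals of what it currently uses."""
    name: str
    quota: Quota
    used: dict = field(
        default_factory=lambda: {"instances": 0, "vcpus": 0, "ram_mb": 0}
    )

    def can_launch(self, vcpus: int, ram_mb: int) -> bool:
        """Return True if one more VM of this size still fits inside the quota."""
        return (
            self.used["instances"] + 1 <= self.quota.instances
            and self.used["vcpus"] + vcpus <= self.quota.vcpus
            and self.used["ram_mb"] + ram_mb <= self.quota.ram_mb
        )

    def launch(self, vcpus: int, ram_mb: int) -> None:
        """Record a new VM, or raise if it would exceed the quota."""
        if not self.can_launch(vcpus, ram_mb):
            raise RuntimeError(f"quota exceeded for project {self.name!r}")
        self.used["instances"] += 1
        self.used["vcpus"] += vcpus
        self.used["ram_mb"] += ram_mb
```

With a quota of 2 instances, 4 vCPUs and 4096 MB of RAM, two launch(2, 2048) calls succeed and a third request of any size is rejected. The used totals double as a simple form of usage monitoring.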
Storage
This wasn't easy. We experimented with
● nova-volume
● Swift (Diablo)
● GlusterFS
● Swift (Folsom, current)
Storage
● Hadoop compatible distributed storage
● Glance image store
● Desktop backup utility using CloudFuse
● Data reliability
● No more Data Fragmentation
OpenStack in Academia
● Research
○ Inter-cloud migration
○ Inter-cloud scheduling
○ Performance Evaluation
● Resource provisioning for course assignments and projects
○ 3 courses
○ 350+ students
○ 20+ projects
HadoopStack
● Big Data processing on demand
● Entire ecosystem for Big Data: Hadoop family, Spark, Mahout, R
● Multi-cloud: OpenStack and AWS
Conclusion
● Using OpenStack
● Working with and around OpenStack
● OpenStack is Awesome!!