Slide 1

Slide 1 text

Mobile Media Site using Drupal & MongoDB
Aug 7, 2012
Presented by:
Yash Badiani, Big Data Practice Lead, CIGNEX Datamatics
Gaurav Khambhala, Technical Lead, CIGNEX Datamatics

Slide 2

Slide 2 text

CIGNEX Datamatics Confidential
About CIGNEX Datamatics

Slide 3

Slide 3 text

What Does CIGNEX Datamatics Do?
Since 2000, CIGNEX Datamatics has implemented over 400 Open Source enterprise solutions addressing business requirements related to Portals, Content & Big Data.

Slide 4

Slide 4 text

Where We Can Help You

SOLUTIONS
• User eXperience Platform (Portals): Liferay, Magento, Drupal, Adobe CQ
  – Intranet
  – Extranet
  – Social Collaboration
  – Mobile Portals
  – E-Commerce
• Enterprise Content Management (Content): Alfresco, Drupal, Magento, Adobe CQ, Moodle, Ephesoft
  – WCM
  – DM
  – RM
  – DAM
  – E-Commerce
  – E-learning
  – ERP
  – Imaging Solutions
• Making Data Work (Big Data): Hadoop, MongoDB, HBase, Neo4j, Solr
  – Analytics
  – Mobile
  – Social
  – Web
  – Real-time
  – DW - BI
  – Log Processing and Analysis
  – Enterprise Search
  – Velocity, Complexity, Volume, Variety

SERVICES
• Managed Cloud Services - Develop, Deploy, Manage
• Annual Product Subscription: Liferay, Alfresco, Magento, Hadoop, Selenium
• Extended Development Center – Center of Excellence
• UI Development, Integration, Customization, Migration, Testing, Training, Support (24*7)

Slide 5

Slide 5 text

Part of Datamatics (DGSL)
• Mission
  – Experts in improving Enterprise productivity through Process Engineering & Information Management Solutions
• Key Highlights
  – Founded in 1975
  – Publicly listed in India
  – Annual consolidated revenue of US$100 Million
  – Fortune 500 clients
  – 4,400+ employees across 22 offices in 9 countries
Strategic Alliances

Slide 6

Slide 6 text

About the presenters
• Yash Badiani is the Big Data Practice Lead at CIGNEX Datamatics and focuses on Big Data technologies including MongoDB & Hadoop. He has worked extensively on large Data Warehousing & Business Intelligence projects with tools such as Business Objects, Microsoft SQL Server, MicroStrategy and IBM Cognos.
• Gaurav Khambhala works at CIGNEX Datamatics as Technical Lead. He is the senior member of the PHP Practice at CIGNEX Datamatics and is involved in various technology initiatives such as Big Data, where he focuses on integrating PHP with NoSQL sources like MongoDB. He has wide industry experience in software development & management with Open Source technologies such as Drupal, Moodle & WordPress.

Slide 7

Slide 7 text

Agenda
• The Mobile Media Use Case
• Requirements and Challenges
• Solution: Mobile Media site using Drupal & MongoDB
• Why Drupal and MongoDB?
• Demo and Solution Features
• Benefits
• Summary

Slide 8

Slide 8 text

The Mobile Explosion!
• By 2015, at least 60% of information workers will interact with their content applications via a mobile device
• Employees work on proposals and presentations on mobile devices while travelling
• People use digital assets (videos, images) longer on tablets and mobiles than on desktops
Based on a report by a leading IT advisory firm

Slide 9

Slide 9 text

Mobile Media Use Case
• Mobile Media site includes the following features:
  – Store a variety of Images & associated metadata
  – Massively Scalable to store billions of images
  – Access through Mobile
  – Create / Edit Albums
  – Add Images to the Albums
  – Add / Edit Metadata of Images
  – Search Images / Albums by date, metadata, albums, etc.
  – Social Media features – Likes, comments

Slide 10

Slide 10 text

Requirements of Mobile Media sites
• Velocity
  – Fast performance
  – Large user base
  – Concurrent CRUD
  – Access through various channels
• Volume
  – Millions of digital assets
  – Variety of content
  – Complexity of data
• User experience
  – Rich UI features
  – Social features
  – Mobile access
  – Fast search
• Scalability
  – Elastic scaling
  – Cost effectiveness
  – Centralized storage
  – Ease of maintenance
• Security & Availability
  – High availability
  – Automatic failover
  – User management
• Flexibility & Agility
  – Easy integration
  – Shorter dev cycle
  – Faster deployment
  – Ease of schema design

Slide 11

Slide 11 text

Standard Three Layered Data Architecture
• Application layer
• Standard Three Layered Storage
  – File System
  – Metadata in RDBMS
  – Search

Slide 12

Slide 12 text

Limitations of RDBMS
• Support limited to terabytes
  – No support for petabytes to exabytes
• Manage only structured data
  – No support for semi-structured and unstructured data
• RDBMS don't scale inherently
  – Scale up / Scale out (Load Balancing & Replication)
• Hard to shard / partition
  – Large data files
• Both read / write throughput not possible
  – Transactional / Analytical databases
• Specialized hardware - expensive
RDBMS can't manage all dimensions of data with speed & at lower cost.

Slide 13

Slide 13 text

NoSQL is the right solution
Not Only SQL
• They are schema less
• Designed to support huge data volumes
  – Facebook 135 billion messages/month; Twitter 7TB data/day
• Scalable replication and distribution mechanism
  – Thousands of machines distributed around the world
• Massive write performance with asynchronous inserts and updates
• Designed to give high query performance
• Runs on commodity hardware
• Most NoSQL databases are Open Source

Slide 14

Slide 14 text

NoSQL – Data Models
NoSQL Databases
• 4 broad data models
• 120+ variants available in the market

Column Families
• Usage: Read/Write intensive
• Popular databases: HBase, Cassandra

Document Store
• Usage: Working with occasionally changing/consistent data
• Popular databases: CouchDB, MongoDB

Graph Database
• Usage: Spatial data storage
• Popular databases: Neo4j, Bigdata

Key Value / Tuple Store
• Usage: Briskly changing data and high availability
• Popular databases: Riak, Redis, Azure Table storage

Slide 15

Slide 15 text

Requirements of Mobile Media sites - Recap
Mobile Media Site
• Velocity
  – Fast performance
  – Large user base
  – Concurrent CRUD
  – Access through various channels
• Volume
  – Millions of digital assets
  – Variety of content
  – Complexity of data
• User experience
  – Rich UI features
  – Social features
  – Mobile access
  – Fast search
• Scalability
  – Elastic scaling
  – Cost effectiveness
  – Centralized storage
  – Ease of maintenance
• Security & Availability
  – High availability
  – Automatic failover
  – User management
• Flexibility & Agility
  – Easy integration
  – Shorter dev cycle
  – Faster deployment
  – Ease of schema design

Slide 16

Slide 16 text

Drupal with MongoDB Solution
• Drupal (PHP): Themes, Core, Modules, Nodes, Taxonomy, User, Roles, Forms & Menu
• Custom Modules: Workflow, Forums, Comments & Ratings, Tagging, Web Services
• 3rd party & Internal Applications
• MongoDB Driver
• Mongos Routing Process
• Replica Set: MongoDB, MongoDB
• Replica Set: MongoDB, MongoDB
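The routing step in the diagram above, where a mongos process sits between the driver and the replica sets, can be pictured with a small sketch. This is only an illustration of the idea of routing by shard key: the shard names and the `route` helper are hypothetical, and the hash-modulo rule is a simplification chosen for brevity, not the chunk-based placement MongoDB itself performs.

```python
import hashlib

# Hypothetical shard names; the slide only shows two replica sets behind mongos.
SHARDS = ["replica_set_a", "replica_set_b"]


def route(shard_key, shards=SHARDS):
    """Pick a shard for a key by hashing it.

    Simplified stand-in for a router: the same key always maps to the
    same shard, and different keys spread across the shards.
    """
    digest = hashlib.md5(shard_key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def route_many(keys, shards=SHARDS):
    """Group keys by the shard they would be sent to."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[route(key, shards)].append(key)
    return placement


if __name__ == "__main__":
    for shard, keys in route_many(f"image-{i}" for i in range(10)).items():
        print(shard, keys)
```

The application only ever talks to the router, so adding a replica set changes the placement rule but not the Drupal code that issues the queries.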

Slide 17

Slide 17 text

Why Drupal?
• Pluggable Architecture
• Data Abstraction Layer
• Easy to Upgrade
• Secure
• Active Community
• Widely Adopted
• Scalable
• 3rd Party Tools Integration
• User Management & Permissions
• HTML5 & CSS Support

Slide 18

Slide 18 text

Websites using Drupal
• Whitehouse.gov
• Data.gov.uk
• mtv.co.uk
• research.yahoo.com
• pdx.edu
• EndPoverty2015.org

Slide 19

Slide 19 text

Why MongoDB?
• Agile and Scalable
• Full Index Support
• Document Oriented Storage
• Replication
• Querying
• Atomic Updates
• Data Processing and Aggregation
• High Availability

Slide 20

Slide 20 text

Customers using MongoDB
• Centralized data management platform
• 2 billion+ documents
• 20 TB of photo metadata
• TV episodes and series
• Risk solutions auditing data
Source: http://www.10gen.com/customers

Slide 21

Slide 21 text

Demo
• Media site on mobile simulator
• Like & comment on an image on mobile simulator
• Mobile site on web browser
• Verify 'Like' & comment of the same image on web browser
• Search images & access control

Slide 22

Slide 22 text

Solution Features
Architecture and Design

Slide 23

Slide 23 text

Architecture
• User: Browser / Mobile, Mobile Device
• Drupal: Theme, Custom Module, Drupal API, Form API, Menu API
• MongoDB PHP Driver
• MongoDB: User Metadata, Albums, Image Metadata, GridFS, Indexes
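GridFS, shown in the architecture above as the image store, keeps one descriptor document per file and the file body as a series of numbered chunk documents. The sketch below mimics that layout in plain Python so it runs without a database. The helper names are hypothetical, and the default chunk size is an assumption that a real deployment would take from its driver.

```python
def split_into_chunks(file_id, data, chunk_size=255 * 1024):
    """Return a files-style descriptor and a list of chunk documents.

    Mirrors the GridFS layout: one descriptor (as in fs.files) plus
    numbered pieces that point back to it (as in fs.chunks).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    chunks = [
        {"files_id": file_id, "n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in order of their sequence number."""
    return b"".join(chunk["data"] for chunk in sorted(chunks, key=lambda c: c["n"]))


if __name__ == "__main__":
    image_bytes = bytes(range(256)) * 10  # stand-in for an uploaded image
    descriptor, pieces = split_into_chunks("file-1", image_bytes, chunk_size=1000)
    print(descriptor, len(pieces), reassemble(pieces) == image_bytes)
```

Because each chunk is an ordinary document, large images are replicated and sharded by the same mechanisms as the metadata.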

Slide 24

Slide 24 text

Flow
User Actions
• Add Album
• Add Image
• View Album
• View Individual Images
• Like Image
• Comment Image
• Add Tags to Images
• View Counter
• Search Images By Tags
MongoDB Collections (linked by DBRef)
• User Metadata
• Albums
• GridFS
• Image Metadata
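Several of the user actions above (like, comment, tag, view counter) map naturally onto MongoDB's atomic update operators. The sketch below only builds the filter and update documents. The operators `$inc`, `$push`, `$addToSet` and `$each` are standard MongoDB, while the field names (`likes`, `comments`, `tags`, `view_counter`) and the helper functions are assumptions made for illustration, not the presenters' actual module code.

```python
def like_image(image_id, user_id):
    """A like adds the user once; $addToSet ignores repeat likes."""
    return {"_id": image_id}, {"$addToSet": {"likes": user_id}}


def comment_image(image_id, user_id, text):
    """A comment is appended to the image's comment list."""
    comment = {"user_id": user_id, "text": text}
    return {"_id": image_id}, {"$push": {"comments": comment}}


def add_tags(image_id, tags):
    """Tags are added as a set so duplicates are not stored twice."""
    return {"_id": image_id}, {"$addToSet": {"tags": {"$each": list(tags)}}}


def count_view(image_id):
    """Each view increments the counter in a single atomic step."""
    return {"_id": image_id}, {"$inc": {"view_counter": 1}}


if __name__ == "__main__":
    for query, update in (
        like_image("img-1", "user-7"),
        comment_image("img-1", "user-7", "Nice shot"),
        add_tags("img-1", ["beach", "sunset"]),
        count_view("img-1"),
    ):
        print(query, update)
```

Each pair would be handed to the driver's update call. The deck's solution uses the MongoDB PHP driver, but the documents have the same shape in any language.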

Slide 25

Slide 25 text

Schema Design
Collections: User Metadata, GridFS, Albums, Image Metadata
• User ID, DBRef (Album), Tags, Thumbnail, Likes, View Counter, Comments, Permission
• FS.Files, FS.Chunks
• User ID, Tags, Title, Make, Model, Date Time, Aperture, Exposure, DBRef (GridFS)
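To make the DBRef links in the schema concrete, here is a minimal sketch using plain dictionaries in place of a live database. The `{"$ref": ..., "$id": ...}` shape is the standard DBRef convention. Which fields belong to which collection, the collection names and all sample values are assumptions for illustration, since the slide does not spell out the mapping.

```python
def dbref(collection, doc_id):
    """Build a DBRef-style pointer to a document in another collection."""
    return {"$ref": collection, "$id": doc_id}


def resolve(ref, db):
    """Follow a DBRef by scanning the named collection for the matching _id."""
    for doc in db.get(ref["$ref"], []):
        if doc["_id"] == ref["$id"]:
            return doc
    return None


# In-memory stand-in for the collections; names and values are hypothetical.
db = {
    "albums": [
        {"_id": "album-1", "user_id": "user-1", "title": "Holiday", "tags": ["beach"]},
    ],
    "fs.files": [
        {"_id": "file-1", "length": 2560},
    ],
    "image_metadata": [
        {
            "_id": "img-1",
            "user_id": "user-1",
            "album": dbref("albums", "album-1"),
            "file": dbref("fs.files", "file-1"),
            "tags": ["sunset"],
            "likes": [],
            "view_counter": 0,
            "comments": [],
            "make": "ExampleCam",
            "model": "X100",
        },
    ],
}

if __name__ == "__main__":
    image = db["image_metadata"][0]
    print(resolve(image["album"], db)["title"])
    print(resolve(image["file"], db)["length"])
```

Keeping the image bytes in GridFS and pointing to them from the metadata document lets searches by tag or date run against small documents only.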

Slide 26

Slide 26 text

Schema Design
• Image Metadata
• Albums
• User Metadata

Slide 27

Slide 27 text

MongoDB Monitoring Service (MMS)
• DB Storage
• Cursors
• Replica Sets
• Network Connections
• Non Mapped Virtual Memory
• Opcounters

Slide 28

Slide 28 text

Benefits
Drupal
• Most advanced content management solutions
• Highly customized websites
• Most search friendly CMS
• Less coding, high on automation
• Powered by 7000 plugins and extensions
• Active community, real time assistance
MongoDB
• Scalability – billions of content items, millions of users
• Performance – FAST writes through sharding, reads through indexes
• Data safety through replication
• Centralized single system for data storage
• Monitoring through MMS
• Enterprise support through 10gen

Slide 29

Slide 29 text

Summary & Key Takeaways
• MongoDB provides the RIGHT fit for CMS applications with flexibility, scale & speed
• Drupal's advanced & automated CMS features and tight integration with MongoDB make it the right choice for building agile websites
• Both Drupal & MongoDB are feature rich and, being Open Source, provide significant cost benefits

Slide 30

Slide 30 text

Thank you. Questions?
CIGNEX Datamatics makes Open Source work for you!
Yash Badiani, Big Data Practice Lead, [email protected]
Gaurav Khambhala, Technical Lead, [email protected]