Mobile Media Site using Drupal and MongoDB with 10gen and CIGNEX Datamatics

mongodb
August 07, 2012

Join us for a webinar showcasing an integration of Drupal and MongoDB to create a Mobile Media Site, presented by 10gen and CIGNEX Datamatics.

The webinar will feature an Online & Mobile Media site developed in Drupal that stores millions of digital photos with metadata in MongoDB. The site lets users create and manage albums, extract and store metadata, and run advanced searches across a large repository for near-instant retrieval of images.

The solution will demonstrate how MongoDB can be leveraged to store and search millions of images and associated metadata in a centralized and massively scalable repository.

Transcript

  1. Mobile Media Site using Drupal & MongoDB
    Aug 7, 2012

    Presented by:
    Yash Badiani, Big Data Practice Lead, CIGNEX Datamatics
    Gaurav Khambhala, Technical Lead, CIGNEX Datamatics

  2. CIGNEX Datamatics Confidential
    About CIGNEX Datamatics

  3. What Does CIGNEX Datamatics Do?
    Since 2000, CIGNEX Datamatics has implemented over 400 Open Source enterprise solutions addressing business requirements related to Portals, Content & Big Data.

  4. Where We Can Help You
    SOLUTIONS
    User eXperience Platform (Portals): Liferay, Magento, Drupal, Adobe CQ
    • Intranet
    • Extranet
    • Social Collaboration
    • Mobile Portals
    • E-Commerce
    Enterprise Content Management (Content): Alfresco, Drupal, Magento, Adobe CQ, Moodle, Ephesoft
    • WCM
    • DM
    • RM
    • DAM
    • E-Commerce
    • E-learning
    • ERP
    • Imaging Solutions
    Making Data Work (Big Data): Hadoop, MongoDB, HBase, Neo4j, Solr
    • Analytics
    • Mobile
    • Social
    • Web
    • Real-time
    • DW - BI
    • Log Processing and Analysis
    • Enterprise Search
    Velocity, Complexity, Volume, Variety
    SERVICES
    Managed Cloud Services - Develop, Deploy, Manage
    Annual Product Subscription: Liferay, Alfresco, Magento, Hadoop, Selenium
    Extended Development Center – Center of Excellence
    UI Development, Integration, Customization, Migration, Testing, Training, Support (24*7)

  5. Part of Datamatics (DGSL)
    • Mission
    – Experts in improving Enterprise productivity through Process Engineering & Information Management Solutions
    • Key Highlights
    – Founded in 1975
    – Publicly listed in India
    – Annual consolidated revenue of US$100 Million
    – Fortune 500 clients
    – 4,400+ employees across 22 offices in 9 countries
    Strategic Alliances

  6. About the presenters
    • Yash Badiani is the Big Data Practice Lead at CIGNEX Datamatics and focuses on Big Data technologies including MongoDB & Hadoop. He has worked extensively on large Data Warehousing & Business Intelligence projects with tools such as Business Objects, Microsoft SQL Server, MicroStrategy and IBM Cognos.

    • Gaurav Khambhala works at CIGNEX Datamatics as Technical Lead. He is the senior member of the PHP Practice at CIGNEX Datamatics and is involved in various technology initiatives such as Big Data, where he focuses on integrating PHP with NoSQL sources like MongoDB. He has wide industry experience in software development & management with Open Source technologies such as Drupal, Moodle & WordPress.

  7. Agenda
    • The Mobile Media Use Case
    • Requirements and Challenges
    • Solution: Mobile Media site using Drupal & MongoDB
    • Why Drupal and MongoDB?
    • Demo and Solution Features
    • Benefits
    • Summary

  8. The Mobile Explosion!
    By 2015, at least 60% of information workers will interact with their content applications via a mobile device
    Employees work on proposals and presentations on mobile devices while travelling
    People use digital assets (videos, images) longer on tablets and mobiles than on desktops
    Based on a report by a leading IT advisory firm

  9. Mobile Media Use Case
    • Mobile Media site includes the following features:
    – Store a variety of Images & associated metadata
    – Massively Scalable to store billions of images
    – Access through Mobile
    – Create / Edit Albums
    – Add Images to the Albums
    – Add / Edit Metadata of Images
    – Search Images / Albums by date, metadata, albums, etc.
    – Social Media features – Likes, comments

  10. Requirements of Mobile Media sites
    Velocity
    • Fast performance
    • Large user base
    • Concurrent CRUD
    • Access through various channels
    Volume
    • Millions of digital assets
    • Variety of content
    • Complexity of data
    User experience
    • Rich UI features
    • Social features
    • Mobile access
    • Fast search
    Scalability
    • Elastic scaling
    • Cost effectiveness
    • Centralized storage
    • Ease of maintenance
    Security & Availability
    • High availability
    • Automatic failover
    • User management
    Flexibility & Agility
    • Easy integration
    • Shorter dev cycle
    • Faster deployment
    • Ease of schema design

  11. Standard Three Layered Data Architecture
    Application layer
    Standard Three Layered Storage
    • File System
    • Metadata in RDBMS
    • Search

  12. Limitations of RDBMS
    • Support limited to terabytes
    – No support for petabytes to exabytes
    • Manage only structured data
    – No support for semi-structured and unstructured data
    • RDBMS don't scale inherently
    – Scale up/Scale out (Load Balancing & Replication)
    • Hard to shard / partition
    – Large data files
    • High throughput for both reads and writes not possible
    – Transactional / Analytical databases
    • Specialized hardware - expensive
    RDBMS can't manage all dimensions of data with speed & at lower cost.

  13. NoSQL is the right solution
    Not Only SQL
    • They are schema-less
    • Designed to support huge data volumes
    – Facebook: 135 billion messages/month; Twitter: 7 TB of data/day
    • Scalable replication and distribution mechanism
    – Thousands of machines distributed around the world
    • Massive write performance with asynchronous inserts and updates
    • Designed to give high query performance
    • Run on commodity hardware
    • Most NoSQL databases are Open Source

  14. NoSQL – Data Models
    NoSQL Databases
    • 4 broad data models
    • 120+ variants available in the market
    Column Families
    Usage: Read/Write intensive
    Popular databases: HBase, Cassandra
    Document Store
    Usage: Working with occasionally changing/consistent data
    Popular databases: CouchDB, MongoDB
    Graph Database
    Usage: Spatial data storage
    Popular databases: Neo4j, Bigdata
    Key Value / Tuple Store
    Usage: Briskly changing data and high availability
    Popular databases: Riak, Redis, Azure Table storage

  15. Requirements of Mobile Media sites - Recap
    Velocity
    • Fast performance
    • Large user base
    • Concurrent CRUD
    • Access through various channels
    Volume
    • Millions of digital assets
    • Variety of content
    • Complexity of data
    User experience
    • Rich UI features
    • Social features
    • Mobile access
    • Fast search
    Scalability
    • Elastic scaling
    • Cost effectiveness
    • Centralized storage
    • Ease of maintenance
    Security & Availability
    • High availability
    • Automatic failover
    • User management
    Flexibility & Agility
    • Easy integration
    • Shorter dev cycle
    • Faster deployment
    • Ease of schema design
    Mobile Media Site

  16. Drupal with MongoDB Solution
    Drupal (PHP)
    • Themes
    • Core Modules: Nodes, Taxonomy, User Roles, Forms & Menu
    • Custom Modules: Workflow, Forums, Comments & Ratings, Tagging
    • Web Services: 3rd party & Internal Applications
    MongoDB Driver
    Mongos Routing Process
    • Replica Set: MongoDB, MongoDB
    • Replica Set: MongoDB, MongoDB
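In the layout on slide 16, the application talks to the mongos routing process through the driver, not to the replica-set members directly. As a rough illustration only, the sketch below builds a standard `mongodb://` connection string for one or more mongos routers. The host names and the `media` database name are hypothetical and do not come from the deck, which used the MongoDB PHP driver rather than Python.

```python
def build_mongos_uri(hosts, database, port=27017):
    """Build a mongodb:// connection string that lists one or more mongos routers.

    The driver connects to the routers; mongos then forwards each
    operation to the replica set (shard) that holds the data.
    """
    if not hosts:
        raise ValueError("at least one mongos host is required")
    seeds = ",".join(f"{host}:{port}" for host in hosts)
    return f"mongodb://{seeds}/{database}"


# Hypothetical host names, for illustration only.
uri = build_mongos_uri(["mongos1.example.com", "mongos2.example.com"], "media")
print(uri)
```

Listing more than one router gives the driver an alternative entry point if a single mongos process is unavailable.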

  17. Why Drupal?
    • Pluggable Architecture
    • Data Abstraction Layer
    • Easy to Upgrade
    • Secure
    • Active Community
    • Widely Adopted
    • Scalable
    • 3rd Party Tools Integration
    • User Management & Permissions
    • HTML5 & CSS Support

  18. Websites using Drupal
    Website: Whitehouse.gov
    Website: Data.gov.uk
    Website: mtv.co.uk
    Website: research.yahoo.com
    Website: pdx.edu
    Website: EndPoverty2015.org

  19. Why MongoDB?
    • Agile and Scalable
    • Full Index Support
    • Document Oriented Storage
    • Replication
    • Querying
    • Atomic Updates
    • Data Processing and Aggregation
    • High Availability

  20. Customers using MongoDB
    • Centralized data management platform
    • 2 billion+ documents
    • 20 TB of photo metadata
    • TV episodes and series
    • Risk solutions auditing data
    Source: http://www.10gen.com/customers

  21. Demo
    • Media site on mobile simulator
    • Like & comment on an image on mobile simulator
    • Mobile site on web browser
    • Verify ‘Like’ & comment of the same image on web browser
    • Search images & access control

  22. Solution Features
    Architecture and Design

  23. Architecture
    User, Mobile Device
    Browser/Mobile Theme
    Drupal API: Form API, Menu API, Custom Module
    MongoDB PHP Driver
    MongoDB: User Metadata, Albums, Image Metadata, GridFS, Indexes
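The architecture slide stores the image files themselves in GridFS. As a minimal sketch of what that means, the Python below mimics how GridFS splits one file into an `fs.files` document plus ordered `fs.chunks` documents. It is an in-memory illustration, not the site's code: the file id, file name and the small chunk size in the usage lines are made up, and a real application would let the driver's GridFS API do this work.

```python
from datetime import datetime, timezone

# 255 KiB is the default chunk size in current drivers; treat it as a tunable.
DEFAULT_CHUNK_SIZE = 255 * 1024


def split_into_gridfs_docs(file_id, filename, data,
                           chunk_size=DEFAULT_CHUNK_SIZE, metadata=None):
    """Return (files_doc, chunk_docs) shaped like fs.files / fs.chunks documents."""
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),          # total size in bytes
        "chunkSize": chunk_size,      # size of every chunk except possibly the last
        "uploadDate": datetime.now(timezone.utc),
        "metadata": metadata or {},
    }
    return files_doc, chunk_docs


def reassemble(chunk_docs):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    return b"".join(c["data"] for c in sorted(chunk_docs, key=lambda c: c["n"]))


# Stand-in for an uploaded photo: 2560 bytes split into 1024-byte chunks.
image_bytes = bytes(range(256)) * 10
files_doc, chunks = split_into_gridfs_docs("img-1", "beach.jpg", image_bytes,
                                           chunk_size=1024)
print(len(chunks), reassemble(chunks) == image_bytes)
```

Because each chunk is an ordinary document, large images are sharded and replicated the same way as the metadata collections.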

  24. Flow
    User Actions
    • Add Album
    • Add Image
    • View Album
    • View Individual Images
    • Like Image
    • Comment Image
    • Add Tags to Images
    • View Counter
    • Search Images By Tags
    MongoDB Collections (linked by DBRef)
    • User Metadata
    • Albums
    • GridFS
    • Image Metadata
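Several of the user actions on the flow slide (like, comment, tag, view counter, search by tags) correspond to single-document update and query operators in MongoDB. The sketch below only builds the filter and update documents as plain Python dicts; the field names (`likes`, `views`, `comments`, `tags`) are assumptions for illustration and the deck does not show the actual queries. With a driver such as PyMongo, each pair could be passed to `Collection.update_one(filter, update)` and the search filter to `Collection.find(filter)`.

```python
def like_image(image_id):
    """Increment the like counter atomically with $inc."""
    return {"_id": image_id}, {"$inc": {"likes": 1}}


def record_view(image_id):
    """Increment the view counter atomically with $inc."""
    return {"_id": image_id}, {"$inc": {"views": 1}}


def add_comment(image_id, user_id, text):
    """Append a comment sub-document to the image's comments array with $push."""
    return {"_id": image_id}, {"$push": {"comments": {"user_id": user_id, "text": text}}}


def add_tags(image_id, tags):
    """Add tags without creating duplicates, using $addToSet with $each."""
    return {"_id": image_id}, {"$addToSet": {"tags": {"$each": list(tags)}}}


def search_by_tags(tags, match_all=False):
    """Match images having any of the tags ($in) or all of them ($all)."""
    operator = "$all" if match_all else "$in"
    return {"tags": {operator: list(tags)}}


flt, update = like_image("img-1")
print(flt, update)
print(search_by_tags(["beach", "sunset"]))
```

Each update touches one document, so a like or comment made on the mobile site is immediately visible to the web browser view without a multi-table transaction.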

  25. Schema Design
    User Metadata, GridFS, Albums, Image Metadata
    • User ID
    • DBRef (Album)
    • Tags
    • Thumbnail
    • Likes
    • View Counter
    • Comments
    • Permission
    • FS.Files
    • FS.Chunks
    • User ID
    • Tags
    • Title
    • Make
    • Model
    • Date Time
    • Aperture
    • Exposure
    • DBRef (GridFS)
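The schema slide links collections with DBRefs. The transcript does not preserve which field belongs to which collection, so the grouping in the sketch below is an assumption, as are the lower-case field names and the sample values. It shows the DBRef convention (a sub-document with `$ref` and `$id`) and resolves references against in-memory dicts that stand in for collections.

```python
def dbref(collection, doc_id):
    """Build a DBRef-style reference: the target collection name and document id."""
    return {"$ref": collection, "$id": doc_id}


def make_image_metadata(image_id, user_id, album_id, file_id, exif,
                        title="", tags=()):
    """Assemble one image-metadata document (field grouping is assumed)."""
    return {
        "_id": image_id,
        "user_id": user_id,
        "title": title,
        "tags": list(tags),
        "make": exif.get("Make"),
        "model": exif.get("Model"),
        "date_time": exif.get("DateTime"),
        "aperture": exif.get("ApertureValue"),
        "exposure": exif.get("ExposureTime"),
        "likes": 0,
        "view_counter": 0,
        "comments": [],
        "album": dbref("albums", album_id),     # DBRef (Album)
        "file": dbref("fs.files", file_id),     # DBRef (GridFS)
    }


def resolve(ref, collections):
    """Follow a DBRef against dicts keyed by collection name, then by _id."""
    return collections[ref["$ref"]].get(ref["$id"])


# Hypothetical sample data standing in for two collections.
collections = {
    "albums": {"alb-1": {"_id": "alb-1", "user_id": 42, "title": "Goa 2012"}},
    "fs.files": {"file-9": {"_id": "file-9", "filename": "IMG_0001.jpg"}},
}
image = make_image_metadata(
    "img-1", 42, "alb-1", "file-9",
    {"Make": "Canon", "Model": "EOS 550D", "DateTime": "2012:08:07 10:15:00"},
    title="Sunrise", tags=["beach", "sunrise"],
)
print(resolve(image["album"], collections)["title"])
print(resolve(image["file"], collections)["filename"])
```

Keeping the EXIF fields next to the tags and counters in one document is what lets a single indexed query serve both metadata search and display.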

  26. Schema Design
    Image Metadata
    Albums
    User Metadata

  27. MongoDB Monitoring Service (MMS)
    • DB Storage
    • Cursors
    • Replica Sets
    • Network Connections
    • Non Mapped Virtual Memory
    • Opcounters

  28. Benefits
    Drupal
    • Most advanced content management solutions
    • Highly customized websites
    • Most search friendly CMS
    • Less coding, high on automation
    • Powered by 7000 plugins and extensions
    • Active community, real time assistance
    MongoDB
    • Scalability – billions of content items, millions of users
    • Performance – fast writes through sharding, reads through indexes
    • Data safety through replication
    • Centralized single system for data storage
    • Monitoring through MMS
    • Enterprise support through 10gen

  29. Summary & Key Takeaways
    • MongoDB provides the right fit for CMS applications with flexibility, scale & speed
    • Drupal’s advanced & automated CMS features and tight integration with MongoDB make it the right choice for building agile websites
    • Both Drupal & MongoDB are feature rich and, being Open Source, provide significant cost benefits

  30. Thank you. Questions?
    CIGNEX Datamatics makes Open Source work for you!

    Yash Badiani
    Big Data Practice Lead
    [email protected]

    Gaurav Khambhala
    Technical Lead
    [email protected]