Building a Big Data Portal with Liferay & MongoDB

mongodb
June 18, 2012

Join us for a webinar on building a Big Data portal platform that delivers a rich user experience and meets the business need to manage millions of digital assets. The webinar highlights the integration of Liferay Portal with MongoDB to create a Big Data portal with features such as authentication, secure role-based access control, version control, workflow, and mobile client access.

Transcript

  1. What Does CIGNEX Datamatics Do?
     Since 2000, CIGNEX Datamatics has helped its clients save in excess of US$500 million by leveraging commercial open source software across 200+ implementations.
     We are experts at Open Source!
     2011 Liferay Partner of the Year & Global Platinum Partner
     Thought leader in the open source community and author of technical resource guides
  2. About the presenter
     •  Yash Badiani is the Big Data Practice Lead at CIGNEX Datamatics and is focused on Big Data technologies like MongoDB & Hadoop.
     •  He has worked extensively on large data warehousing & business intelligence projects with tools like Business Objects, Microsoft SQL Server, MicroStrategy & IBM Cognos.
     •  Yash can be reached at [email protected]
  3. Agenda
     •  What is Big Data Portal
     •  Introduction to Portals & Liferay
     •  Key challenges with content and RDBMS
     •  Introduction to MongoDB
     •  Power of Liferay + Power of MongoDB = Big Data portal
     •  Benefits
     •  Solution Details
        –  User View
        –  Administrator View
        –  Developer View
     •  Summary
  4. What is a Big Data Portal?
     A Big Data Portal is a web-based solution that combines the powerful presentation capabilities of a portal, such as a rich user interface, collaboration, and secure access, with centralized and massively scalable back-end data storage holding a variety of content (audio, video, images, documents, metadata) in large volumes.
  5. What are Portals?
     •  A software platform for building websites and web applications
     •  End Users: single, personalized point of access to relevant and authoritative information
     •  Business Organizations: unified place to engage, support, learn, and respond to customers
     •  IT Organizations: agile, scalable web apps; enable collaboration; delegate responsibilities
  6. What is Liferay?
     •  Enterprise web platform for building business solutions
     •  Leading open source portal
        –  Strong community
        –  4 million downloads
        –  350,000 to 500,000 deployments worldwide
        –  Leader in Gartner's Magic Quadrant for Horizontal Portals
     Capabilities
     •  Built on Java: cross-platform & lightweight
     •  Content & document management with MS Office integration
     •  Web publishing & shared workspaces
     •  Enterprise collaboration & application integration
     •  Enterprise portals & identity management
     •  Social networking & mashups
  7. Key challenges with content
     •  Variety of content
     •  Centralization of content
     •  Volume
     (Slide shows a reference image.)
  8. Evolution in computing is impacting traditional RDBMS
     •  Volume of Data
        –  Trillions of records
        –  100s of millions of queries per second
     •  Agile Development
        –  Iterative
        –  Continuous
     •  New Hardware Architectures
        –  Commodity servers
        –  Cloud computing
     Source: 10gen Corp Overview
  9. Business Limitations of RDBMS
     •  Cost of database increases
        –  Vertical, not horizontal, scaling
        –  High cost of SAN
     •  Productivity decreases
        –  Need to add new software layers for ORM, caching, sharding, and message queues
        –  Polymorphic, semi-structured, and unstructured data not well supported
     Source: 10gen Corp Overview
  10. What is MongoDB?
     MongoDB is a scalable, high-performance NoSQL database.
     •  Open source, written in C++
     •  Document-oriented storage
        –  Based on JSON documents
        –  Schema-less
     •  Cool Vendor: Information Infrastructure and Big Data, 2012
     •  Full-featured indexes, query language
     •  Replication & high availability
     •  Auto-sharding
     Source: 10gen Corp Overview
  11. Why should organizations use MongoDB?
     •  Easy to code, for increased agility
     •  Easy to scale, for performance & high availability
     •  Easy to operate, even in the cloud
     Source: 10gen Corp Overview
  12. Solution: Combining Liferay & MongoDB
     •  Portal with rich UI front end
        –  Secure access: role based, site based
        –  Versioning
        –  Search
        –  Locking
        –  Mobile access
     •  Powerful back end
        –  Structured and unstructured data
        –  Massively scalable
        –  Highly reliable data storage
        –  High performance
        –  Highly flexible schema and development
     •  Big Data Portal layers: rich UI features, connector, data storage, data assets
  13. Benefits: How MongoDB enhances Liferay
     •  Scalability
        –  Leverage auto-sharding & replica set features
        –  Elasticity in scaling storage: go up or down
     •  Cost Effectiveness
        –  Commodity hardware: eliminates network storage like SAN
        –  Eliminates the need for high-end storage systems such as EMC Documentum
     •  Agility & Performance
        –  Faster development
        –  Easier deployment
        –  Flexible & schema-less
     •  Large Object Storage & Centralized Data Management
        –  GridFS enables storage of large binary objects such as images, video, or audio
        –  Simplifies management of data
        –  Single system to manage structured & unstructured data
  14. Benefits: How Liferay enhances MongoDB
     •  Rich Front End
        –  Powerful websites consisting of:
           Gadgets & Portlets: portions of a web page that may be a complete application
           Pages & Themes: common, consistent look & feel across multiple pages
           Navigation: menu bar, tabs, links
     •  Secure Views to Data
        –  Role based
        –  Site based
        –  Login status based
     •  Mobile Integration
        –  Data access on the go
        –  Different themes for mobile: HTML5, CSS3
     •  Flexible Architecture and Lean Platform
        –  Use of open standards, web services, and integration tools
        –  SOA
  15. Solution & Features
     •  News site
        –  CIGNEX News portal providing a secure user interface to:
           Latest news articles & archives
           Latest videos & archives
           Images & archives
        –  Features of the portal:
           Let content authors add, delete, update, and retrieve documents, lock them for updates, and version them
           Let administrators configure fine-grained access control to the site: role based, site based, folder / file based
           Provide workflow support for content review & finalization at various levels
        –  Scalability & flexibility in content storage: provide scalable & flexible data storage that grows with an ever-increasing variety of content
  16. Solution & Features
     •  Statistics
        –  1M content files uploaded for the demo
        –  Scalable up to hundreds of millions of files
        –  Video files larger than 100 MB stored in GridFS
  17. User View
     Folders organizing the data, with View / Edit privileges at each folder.
  18. User View
     •  File-level edit access
     •  File-level view-only access
     •  Videos folder containing video files of different types stored in MongoDB
  19. Administrator View
     •  Assigning a role (newsrole) to a user (user1)
     •  Assigning rights to the role
     •  Assigning file-level permissions to the role
  20. Developer View - Technical Architecture
     Diagram labels:
     •  Document Library Portlet
     •  Store
     •  CD MongoDBFileSystemStore [storing data in MongoDB]
     •  MongoDB Connector
     •  MongoDB (GridFS)
     •  MySQL (metadata)
     •  Lucene (indexing & search)
     •  Detailed
  21. Developer View - Technologies & Components
     •  Technologies used:
        –  liferay-portal-6.1.10-ee-ga1
        –  mongodb-linux-x86_64-2.0.4
     •  Liferay Extension plugin
        –  A method of extending Liferay
        –  Allows use of internal APIs and overwriting of files in the Liferay core
        –  Requires the server to be restarted after development
     •  Liferay portal configuration files
        –  portal.properties: main configuration file for Liferay Portal; contains detailed explanations of the properties
        –  portal-ext.properties: used to change the value of any property defined in portal.properties
        –  portal-ext.properties also contains the reference to the custom implementation class and the Mongo host & access information
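The override relationship between portal.properties and portal-ext.properties can be sketched with plain `java.util.Properties`. This is a minimal illustration, not Liferay's own loading code. The store class name `com.example.store.MongoDBFileSystemStore` and the `mongo.store.*` and `example.*` keys are hypothetical, and `dl.store.impl` is assumed here to be the key that selects the Document Library store.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PortalConfigSketch {
    // Stand-in for portal.properties: the defaults shipped with the portal.
    // Key names and values are assumptions made for this example.
    static final String BASE =
        "dl.store.impl=com.liferay.portlet.documentlibrary.store.FileSystemStore\n"
        + "example.unchanged.key=base-value\n";

    // Stand-in for portal-ext.properties: only the values being changed,
    // plus the (hypothetical) Mongo connection settings.
    static final String EXT =
        "dl.store.impl=com.example.store.MongoDBFileSystemStore\n"
        + "mongo.store.host=localhost\n"
        + "mongo.store.port=27017\n";

    // Layers the ext values over the base values. A key present in ext wins;
    // any other key falls back to the base file.
    static Properties effective(String base, String ext) {
        try {
            Properties defaults = new Properties();
            defaults.load(new StringReader(base));
            Properties merged = new Properties(defaults);
            merged.load(new StringReader(ext));
            return merged;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `PortalConfigSketch.effective(PortalConfigSketch.BASE, PortalConfigSketch.EXT).getProperty("dl.store.impl")` returns the overriding class name, while `example.unchanged.key` still resolves to its base value.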
  22. Developer View - Components
     •  Document Library portlet
        –  Central place to aggregate and manage all content
        –  Provides document management backed by different persistence systems
        –  Features such as check-in / check-out, metadata, versioning
     •  CD MongoDBFileSystemStore
        –  Implementation of the Liferay Document Library store API
        –  Signatures of all methods (add, update, view, delete)
     •  MongoDB Connector
        –  Gets the host information from portal-ext.properties
        –  Uses the Java driver for data manipulation commands
        –  Leverages the GridFS API to store large binary objects
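The shape of a store with add, update, view, and delete operations can be sketched as below. This is a hypothetical interface written for illustration, not Liferay's actual store API or the CD MongoDBFileSystemStore source, and an in-memory map stands in for MongoDB so the example runs without a database. All type and method names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical store contract mirroring the four operations named on the slide.
interface DocumentStoreSketch {
    void add(String path, byte[] content);
    void update(String path, byte[] content);
    Optional<byte[]> view(String path);
    boolean delete(String path);
}

// In-memory stand-in; a real implementation would call the MongoDB Java driver
// and GridFS where this class touches the map.
class InMemoryDocumentStore implements DocumentStoreSketch {
    private final Map<String, byte[]> files = new HashMap<>();

    @Override
    public void add(String path, byte[] content) {
        if (files.containsKey(path)) {
            throw new IllegalStateException("already exists: " + path);
        }
        files.put(path, content.clone());
    }

    @Override
    public void update(String path, byte[] content) {
        if (!files.containsKey(path)) {
            throw new IllegalStateException("not found: " + path);
        }
        files.put(path, content.clone());
    }

    @Override
    public Optional<byte[]> view(String path) {
        byte[] stored = files.get(path);
        return stored == null ? Optional.empty() : Optional.of(stored.clone());
    }

    @Override
    public boolean delete(String path) {
        return files.remove(path) != null;
    }
}
```

Keeping the contract this small is what lets the portlet stay unaware of which persistence system sits behind it.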
  23. Developer View - Components
     •  GridFS
        –  Specification for storing large files in MongoDB
        –  Native storage of binary data within BSON objects is limited to 16 MB
        –  Efficiently stores large files
        –  Transparently divides large files among multiple documents (chunks)
        –  Each chunk is 256 KB in size
        –  Two collections: files (stores the metadata) and chunks (stores the actual data)
        –  All drivers support GridFS through an API
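The chunking arithmetic described above can be worked through in a few lines. This is a sketch of the idea only, not the GridFS implementation in any driver; the class and method names are made up for the example, and the 256 KB chunk size is the figure quoted on the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChunkingSketch {
    // Chunk size quoted on the slide: 256 KB.
    static final int CHUNK_SIZE = 256 * 1024;

    // Number of chunk documents needed for a file of the given length
    // (ceiling division; an empty file needs no chunks).
    static long chunkCount(long fileLength, int chunkSize) {
        if (fileLength < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("invalid length or chunk size");
        }
        return (fileLength + chunkSize - 1) / chunkSize;
    }

    // Splits the data into consecutive chunks; only the last one may be shorter.
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("invalid chunk size");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(data.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(data, offset, end));
        }
        return chunks;
    }
}
```

For a 100 MB video (104,857,600 bytes), `chunkCount` gives 104,857,600 / 262,144 = 400 chunks, each comfortably below the 16 MB BSON document limit.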
  24. Developer View - Design
     •  Design
        –  Uses the Liferay Extension plugin to develop a new Document Library store for MongoDB
        –  Uses the Liferay portal configuration file to configure the Document Library portlet to store content in MongoDB
        –  Once configured, all document upload / download requests from the Document Library portlet are delegated to CD MongoDBFileSystemStore
        –  CD MongoDBFileSystemStore uses the MongoDB Java driver & GridFS Java API to store or retrieve documents from MongoDB
        –  The Java driver uses the Mongo wire protocol
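The delegation step in this design, where a configuration value decides which store receives every upload and download, can be sketched as follows. This is an assumption-laden illustration: the `BlobStore`, `MapBlobStore`, and `DocumentLibrarySketch` names and the registry keys are hypothetical, and both stores are in-memory stand-ins rather than real file system or MongoDB back ends.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical minimal storage contract.
interface BlobStore {
    void put(String name, byte[] data);
    byte[] get(String name);
}

// In-memory stand-in; the label records which back end it represents.
class MapBlobStore implements BlobStore {
    final String label;
    private final Map<String, byte[]> blobs = new HashMap<>();

    MapBlobStore(String label) {
        this.label = label;
    }

    @Override
    public void put(String name, byte[] data) {
        blobs.put(name, data.clone());
    }

    @Override
    public byte[] get(String name) {
        byte[] stored = blobs.get(name);
        return stored == null ? null : stored.clone();
    }
}

// Stand-in for the portlet side: it picks one store from configuration at
// start-up and forwards every request to it.
class DocumentLibrarySketch {
    private final BlobStore store;

    DocumentLibrarySketch(String configuredStore,
                          Map<String, Supplier<BlobStore>> registry) {
        Supplier<BlobStore> factory = registry.get(configuredStore);
        if (factory == null) {
            throw new IllegalArgumentException("unknown store: " + configuredStore);
        }
        this.store = factory.get();
    }

    void upload(String name, byte[] data) {
        store.put(name, data);
    }

    byte[] download(String name) {
        return store.get(name);
    }

    BlobStore activeStore() {
        return store;
    }
}
```

Switching the configured key from one registry entry to another changes where content lands without touching the upload and download code paths.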
  25. Summary
     •  MongoDB gives portals scalability (for huge volumes of content) and flexibility (schema-less content)
     •  Liferay's rich user interface, content management, security, social, and mobile features complement MongoDB's powerful storage features
     •  A Big Data Portal built with MongoDB and Liferay provides lower TCO and higher ROI to enterprises
  26. Thank you. Questions?
     CIGNEX Datamatics makes Open Source work for you!
     •  Yash Badiani, Big Data Practice Lead, [email protected]
     •  Brendan Coleman, Director of Channels, [email protected]
     •  Kristin Smith, Sales & Marketing Manager, [email protected]