
Modification of Social Networking platform, Elgg to work with NoSQL Databases

Mayank Khanna
March 14, 2011

Description of a Proof of Concept to migrate to NoSQL Databases

Transcript

  1. • A social networking framework helps web-savvy users build community websites.
     • Elgg is a social networking framework that provides a modular system made up of the pieces required for a fully featured social networking platform.
     • Currently, Elgg is structured around a relational database, which does not scale well for the unusual data patterns of social networking content.
     • Non-relational databases are not bound to a fixed structure or schema. Unlike relational databases, they store data as key-value pairs.
     • Every record is self-contained and addressed by a unique key. This lets the store scale to millions of simultaneous database reads and writes while remaining simple and fast.
     Scalability Of Blog Plug-in Using Key-Value Stores
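As a minimal illustration of the key-value model described above, the sketch below uses a plain Python dictionary as a hypothetical stand-in for a store such as CouchDB. The names `store`, `put_record`, and `get_record` are invented for this example and are not part of Elgg or CouchDB.

```python
import json
import uuid

# Hypothetical in-memory stand-in for a key-value store.
store = {}

def put_record(record):
    """Store a self-contained record under a freshly generated unique key."""
    key = uuid.uuid4().hex
    store[key] = json.dumps(record)  # the value is an opaque JSON string
    return key

def get_record(key):
    """Fetch a record by its key; no schema or join is involved."""
    return json.loads(store[key])

# Records need not share the same fields, since there is no fixed schema.
post_key = put_record({"subtype": "blog", "title": "Hello", "tags": ["intro"]})
note_key = put_record({"subtype": "note", "body": "No title field here"})
print(get_record(post_key)["title"])
```

Each record carries everything needed to interpret it, so a lookup touches exactly one key.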
  2. Beneficiaries:
     • End users, clients of the website
     • Server administrators
     Benefits:
     • Less downtime
     • Easier server administration
     Alternatives:
     • SimpleDB: non-relational data store developed by Amazon.com
     • SQLite: embedded, file-based relational database developed by D. Richard Hipp
     • MongoDB: document-oriented database developed by 10gen
  3. • Elgg: open source social networking software that provides individuals and organizations with the components needed to create an online social environment.
     • MySQL: a relational database management system that runs as a server.
     • NoSQL database: a term for structured storage systems (DBMS) that differ from classical relational DBMS. They do not use fixed table schemas.
     • CouchDB: a document-oriented database server that is schema-free and has a flat address space.
     • Scalability: a desirable property of a system, network, or process that indicates its ability to handle growing amounts of work gracefully.
     • RESTful: REST stands for Representational State Transfer, the architectural style that underlies the web.
  4. • Rucku: the web's largest social network dedicated to rugby.
     • Teach-box: a resource sharing network for school teachers.
     • Planet Red: a social network for the University of Nebraska-Lincoln.
  5. • Elgg provides application developers with a modular system made up of the pieces required for a fully featured social networking platform.
     • Currently, Elgg is structured around the relational database MySQL.
     • While this gives a clean representation of structured data, it does not scale well to millions of simultaneous database reads and writes.
  6. Hardware:
     • Memory: 20 GB
     • RAM: 128 MB
     • Processor: 700 MHz x86 processor
     Software:
     • Linux OS (Ubuntu 10.04)
     • Apache web server
     • MySQL database server
     • PHP (Hypertext Preprocessor)
     • CouchDB database server
  7. • One of the major problems with NoSQL databases is that they are not fully featured.
     • Many applications run queries that join several related tables. NoSQL databases do not directly support such join operations, which adds to their drawbacks.
     SELECT * FROM student;
  8. • Add a blog post to the NoSQL datastore
     • Add an entry to the MySQL database
     • View all the blog posts present in the datastore
     • Administrative tool to enable the blog plugin
     • Scalability
     • Faster performance even when there are many simultaneous data requests
     • Availability
     • Usability: the databases can be edited using Futon
     • Easy data replication
  9. [Architecture diagram: User; Browser; Elgg; mod_php; Apache; CouchDB; MySQL]
  10. [Diagram: Login; MySQL; CouchDB; Display Blog Posts; No Blogs Displayed; Display on Browser]
  11. [Diagram: Delete Blog; Blog Document deleted from CouchDB; Owner_guid, Blog_id deleted from MySQL]
  12. [Diagram: Login; Write Blog; Publish; Commit to DB; Display]
  13. [Flowchart: User Logs into Site; Get Input Data; Check for NULL Title and Description; Initialize Blog Array; Save the Blog; Go to Main Blog Page; Post Success Message]
  14. [Flowchart: Connect to MySQL and Select DB; Get UUID from CouchDB; Encode Blog Array into JSON Object; Insert JSON Object into CouchDB; Obtain Ownerid, Blogid and Insert into MySQL; Save and Close Databases]
  15. [Flowchart: Connect to MySQL and Select DB; Get Owner Id; Perform Query to Get Blogid for Corresponding Id; Retrieve Blogs for the Blogids Obtained from CouchDB; Pass the above to Elgg View Function]
  16. [Flowchart: Delete Blog; Fetch Existing Blog from CouchDB; Delete Blog Document from CouchDB; Delete Owner_guid, Blog_id from MySQL; Go to Main Blog Page]
  17. [Use case diagram: Site Administration; Create New User; Enable Blog Plugin; Set Site Categories; Add Blog; Publish Blog; Update Blog; View Blog; Delete Blog]
  18. [Data flow diagram: User; Elgg Site; MySQL; CouchDB. Labels: Logs in; Adds Blog / Requests to view/delete; Blog Viewed/deleted; Allows/Denies; owner_guid, blog_id; Content/blog_id; Blog Content]
  19. • Connect to the MySQL server and select the appropriate database for the operations that follow (elgg_db in this case).
      • Include the file that saves the blog.
      • Use the gatekeeper() function to make sure that only logged-in users can add a blog post.
      • Get the input data from the form in which the user wrote the blog post.
      • Make sure that the title and description of the blog are not null.
      • Create the blog associative array with the necessary key-value pairs, including type, subtype, owner_guid, container_guid, access_id, etc.
      • Call the save function and pass the blog array to it as the parameter.
      • Forward to the main blog page to view the list of blog posts.
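The validation and array-building steps above can be sketched as follows. This is an illustrative Python sketch, not the plugin's PHP code: `require_login` is a hypothetical stand-in for Elgg's gatekeeper(), `build_blog` is an invented helper, and the field values (including the use of the owner as the container) are assumptions made for the example.

```python
def require_login(user):
    """Hypothetical stand-in for gatekeeper(): reject anonymous users."""
    if user is None:
        raise PermissionError("login required")

def build_blog(user, title, description, access_id):
    """Validate the form input and build the blog key-value structure."""
    require_login(user)
    # Reject a missing or empty title or description.
    if not title or not description:
        raise ValueError("title and description must not be empty")
    return {
        "type": "object",                # assumed value
        "subtype": "blog",               # assumed value
        "owner_guid": user["guid"],
        "container_guid": user["guid"],  # assumption: the owner is the container
        "access_id": access_id,
        "title": title,
        "description": description,
    }

blog = build_blog({"guid": 42}, "First post", "Hello, world", access_id=1)
print(sorted(blog))
```

The resulting dictionary plays the role of the blog associative array that is handed to the save step.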
  20. • Connect to the MySQL server and select the appropriate database.
      • Establish the connection to the CouchDB server using the PHP cURL function curl_init(), which returns the cURL handle.
      • Using the cURL handle, obtain the UUID for the blog post from CouchDB.
      • Convert the blog array into a JSON object (json_encode()) and insert it into the CouchDB database under the id obtained previously.
      • Store the blog id and the owner id of the blog in the elgg_db database in MySQL.
      • Close the MySQL connection.
      • Close the cURL handle.
      • Return the blog, which is now stored as a document in the CouchDB database, to the add module.
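A rough Python sketch of the save steps above is shown here. It is not the plugin's PHP code. It relies on CouchDB's HTTP API (GET /_uuids and PUT /{db}/{docid}); the host, the database name `elgg_blogs`, the helper names, and the use of a generic DB-API connection (for instance SQLite standing in for MySQL) are assumptions. The sketch also assumes a CouchDB server that accepts unauthenticated requests.

```python
import json
import urllib.request

COUCH_URL = "http://localhost:5984"  # assumed host; 5984 is CouchDB's default port
DB_NAME = "elgg_blogs"               # hypothetical CouchDB database name

def http_send(req):
    """Send a request to CouchDB and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def uuid_request():
    """Request that asks CouchDB to generate a UUID."""
    return urllib.request.Request(f"{COUCH_URL}/_uuids", method="GET")

def put_request(doc_id, blog):
    """Request that stores the blog as a JSON document under doc_id."""
    body = json.dumps(blog).encode("utf-8")
    return urllib.request.Request(
        f"{COUCH_URL}/{DB_NAME}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

def save_blog(blog, conn, send=http_send):
    """Store the blog in CouchDB and record (owner_guid, blog_id) relationally."""
    doc_id = send(uuid_request())["uuids"][0]
    send(put_request(doc_id, blog))
    # The elgg_couch table is named in the deck; the column names follow the diagrams.
    conn.execute(
        "INSERT INTO elgg_couch (owner_guid, blog_id) VALUES (?, ?)",
        (blog["owner_guid"], doc_id),
    )
    conn.commit()
    return doc_id
```

Passing `send` as a parameter keeps the HTTP transport replaceable, so the flow can be exercised without a running CouchDB server.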
  21. • Begin the function elgg_get_entities, passing $options as the parameter.
      • Connect to the MySQL server and select the database used by the site (elgg_db).
      • Get the container_guid from the $options array.
      • Select the blog ids of the user from the MySQL table and assign them to $result.
      • While there are entries in the $result array, perform the following:
          • Assign the id obtained from the array to the $row variable.
          • Establish the cURL session with the URL of the document of interest.
          • Use the HTTPGET cURL option (CURLOPT_HTTPGET) to get the document from CouchDB.
          • Decode the JSON object returned by CouchDB to access its contents.
          • Create a new Elgg object and assign all the decoded fields obtained from CouchDB to it.
      • Repeat the process for every document and collect the resulting objects in an array.
      • Return the array to the $entities variable in elgg_list_entities, which passes it to the Elgg view function.
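The retrieval loop above might look roughly like the following Python sketch. It is an illustration under assumptions, not the plugin's PHP code: `BlogEntity` is a hypothetical stand-in for the Elgg object, the database name `elgg_blogs` and the host are invented, and `conn` is any DB-API connection (for instance SQLite standing in for MySQL).

```python
import json
import urllib.request
from dataclasses import dataclass

COUCH_URL = "http://localhost:5984"  # assumed host; 5984 is CouchDB's default port
DB_NAME = "elgg_blogs"               # hypothetical CouchDB database name

@dataclass
class BlogEntity:
    """Hypothetical stand-in for the Elgg object built from each document."""
    guid: str
    owner_guid: int
    title: str
    description: str

def fetch_document(doc_id):
    """GET one document from CouchDB and decode its JSON body."""
    with urllib.request.urlopen(f"{COUCH_URL}/{DB_NAME}/{doc_id}") as resp:
        return json.load(resp)

def get_blog_entities(container_guid, conn, fetch=fetch_document):
    """Look up the user's blog ids relationally, then load each document."""
    rows = conn.execute(
        "SELECT blog_id FROM elgg_couch WHERE owner_guid = ?",
        (container_guid,),
    ).fetchall()
    entities = []
    for (blog_id,) in rows:
        doc = fetch(blog_id)
        entities.append(BlogEntity(
            guid=blog_id,
            owner_guid=doc.get("owner_guid", container_guid),
            title=doc.get("title", ""),
            description=doc.get("description", ""),
        ))
    return entities
```

Note that this design issues one HTTP request per blog id, mirroring the per-document cURL session described in the steps.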
  22. • Get the blog id from the blog post to be deleted.
      • Connect to the CouchDB server using the PHP cURL function curl_init(), which returns the cURL handle.
      • Specify the blog id in the URL used to connect to CouchDB.
      • Access the document of interest and retrieve its revision, i.e. '_rev'.
      • Create a new cURL session to delete the document and set the CUSTOMREQUEST option (CURLOPT_CUSTOMREQUEST).
      • Execute the cURL session to delete the document.
      • Connect to the MySQL server and delete the corresponding entry from the elgg_couch table in elgg_db.
      • The blog is deleted and will no longer be displayed when the user views the blogs.
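The deletion steps above can be sketched in Python as follows. This is an illustration, not the plugin's PHP code. It relies on CouchDB's HTTP API, where a document is removed with DELETE /{db}/{docid}?rev={rev}; the host, the database name `elgg_blogs`, the helper names, and the generic DB-API connection are assumptions.

```python
import json
import urllib.parse
import urllib.request

COUCH_URL = "http://localhost:5984"  # assumed host; 5984 is CouchDB's default port
DB_NAME = "elgg_blogs"               # hypothetical CouchDB database name

def http_send(req):
    """Send a request to CouchDB and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def delete_blog(blog_id, conn, send=http_send):
    """Delete the CouchDB document, then the relational mapping row."""
    doc_url = f"{COUCH_URL}/{DB_NAME}/{blog_id}"
    # CouchDB requires the current revision to delete a document.
    doc = send(urllib.request.Request(doc_url, method="GET"))
    rev = urllib.parse.quote(doc["_rev"], safe="")
    send(urllib.request.Request(f"{doc_url}?rev={rev}", method="DELETE"))
    conn.execute("DELETE FROM elgg_couch WHERE blog_id = ?", (blog_id,))
    conn.commit()
```

Deleting from CouchDB first means a failure there leaves the mapping row in place, so the operation can be retried; the opposite order would orphan the document.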
  23. Test Scenario 1: Adding a Blog
      Sl. No. | Test case | Test Input | Expected Result | Actual Result
      1 | Add a Blogpost | Blog id, Blog Content | Go to Main Blog Page. Blogpost gets added in CouchDB | Go to Main Blog Page. Blogpost gets added in CouchDB (Result: Pass)

      Test Scenario 2: Saving a Blog
      Sl. No. | Test case | Test Input | Expected Result | Actual Result
      1 | Save Blogpost | Blog id, Blog Content | Blog Document of particular uuid viewed in CouchDB | Blog Document of particular uuid viewed in CouchDB
  24. Test Scenario 3: Viewing a Blog (Failure)
      Sl. No. | Test case | Test Input | Expected Result | Actual Result
      1 | View Blogpost (at least one saved Blogpost) | Guid of the User | View all Blogposts of User | No Blogs Displayed (Result: Fail)
  25. Test Scenario 4: Viewing a Blog
      Sl. No. | Test case | Test Input | Expected Result | Actual Result
      1 | View Blogpost (at least one saved Blogpost) | Guid of the User | View all Blogposts of User | View all Blogposts of User
      2 | View Blogpost (no saved Blogpost) | Guid of the User | No Blogs Displayed | No Blogs Displayed
  26. Test Scenario 5: Deleting a Blog
      Sl. No. | Test case | Test Input | Expected Result | Actual Result
      1 | Delete Blogpost | Blog id of blog to be deleted | Entry deleted from MySQL. Blog Document deleted from CouchDB | Entry deleted from MySQL. Blog Document deleted from CouchDB (Result: Pass)
  27. • The whole Elgg framework can be modified in a similar manner, so that it relies on the NoSQL datastore CouchDB for data storage as well as retrieval.
      • The communication interface between the framework and the CouchDB server can be made more generic.
      • Additional features can be incorporated into the blog plug-in so that blogs can be viewed by tags, dates, latest comments, etc.
      • Once the whole framework has been modified, Elgg will no longer use the relational database MySQL and will perform all its functions through CouchDB.
  28. [Project schedule chart] Total number of days: 98. Phases: Requirements Specification; Design; Development; Integration & Testing.