Beautiful Riak

vonstark
October 24, 2012

Riak is a NoSQL database from Basho with many features. One of the most powerful, and my favorite, is scalability. Riak also gives you several powerful ways to search: for example, you can use a secondary index to "tag" an object, so that next time you can search by tag instead of by key....

Transcript

  1. 2.

    Me
    - Co-Founder @ NeoArk
    - I use: Ruby, SQL, Neo4j, Riak
    Wednesday, October 24, 12
  2. 5.

    SQL IS COOL
    NOSQL IS JUST: NOT ONLY SQL
  3. 7.

    About Basho
    - Founded Jan 2008
    - Eric Brewer (father of the CAP theorem) is a director on the board
    - They are agile
  4. 9.

    Design
    - Masterless
    - Scales predictably, easily, and simply
    - Flexible and powerful
    - Fault tolerant
    - Highly available
    - Partitions
    - More nodes, more speed (more power)
  5. 10.

    Details
    - Key-value
    - Open source
    - Metadata
    - Vector clocks
    - Buckets
    - Full-text search, secondary indexes, link walking, Map/Reduce
    - RESTful / Protocol Buffers interface
  6. 11.

    Use it as...
    - Hybrid database
    - Main database
    - Cache
    - Cross-machine session keeper
  7. 13.

    Erlang
    Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability. Some of its uses are in telecoms, banking, e-commerce, computer telephony and instant messaging. Erlang's runtime system has built-in support for concurrency, distribution and fault tolerance. OTP is a set of Erlang libraries and design principles providing middleware to develop these systems. It includes its own distributed database, applications to interface towards other languages, debugging and release handling tools.
  8. 14.

    Because Erlang, so....
    Erlang/OTP provides an ideal platform for developing systems like Riak because it provides inter-node communication, message queues, failure detectors, and client-server abstractions out of the box. What's more, most frequently-used patterns in Erlang have been implemented in library modules, commonly referred to as OTP behaviors. They contain the generic code framework for concurrency and error handling, simplifying concurrent programming and protecting the developer from many common pitfalls. Behaviors are monitored by supervisors, themselves a behavior, and grouped together in supervision trees. A supervision tree is packaged in an application, creating a building block of an Erlang program. A complete Erlang system such as Riak is a set of loosely coupled applications that interact with each other. Some of these applications have been written by the developer, some are part of the standard Erlang/OTP distribution, and some may be other open source components. They are sequentially loaded and started by a boot script generated from a list of applications and versions.
    HTTP://WWW.AOSABOOK.ORG/EN/RIAK.HTML
  9. 17.

    Clustering
    - 160-bit integer space
    - 32 partitions
    - Each node is: equal in size, a partition owner, readable and writable
    - Add or remove nodes dynamically, without any reboot or reconfiguration
    - Consistent hashing
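To make the clustering slide concrete, here is a minimal sketch of how a key can be placed on a 160-bit ring cut into 32 equal partitions. It is an illustration under assumptions, not Riak's actual code: `partitionFor` and `ownerOf` are hypothetical helpers, the bucket name `members` is made up, the key is hashed as a plain `bucket/key` string (Riak's real encoding differs), and round-robin ownership is only a stand-in for Riak's ring claim logic.

```javascript
const crypto = require("crypto");

// 2^160: the size of the SHA-1 output space, matching the slide's 160-bit integer space.
const RING_SIZE = 2n ** 160n;

// Hypothetical helper: which of `partitionCount` equal slices of the ring holds this key?
function partitionFor(bucket, key, partitionCount = 32) {
  // Assumption: hash "bucket/key" as a plain string.
  const hex = crypto.createHash("sha1").update(`${bucket}/${key}`).digest("hex");
  const position = BigInt("0x" + hex); // a point on the ring
  const sliceSize = RING_SIZE / BigInt(partitionCount);
  return Number(position / sliceSize); // 0 .. partitionCount - 1
}

// Hypothetical helper: hand partitions to nodes round-robin.
function ownerOf(partition, nodes) {
  return nodes[partition % nodes.length];
}

const nodes = ["dev1@127.0.0.1", "dev2@127.0.0.1", "dev3@127.0.0.1", "dev4@127.0.0.1"];
const partition = partitionFor("members", "member-1");
console.log(partition, ownerOf(partition, nodes));
```

Because the hash decides the partition and the ring decides the owner, adding or removing a node only changes who owns which partitions; the key-to-partition mapping itself stays the same.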
  10. 20.
  11. 23.

    Clients & connect
    - Supports: Ruby, Java, Erlang, PHP, Python, C/C++, JavaScript... and more
  12. 24.

    Bucket
    - It’s a collection of keys
    - It’s not a table
    - Please do not search all of me (the bucket)
  13. 27.

    Riak Basic Query
    - Query: GET /riak/bucket/key
    - Store: PUT /riak/bucket/key (headers: Content-Type, X-Riak-Vclock)
    - Delete: DELETE /riak/bucket/key
    - Example key: member-1
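The three basic operations differ only in HTTP method and headers, which a small sketch can show. `riakRequest` is a hypothetical helper that only describes the request (it sends nothing), the bucket name `members` is made up, and the default content type is an assumption; the paths and header names are the ones on the slide.

```javascript
// Hypothetical helper: describe the HTTP request for a basic key/value operation.
function riakRequest(op, bucket, key, options = {}) {
  const methods = { query: "GET", store: "PUT", delete: "DELETE" };
  const method = methods[op];
  if (!method) throw new Error(`unknown operation: ${op}`);

  const request = {
    method,
    path: `/riak/${encodeURIComponent(bucket)}/${encodeURIComponent(key)}`,
    headers: {},
  };
  if (op === "store") {
    // Assumption: default to JSON when the caller does not say otherwise.
    request.headers["Content-Type"] = options.contentType || "application/json";
    // Send back the vector clock from the last read, if the caller has one.
    if (options.vclock) request.headers["X-Riak-Vclock"] = options.vclock;
    request.body = options.body;
  }
  return request;
}

console.log(riakRequest("query", "members", "member-1"));
console.log(riakRequest("store", "members", "member-1", { body: '{"name":"Ann"}' }));
console.log(riakRequest("delete", "members", "member-1"));
```

Passing the vector clock from a previous read along with a store lets Riak tell an update of a known version apart from a conflicting write.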
  14. 28.

    Riak Secondary Index
    - Tag an object with one or more values
    - Types: integer, string
    - Searches can be exact match or range
    - Results can be the input of MapReduce
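As a rough sketch of tagging and searching, the helpers below build the index header for a store and the paths for exact-match and range queries. The `x-riak-index-` header prefix, the `_int` / `_bin` suffixes, and the `/buckets/<bucket>/index/...` path follow Riak's HTTP interface as I recall it, so treat them as assumptions to check against the docs for your version; the helper names, the bucket `members`, and the index names `age` and `tag` are hypothetical.

```javascript
// Integers get the _int suffix, strings get _bin (assumed convention).
function indexField(name, value) {
  return `${name}_${Number.isInteger(value) ? "int" : "bin"}`;
}

// Header to send with a store request in order to tag the object.
function indexHeader(name, value) {
  return { [`x-riak-index-${indexField(name, value)}`]: String(value) };
}

// Path for an exact-match search on one index value.
function exactQueryPath(bucket, name, value) {
  return `/buckets/${bucket}/index/${indexField(name, value)}/${encodeURIComponent(value)}`;
}

// Path for a range search between two index values (inclusive).
function rangeQueryPath(bucket, name, start, end) {
  const field = indexField(name, start);
  return `/buckets/${bucket}/index/${field}/${encodeURIComponent(start)}/${encodeURIComponent(end)}`;
}

console.log(indexHeader("tag", "vip"));
console.log(exactQueryPath("members", "tag", "vip"));
console.log(rangeQueryPath("members", "age", 18, 30));
```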
  15. 29.

    Riak Full Text Search
    - Built on Riak Core
    - Advanced search tool, concise, easy to use
    - Various MIME types: JSON, XML, plain text, Erlang, Erlang binaries
    - Various analyzers: whitespace, integer, no-op...
    - Queries: wildcards, include/exclude, and/or/not, range, grouping, prefix match, proximity search, term boosting
    - Scoring or ranking for results
    - Results can be the input of Map/Reduce
  16. 31.

    Mapper
    - Input: the key-data (regular expressions available)
    - Returns a list of values
    - Parallel results are aggregated into a single list
    - What can you do with it? Count words, extract data
  17. 32.

    Reducer
    - Input: the list of results
    - Merges results
    - Two processes per reducer
    - What can you do with it? Sort, aggregate
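The mapper and reducer roles can be tried out locally with a small word count, one of the uses the slides mention. This is only a single-process sketch: `mapWords`, `reduceCounts`, and `runJob` are hypothetical names, the two sample objects are invented, and `runJob` stands in for the cluster, which would run the map step beside the data on each node.

```javascript
// Map: one stored object in, a list of [word, 1] pairs out.
function mapWords(object) {
  const words = object.data.toLowerCase().match(/[a-z']+/g) || [];
  return words.map((word) => [word, 1]);
}

// Reduce: merge pairs that share a word. It accepts its own output as input,
// so it can be applied again to partially reduced results.
function reduceCounts(pairs) {
  const totals = new Map();
  for (const [word, count] of pairs) {
    totals.set(word, (totals.get(word) || 0) + count);
  }
  // Highest count first; ties broken alphabetically to keep the order stable.
  return [...totals.entries()].sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
}

// Stand-in for the cluster: map every object, gather the results into one list, reduce.
function runJob(objects) {
  const mapped = objects.flatMap(mapWords);
  return reduceCounts(mapped);
}

// Invented sample data.
const objects = [
  { key: "a.txt", data: "Riak is fast" },
  { key: "b.txt", data: "Riak is masterless" },
];
console.log(runJob(objects));
// → [ [ 'is', 2 ], [ 'riak', 2 ], [ 'fast', 1 ], [ 'masterless', 1 ] ]
```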
  18. 33.

    Mapper Example
    inputs:
        [["map_demo","key1.txt"], ["map_demo","key2.txt"], ["map_demo","key3.txt"]]
    source:
        function(v) {
          // Count case-insensitive occurrences of "demo" in the object's data
          var m = v.values[0].data.match(/demo/gi);
          // match() returns null when nothing matches
          return [ [v.key, m ? m.length : 0] ];
        }
    Results:
        [ ["key3.txt",5], ["key2.txt",1], ["key1.txt",2] ]
  19. 34.

    Reducer Example
    inputs:
        [ ["key3.txt",5], ["key2.txt",1], ["key1.txt",2] ]
    source:
        function(values) {
          // Turn [key, count] pairs into objects; objects from an earlier reduce pass go through unchanged
          var items = values.map(function(v) {
            return (v instanceof Array) ? { key: v[0], length: v[1] } : v;
          });
          // Sort by match count, highest first
          return items.sort(function(a, b) {
            return b["length"] - a["length"];
          });
        }
    Results:
        [ {"key":"key3.txt","length":5}, {"key":"key1.txt","length":2}, {"key":"key2.txt","length":1} ]
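To show how a mapper and reducer like these travel to the cluster, the sketch below packs them into a job description. The overall shape (`inputs` plus a `query` list of phases, each with `language` and `source`) is how I recall Riak's JSON job format for its `/mapred` HTTP endpoint, so treat it as an assumption; `buildJob` is a hypothetical helper, and the reducer here is a simplified variant that just sorts `[key, count]` pairs.

```javascript
// Mapper from the slides, taking the stored object as its argument.
const mapper = function (v) {
  var m = v.values[0].data.match(/demo/gi);
  return [[v.key, m ? m.length : 0]];
};

// Simplified reducer: sort [key, count] pairs by count, highest first.
const reducer = function (values) {
  return values.sort(function (a, b) {
    return b[1] - a[1];
  });
};

// Hypothetical helper: wrap the two functions as phases of one job.
function buildJob(inputs, mapFn, reduceFn) {
  const phase = (fn) => ({ language: "javascript", source: fn.toString() });
  return {
    inputs,
    query: [{ map: phase(mapFn) }, { reduce: phase(reduceFn) }],
  };
}

const job = buildJob(
  [["map_demo", "key1.txt"], ["map_demo", "key2.txt"], ["map_demo", "key3.txt"]],
  mapper,
  reducer
);
// The JSON text is what would be sent as the request body.
console.log(JSON.stringify(job, null, 2));
```

Shipping the functions as source text is what lets each phase run beside the data instead of pulling every object back to the client.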
  20. 36.

    Hadoop (similarities)
    - Provides data locality
    - Phases run beside the data
    - Distributed across multiple machines
  21. 37.

    Hadoop (differences)
    - Used for large, long-running jobs
    - Restarts failed tasks
    - 3 phases (map, combine, reduce)
  22. 38.

    CouchDB (differences)
    - Computes cached views for lookups
    - Runs over all docs in a database
    - Not distributed across multiple machines
    - No query-time arguments
    - 2 phases (map, reduce)
  23. 39.

    MongoDB (differences)
    - Does not run in parallel
    - Not spread across multiple machines
    - 3 phases (map, reduce, finalize)
  24. 41.

    Riak-Control (Rekon)
    - Restart, start, shut down nodes
    - Snapshot
    - Ring status
    - Look up objects
    - Run Map/Reduce
    - Live graph
    - Console
    HTTPS://GITHUB.COM/BASHO/RIAK_CONTROL
  25. 44.

    Add Node
        riak-admin cluster join dev1@127.0.0.1

        ======================= Membership ========================
        Status     Ring    Pending    Node
        -----------------------------------------------------------
        valid     25.0%      --      'dev1@127.0.0.1'
        valid     25.0%      --      'dev2@127.0.0.1'
        valid     25.0%      --      'dev3@127.0.0.1'
        valid     25.0%      --      'dev4@127.0.0.1'
        -----------------------------------------------------------
        Valid:4 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
  26. 45.

    Remove Node
        riak-admin cluster leave dev1@127.0.0.1

        ======================= Membership ========================
        Status     Ring    Pending    Node
        -----------------------------------------------------------
        valid     33.3%      --      'dev2@127.0.0.1'
        valid     33.3%      --      'dev3@127.0.0.1'
        valid     33.3%      --      'dev4@127.0.0.1'
        -----------------------------------------------------------
        Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
  27. 47.

    We Are Hiring!
    - Looking for: mobile developers, web developers
    - Part/full time, others..
    - Flexible office hours
    - We use: Git; Ruby, ObjC, Java; MySQL, NoSQL; AWS
  28. 48.