Web Scale with NoSQL

Web Scale with NoSQL Sergejus Barinovas (@sergejusb) http://sergejus.blogas.lt

Who Am I?
- Architect at
- Running NoSQL servers in production
- Blogger (http://sergejus.blogas.lt, @sergejusb)
- Community member (http://dotnetgroup.lt)
- Contact me via [email protected]

Powered by RDBMS
- Used everywhere…
- …even where it shouldn’t
- Used for 30+ years!

Back to the 1980s…

Data boom

in numbers
- 600 000 000 users
- 30 000 servers
- 20+ TB raw data per day
- >20 PB stored data

You really think they use RDBMS?

RDBMS Scaling Example

Simple usage
- Customers: a single master serves all reads / writes

Scale reads
- Customers: one master, two slaves

Scale writes
- Customers [A-M]: master
- Customers [N-Z]: master

Scale reads / writes
- Customers [A-M]: one master, two slaves
- Customers [N-Z]: one master, two slaves
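
The layout on this slide can be sketched as a small routing class. This is only an illustration under stated assumptions: the server names are hypothetical, the split is by the first letter of the customer name, writes go to the shard's master, and reads are spread round-robin over that shard's slaves.

```python
import itertools


class ShardedCustomers:
    """Picks a server for a customer based on the first letter of the name."""

    def __init__(self):
        # Hypothetical server names: each shard has one master and two slaves.
        self.shards = {
            "A-M": {"master": "db-am-master",
                    "slaves": ["db-am-slave1", "db-am-slave2"]},
            "N-Z": {"master": "db-nz-master",
                    "slaves": ["db-nz-slave1", "db-nz-slave2"]},
        }
        # One round-robin iterator per shard for distributing reads.
        self._next_slave = {
            name: itertools.cycle(shard["slaves"])
            for name, shard in self.shards.items()
        }

    def shard_for(self, customer_name):
        first = customer_name[:1].upper()
        if not ("A" <= first <= "Z"):
            raise ValueError("customer name must start with a letter A-Z")
        return "A-M" if first <= "M" else "N-Z"

    def server_for_write(self, customer_name):
        # All writes for a shard go to its single master.
        return self.shards[self.shard_for(customer_name)]["master"]

    def server_for_read(self, customer_name):
        # Reads rotate over the slaves of the shard.
        return next(self._next_slave[self.shard_for(customer_name)])


router = ShardedCustomers()
print(router.server_for_write("Nora"))
print(router.server_for_read("Bob"))
```

Even in this toy form, the application has to know the shard boundaries, which is part of why the next slide is about failure.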

Pray your system won’t fail

Why NoSQL
- Limited SQL scalability
  - Sharding and vertical partitioning
- Limited SQL availability
  - Master / slave configuration
- Limited SQL speed of read operations
  - Multiple read replicas
- SQL limitations for huge amounts of data
  - Key / value / type columns

NoSQL history
- 2009, Eric Evans, no:sql(est)
- NoSQL: open source distributed databases, not relational SQL databases
- NoSQL: not only SQL
- NoSQL → Big Data

NoSQL characteristics (1/2)
- Scalability
  - The ability to horizontally scale simple-operation throughput over many servers
- BASE
  - A “weaker” concurrency model than the ACID transactions in most SQL systems

NoSQL characteristics (2/2)
- Distributed
  - Efficient use of distributed indexes and RAM for data storage
- Schema-less
  - The ability to dynamically define new attributes or data schema

CAP theorem
- 2000, Eric Brewer
- It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
  - Consistency
  - Availability
  - Partition tolerance

NoSQL Databases

NoSQL categories
- Key / value store
- Document database
- Graph database
- Columnar database

Key / value store
- <key, value> or Tuple<key, v1, ..., vn>
- Simple operations
  - Get
  - Put
  - Delete
- Key: Byte[], Value: Byte[]
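
The three operations above can be sketched as a minimal in-memory store. This is an illustration, not the API of any particular product: the class name and method signatures are assumptions, and keys and values are plain bytes, so serialization (here JSON) is left to the caller.

```python
import json
from typing import Optional


class KeyValueStore:
    """Minimal in-memory key / value store over raw bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        # Overwrites any existing value for the key.
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        # Returns True if a value was actually removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
# The store sees only bytes; the caller decides how to encode the value.
store.put(b"sergejusb", json.dumps({"blog": "http://sergejus.blogas.lt"}).encode("utf-8"))
profile = json.loads(store.get(b"sergejusb").decode("utf-8"))
print(profile["blog"])
```

Because the store never inspects the value, it cannot query by anything other than the key.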

Key / value store

Key              Value
“current_date”   2013.02.01
“sergejusb”      Binary Object
“sergejusb”      JSON Object

Key / value stores
- Redis
  - (+) messaging
  - (-) no shards
- Voldemort
- Membase
  - (+) memcache interface
- Riak

Document database
- Document == complex object
  - XML
  - YAML
  - JSON / BSON
- Support for secondary indexes
- Schema can be defined at runtime
- Optional support for simple querying using Map / Reduce
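
A rough sketch of what "secondary indexes" and "schema defined at runtime" mean in practice follows. It is a toy model with hypothetical names, not how any specific document database is implemented: documents are plain dicts, and an index maps a field value to the ids of the documents that contain it.

```python
from collections import defaultdict


class DocumentStore:
    """Toy document store with secondary indexes on chosen fields."""

    def __init__(self, indexed_fields):
        self.docs = {}
        # field name -> (field value -> set of document ids)
        self.indexes = {field: defaultdict(set) for field in indexed_fields}

    def save(self, doc_id, doc):
        # Drop stale index entries first, then index the new version.
        self.remove(doc_id)
        self.docs[doc_id] = doc
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def remove(self, doc_id):
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for field, index in self.indexes.items():
            if field in old:
                index[old[field]].discard(doc_id)

    def find(self, field, value):
        # Looks up ids through the index instead of scanning every document.
        ids = self.indexes[field].get(value, ())
        return [self.docs[doc_id] for doc_id in sorted(ids)]


store = DocumentStore(indexed_fields=["city"])
store.save("1", {"name": "sergejus", "city": "Vilnius"})
# The second document carries an extra attribute; no schema change is needed.
store.save("2", {"name": "tdagys", "city": "Vilnius", "twitter": "@tdagys"})
print([doc["name"] for doc in store.find("city", "Vilnius")])
```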

Document databases
- MongoDB
  - (+) shards
- CouchDB
  - (+) master / master replication

Graph database
- Graph == network
- Basic constructs
  - Node
  - Edge
  - Properties
- (diagram: nodes sergejus, sergejus.blogas.lt and tdagys, connected by edges labelled “knows”)
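
The three constructs can be sketched as a tiny property graph. The class is hypothetical, and the direction of the "knows" edges and the property values are assumptions made for illustration, since the slide diagram does not survive in the transcript.

```python
class Graph:
    """Toy property graph: nodes and edges both carry key/value properties."""

    def __init__(self):
        self.nodes = {}   # node id -> properties dict
        self.edges = []   # (source id, label, target id, properties dict)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source, label, target, properties))

    def neighbors(self, node_id, label):
        # Follow outgoing edges with the given label.
        return [t for s, l, t, _ in self.edges if s == node_id and l == label]


g = Graph()
g.add_node("sergejus", kind="person")                 # assumed property
g.add_node("tdagys", kind="person")                   # assumed property
g.add_node("sergejus.blogas.lt", kind="blog")         # assumed property
g.add_edge("sergejus", "knows", "tdagys")             # assumed direction
g.add_edge("tdagys", "knows", "sergejus.blogas.lt")   # assumed direction
print(g.neighbors("sergejus", "knows"))
```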

Graph databases
- Neo4j
  - (-) paid version required for scaling
- FlockDB
  - (+) fast
  - (-) limited functionality

Columnar database
- For HUGE amounts of data
- Columns are added at runtime
- Great scalability
  - Horizontal
  - Vertical

Columnar database
- Unusual data model
  - Key Space → Database
  - Column Family → Table
- Columns and Super Columns
  - Super Column → array of Columns
  - Column → Tuple<Key, Value, Timestamp, TTL>
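
The mapping above can be sketched with plain Python structures. This is an illustrative model only: the field types, the nested-dict layout and the sample names are assumptions, and the "newest timestamp wins" rule in resolve() is one common way such a timestamp is used, not something the slide states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    """Column as a Tuple<Key, Value, Timestamp, TTL>."""
    key: str
    value: bytes
    timestamp: float             # write time, in seconds
    ttl: Optional[float] = None  # lifetime in seconds; None means no expiry

    def is_expired(self, now: float) -> bool:
        return self.ttl is not None and now >= self.timestamp + self.ttl


def resolve(a: Column, b: Column) -> Column:
    # Assumed conflict rule: the column with the newer timestamp wins.
    return a if a.timestamp >= b.timestamp else b


# Key Space -> Column Family -> row key -> column key -> Column
# (all names below are hypothetical sample data)
keyspace = {
    "Customers": {
        "sergejusb": {
            "blog": Column("blog", b"http://sergejus.blogas.lt", timestamp=100.0),
            "session": Column("session", b"abc", timestamp=100.0, ttl=60.0),
        }
    }
}

row = keyspace["Customers"]["sergejusb"]
print(row["session"].is_expired(now=200.0))
```

Each row can hold a different set of columns, which is what makes adding columns at runtime cheap in this model.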

Columnar database
- Simple column

Columnar database
- Cassandra
  - (+) easily scalable
- HBase
  - (+) consistent
  - (+) part of Hadoop
- Hypertable

NoSQL is Cool! But…

NoSQL limitations
- ORDER BY ?
  - Natural key order
- GROUP BY ?
  - Map / Reduce*
- JOIN ?
  - Multiple Map / Reduce*
- SELECT * ?
  - Multi-machine Map / Reduce*
*if possible
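
To make the GROUP BY case concrete, here is a single-process sketch of Map / Reduce computing the equivalent of SELECT country, SUM(total) ... GROUP BY country. The map_reduce helper and the sample orders are hypothetical; a real system would run the map and reduce phases across many machines.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Runs mapper over every record, groups emitted values by key, then reduces."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical sample data.
orders = [
    {"country": "LT", "total": 10},
    {"country": "US", "total": 5},
    {"country": "LT", "total": 20},
]


def mapper(order):
    # Emit the grouping key together with the value to aggregate.
    yield order["country"], order["total"]


def reducer(country, totals):
    # Aggregate all values collected for one key.
    return sum(totals)


totals_by_country = map_reduce(orders, mapper, reducer)
print(totals_by_country)
```

A JOIN would need one pass per input plus another to combine the results, which is why the slide lists it as multiple Map / Reduce jobs.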

NoSQL Limitations
- Maturity
- Tooling
- Specificity

SQL vs. NoSQL
- Choose the right tool for the task
- You can use BOTH

Thank you! Sergejus Barinovas (@sergejusb) [email protected] http://sergejus.blogas.lt
