
NoSQL – What’s that.pdf

Sergejus
January 25, 2011


Transcript

  1. NoSQL – Why?
     • SQL limitations for storing huge amounts of data
     • Key / value / type columns
  2. NoSQL History
     • 2009, Eric Evans
     • NoSQL – open source distributed databases, not relational SQL databases
     • NoSQL – not only SQL
     • NoSQL → Big Data
  3. NoSQL Characteristics (BASE)
     • A “weaker” concurrency model than the ACID transactions in most SQL systems
  4. NoSQL Characteristics (distributed)
     • Efficient use of distributed indexes and RAM for data storage
  5. NoSQL Characteristics (schema-less)
     • The ability to dynamically define new attributes or data schema
  6. ACID (transactions)
     • Atomicity – all or nothing
     • Consistency – state integrity
     • Isolation – no reads of uncommitted data
     • Durability – recover committed transactions
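The atomicity bullet above can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for "most SQL systems"; the `accounts` table, the `transfer` helper and the sample balances are hypothetical and not taken from the slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a later failure has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was rolled back.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, `bob` does not keep the 500 that was credited before the debit was rejected: all or nothing.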
  7. CAP Theorem
     • 2000, Eric Brewer
     • It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
       • Consistency
       • Availability
       • Partition tolerance
  8. BASE (eventual consistency)
     • Basically Available – partial system failures are OK
     • Soft state – inconsistency is OK
     • Eventual consistency – stale data is OK
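A toy simulation may help make "stale data is OK" concrete. This is only a sketch under simplifying assumptions: replicas are in-process dictionaries, replication is a queue drained by an explicit `sync()` call, and the class and method names are invented for illustration. No real NoSQL product works exactly this way.

```python
from collections import deque

class EventuallyConsistentStore:
    """A write lands on one replica at once; the others catch up when sync() runs."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # (replica_index, key, value) updates not yet delivered

    def write(self, key, value):
        self.replicas[0][key] = value  # acknowledged after a single replica has it
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # Any replica may answer, even one that has not seen the latest write.
        return self.replicas[replica_index].get(key)

    def sync(self):
        # Deliver every queued update; afterwards all replicas agree.
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value

cluster = EventuallyConsistentStore()
cluster.write("cart", ["book"])
stale = cluster.read("cart", replica_index=2)  # replica 2 has not caught up yet
cluster.sync()
fresh = cluster.read("cart", replica_index=2)  # now consistent with replica 0
```

Between `write` and `sync` the system stays available for reads but may return old data (soft state); once replication completes, every replica converges on the same value.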
  9. NoSQL Categories
     • Key / value store
     • Document database
     • Graph database
     • Columnar database
  10. Key / value store
      • <key, value> or Tuple<key, v1, …, vn>
      • Simple operations
        • Get
        • Put
        • Delete
      [Diagram: Key = Byte[], Value = Byte[]]
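The three operations listed above can be sketched as a minimal in-memory store. The `KeyValueStore` class is hypothetical; it treats keys and values as opaque byte strings, as the slide's `Byte[]` diagram suggests, and ignores persistence and distribution.

```python
from typing import Dict, Optional

class KeyValueStore:
    """In-memory store mapping opaque byte keys to opaque byte values."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        # Insert or overwrite; the store never inspects the value.
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        # Returns True if the key existed and was removed.
        return self._data.pop(key, None) is not None

kv = KeyValueStore()
kv.put(b"user:1", b'{"name": "sergejus"}')
value = kv.get(b"user:1")
```

Because the value is just bytes, any structure inside it (JSON here) is the application's concern, which is what keeps the operation set so small.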
  11. Key / value store
      • Dynamo*
      • Membase
      • Voldemort
      • Redis
      • Azure Table Storage
      • Riak
  12. Key / value store
      Name: Dynamo
      Created: 2007, Amazon (proprietary)
      Implementation: ?
      Distributed: Yes
      Replication: Multiple Servers
      CAP: AP
      API: ?
  13. Key / value store
      Name: Membase
      Created: 2010, sponsored by Zynga
      Implementation: C / C++ / Erlang
      Distributed: Yes
      Replication: Multiple Servers
      CAP: CP
      API: Memcached API, JSON
  14. Key / value store
      Name: Redis
      Created: 2009, sponsored by VMware
      Implementation: C
      Distributed: No
      Replication: Master / Slave
      CAP: CP
      API: Various Languages
  15. Key / value store
      Name: Azure Table Storage
      Created: 2008, Microsoft
      Implementation: ?
      Distributed: Yes
      Replication: Multiple Servers (DFS)
      CAP: CP
      API: .NET API, JSON
  16. Key / value store
      Name: Riak
      Created: 2008, Basho (from Akamai)
      Implementation: Erlang
      Distributed: Yes
      Replication: Multiple Servers
      CAP: AP
      API: JSON
  17. Document database
      • Document == complex object
        • XML
        • YAML
        • JSON / BSON
      • Support for secondary indexes
      • Schema can be defined at runtime
      • Optional support for simple querying using Map / Reduce
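The features listed above (runtime-defined schema, secondary indexes, Map / Reduce querying) can be sketched with plain Python dictionaries standing in for JSON documents. The sample documents, field names and helper functions below are made up for illustration and do not reflect any particular database's API.

```python
from collections import defaultdict

# Documents are nested objects; the second one carries an extra attribute
# ("city") without any schema change.
documents = {
    "1": {"name": "sergejus", "tags": ["nosql", "dotnet"]},
    "2": {"name": "tdagys", "tags": ["nosql"], "city": "Vilnius"},
}

def build_index(docs, field):
    """Secondary index: field value -> ids of the documents that hold it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        if field in doc:
            index[doc[field]].add(doc_id)
    return index

def map_tags(doc):
    """Map step: emit one (tag, 1) pair per tag in a document."""
    for tag in doc.get("tags", []):
        yield tag, 1

def map_reduce(docs, map_fn, reduce_fn):
    """Group the mapped pairs by key, then reduce each group to one value."""
    groups = defaultdict(list)
    for doc in docs.values():
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(values) for key, values in groups.items()}

by_name = build_index(documents, "name")
tag_counts = map_reduce(documents, map_tags, sum)
```

Here `by_name["tdagys"]` finds a document without scanning the whole collection, and `tag_counts` answers "how many documents use each tag" with a map step and a `sum` reduce.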
  18. Document database
      Name: MongoDB
      Created: 2008, 10gen
      Implementation: C++
      Distributed: Yes via Shards
      Replication: Master / Slave
      CAP: CP
      API: BSON
  19. Document database
      Name: RavenDB
      Created: 2010, Ayende Rahien
      Implementation: C#
      Distributed: Yes via Shards
      Replication: Master / Master
      CAP: AP
      API: .NET API, JSON
  20. Graph database
      • Graph == network
      • Basic constructs
        • Node
        • Edge
        • Properties
      [Diagram: nodes “sergejus”, “sergejus.blogas.lt” and “tdagys”, connected by two edges labeled “knows”]
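The three basic constructs can be sketched as a small property graph. The `Graph` class is hypothetical, and so are the `kind` properties and the direction of the two "knows" edges: the slide's diagram names the nodes and the edge label, but its layout is not recoverable from the transcript.

```python
class Graph:
    """Minimal property graph: nodes and edges can both carry properties."""

    def __init__(self):
        self.nodes = {}  # node id -> properties dict
        self.edges = []  # (source id, label, target id, properties dict)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source, label, target, properties))

    def neighbours(self, source, label):
        """Ids of nodes reachable from `source` over edges with this label."""
        return [dst for src, lbl, dst, _ in self.edges
                if src == source and lbl == label]

g = Graph()
g.add_node("sergejus", kind="person")
g.add_node("tdagys", kind="person")
g.add_node("sergejus.blogas.lt", kind="blog")
# Edge directions below are assumed for the sake of the example.
g.add_edge("tdagys", "knows", "sergejus")
g.add_edge("tdagys", "knows", "sergejus.blogas.lt")
known = g.neighbours("tdagys", "knows")
```

Queries in a graph database are traversals like `neighbours`, following labeled edges from node to node, not joins over tables.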
  21. Graph database
      Name: FlockDB
      Created: 2010, Twitter
      Implementation: Scala
      Distributed: Yes
      Replication: Multiple Servers
      CAP: AP
      API: Thrift, Ruby
  22. Graph database
      Name: Neo4j
      Created: 2003, Neo Technology
      Implementation: Java
      Distributed: No
      Replication: Master / Slave
      CAP: CP
      API: JSON, Various Languages
  23. Columnar database
      • For HUGE amounts of data
      • Columns are added at runtime
      • Great scalability
        • Horizontal
        • Vertical
  24. Columnar database
      • Unusual data model
      • Key Space == Database
      • Column Family == Table
      • Columns and Super Columns
      • Super Column == array of Columns
      • Column == Tuple<Key, Value, Timestamp, TTL>
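The data model above can be sketched with nested dictionaries. The `Column` tuple mirrors the slide's `Tuple<Key, Value, Timestamp, TTL>`; everything else (the `Users` column family, the `insert` and `read` helpers, the sample values, and the last-write-wins rule based on the timestamp) is an assumption made for illustration. Super Columns are left out for brevity.

```python
from typing import NamedTuple, Optional

class Column(NamedTuple):
    key: str
    value: bytes
    timestamp: float
    ttl: Optional[int] = None  # lifetime in seconds; None means it never expires

    def expired(self, now: float) -> bool:
        return self.ttl is not None and now >= self.timestamp + self.ttl

# Key space (database) -> column family (table) -> row key -> column key -> Column
keyspace = {"Users": {}}

def insert(column_family, row_key, column):
    # Rows are created on demand and may hold different columns each.
    row = keyspace[column_family].setdefault(row_key, {})
    current = row.get(column.key)
    # Assumed conflict rule: the write with the newest timestamp wins.
    if current is None or column.timestamp >= current.timestamp:
        row[column.key] = column

def read(column_family, row_key, column_key, now):
    column = keyspace[column_family].get(row_key, {}).get(column_key)
    if column is None or column.expired(now):
        return None
    return column.value

insert("Users", "sergejus", Column("email", b"sergejus@example.com", timestamp=100.0))
insert("Users", "sergejus", Column("email", b"old@example.com", timestamp=50.0))  # older, ignored
insert("Users", "sergejus", Column("session", b"token", timestamp=100.0, ttl=60))
```

Reading `session` at `now=120.0` returns the token, while at `now=200.0` it returns `None` because the TTL has elapsed. New columns appear in a row simply by being written, which matches "columns are added at runtime".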
  25. Columnar database
      Name: BigTable
      Created: 2006, Google
      Implementation: C++
      Distributed: Yes
      Replication: Multiple Servers (GFS)
      CAP: CP
      API: C++
  26. Columnar database
      Name: Cassandra
      Created: 2008, Facebook
      Implementation: Java
      Distributed: Yes
      Replication: Multiple Servers
      CAP: AP
      API: Thrift, Avro
  27. Columnar database
      Name: HBase
      Created: 2007, Powerset
      Implementation: Java
      Distributed: Yes
      Replication: Multiple Servers (HDFS)
      CAP: CP
      API: Thrift, Java, JSON