Cassandra for PHP Developers

Cassandra for PHP Developers

For too long, PHP developers like yourself had to access Apache Cassandra through the legacy Thrift interface. With the DataStax PHP Driver, finally, the wait is over, full power of Apache Cassandra and CQL is accessible directly in your PHP applications.

In this talk we'll go over Apache Cassandra architecture and get to know the DataStax PHP driver for Apache Cassandra.

F000c9b4dd0656f60de1dc9e75f7386c?s=128

Bulat Shakirzyanov

June 15, 2015
Tweet

Transcript

  1. Cassandra for PHP developers Introduction to Apache Cassandra Overview of

    the DataStax PHP driver Bulat Shakirzyanov @avalanche123
  2. Introduction Cassandra Overview

  3. © 2015 DataStax, All Rights Reserved. Datacenter Datacenter Cassandra Topology

    3 Node Node Node Node Client Client Node Node Node Node Client Client Cluster
  4. © 2015 DataStax, All Rights Reserved. Datacenter Datacenter Request Coordinator

    4 Node Node Node Node Client Client Node Node Coordinator Node Client Client Coordinator node: Forwards requests to corresponding replicas
  5. © 2015 DataStax, All Rights Reserved. Datacenter Row Replica 5

    Replica Node Node Replica Client Client Datacenter Node Node Replica Client Client Coordinator Replica node: Stores a slice of total rows of each keyspace
  6. © 2015 DataStax, All Rights Reserved. Token Ring 6 12

    1 2 3 4 5 6 7 8 9 10 11
  7. © 2015 DataStax, All Rights Reserved. Token Ring 6 -263

    … (+263 - 1) Murmur3 Partitioner
  8. © 2015 DataStax, All Rights Reserved. Token Ring 6 Node

    11…12 Node 12…1 Node 1…2 Node 2…3 Node 3…4 Node 4…5 Node 5…6 Node 6…7 Node 7…8 Node 8…9 Node 9…10 Node 10…11 -263 … (+263 - 1) Murmur3 Partitioner
  9. © 2015 DataStax, All Rights Reserved. Keyspaces 7 CREATE KEYSPACE

    default WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 }
  10. © 2015 DataStax, All Rights Reserved. C* Data Partitioning 8

    Keyspace Row token(PK) = 1 RF = 3 Partitioner: Gets a token by hashing the primary key of a row
  11. © 2015 DataStax, All Rights Reserved. C* Replication Strategy 9

    Keyspace 1 Row RF = 3 Replication strategy: Determines the first replica for the row token(PK) = 1
  12. © 2015 DataStax, All Rights Reserved. C* Replication Factor 10

    Keyspace Row RF = 3 Replication factor: Specifies total number of replicas for each row token(PK) = 1
  13. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 11 Replica Application Consistency Level RF = 3, CL = Quorum
  14. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 11 Replica Application Consistency Level RF = 3, CL = Quorum INSERT
  15. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 11 Replica Application Consistency Level RF = 3, CL = Quorum INSERT
  16. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 11 Replica Application Consistency Level RF = 3, CL = Quorum INSERT
  17. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 11 Replica Application Consistency Level RF = 3, CL = Quorum INSERT
  18. © 2015 DataStax, All Rights Reserved. C* Quorum 12 Keyspace

    Row RF = 3 token(PK) = 1 floor(RF / 2) + 1
  19. DataStax PHP Driver Smart client for Apache Cassandra

  20. © 2015 DataStax, All Rights Reserved. Installation 14 git clone

    https://github.com/datastax/php-driver.git cd php-driver pecl install ext/package.xml
  21. © 2015 DataStax, All Rights Reserved. Usage 15 $cluster =

    Cassandra::cluster() // connects to localhost by default ->build(); $keyspace = 'system'; $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements 'SELECT keyspace_name, columnfamily_name ' . 'FROM schema_columnfamilies' ); $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution $result = $future->get(); // wait for the result, with an optional timeout foreach ($result as $row) { // results and rows implement Iterator, Countable // ArrayAccess printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']); }
  22. © 2015 DataStax, All Rights Reserved. Usage 16 $cluster =

    Cassandra::cluster() // connects to localhost by default ->build(); $keyspace = 'system'; $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements 'SELECT keyspace_name, columnfamily_name ' . 'FROM schema_columnfamilies' ); $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution $result = $future->get(); // wait for the result, with an optional timeout foreach ($result as $row) { // results and rows implement Iterator, Countable // ArrayAccess printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']); }
  23. © 2015 DataStax, All Rights Reserved. Usage 17 $cluster =

    Cassandra::cluster() // connects to localhost by default ->build(); $keyspace = 'system'; $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements 'SELECT keyspace_name, columnfamily_name ' . 'FROM schema_columnfamilies' ); $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution $result = $future->get(); // wait for the result, with an optional timeout foreach ($result as $row) { // results and rows implement Iterator, Countable // ArrayAccess printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']); }
  24. © 2015 DataStax, All Rights Reserved. Usage 18 $cluster =

    Cassandra::cluster() // connects to localhost by default ->build(); $keyspace = 'system'; $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements 'SELECT keyspace_name, columnfamily_name ' . 'FROM schema_columnfamilies' ); $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution $result = $future->get(); // wait for the result, with an optional timeout foreach ($result as $row) { // results and rows implement Iterator, Countable // ArrayAccess printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']); }
  25. © 2015 DataStax, All Rights Reserved. Usage 19 $cluster =

    Cassandra::cluster() // connects to localhost by default ->build(); $keyspace = 'system'; $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements 'SELECT keyspace_name, columnfamily_name ' . 'FROM schema_columnfamilies' ); $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution $result = $future->get(); // wait for the result, with an optional timeout foreach ($result as $row) { // results and rows implement Iterator, Countable // ArrayAccess printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']); }
  26. © 2015 DataStax, All Rights Reserved. Usage 20 $cluster =

    Cassandra::cluster() // connects to localhost by default ->build(); $keyspace = 'system'; $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements 'SELECT keyspace_name, columnfamily_name ' . 'FROM schema_columnfamilies' ); $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution $result = $future->get(); // wait for the result, with an optional timeout foreach ($result as $row) { // results and rows implement Iterator, Countable // ArrayAccess printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']); }
  27. © 2015 DataStax, All Rights Reserved. Usage 21 $cluster =

    Cassandra::cluster() // connects to localhost by default ->build(); $keyspace = 'system'; $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements 'SELECT keyspace_name, columnfamily_name ' . 'FROM schema_columnfamilies' ); $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution $result = $future->get(); // wait for the result, with an optional timeout foreach ($result as $row) { // results and rows implement Iterator, Countable // ArrayAccess printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']); }
  28. Asynchronous Execution IO Reactor and Request Pipelining

  29. © 2015 DataStax, All Rights Reserved. Asynchronous Core 23 Application

    Thread Business Logic Driver Background Thread IO Reactor
  30. © 2015 DataStax, All Rights Reserved. Request Pipelining 24 Client

    Without Request Pipelining Server Client Server With Request Pipelining 1 2 2 3 1 3 1 2 3 1 2 3
  31. © 2015 DataStax, All Rights Reserved. Parallel execution 25 $data

    = array( array(41, 'Sam'), array(35, 'Bob') ); $session = $cluster->connect("mykeyspace"); $statement = $session->prepare("UPDATE users SET age = ? WHERE user_name = ?"); $futures = array(); // execute all statements in background foreach ($data as $arguments) { $options = new ExecutionOptions(array('arguments' => $arguments)); $futures[]= $session->executeAsync($statement, $options); } // wait for all statements to complete foreach ($futures as $future) { // we will not wait for each result for more than 5 seconds $future->get(5); }
  32. © 2015 DataStax, All Rights Reserved. Parallel execution 26 $data

    = array( array(41, 'Sam'), array(35, 'Bob') ); $session = $cluster->connect("mykeyspace"); $statement = $session->prepare("UPDATE users SET age = ? WHERE user_name = ?"); $futures = array(); // execute all statements in background foreach ($data as $arguments) { $options = new ExecutionOptions(array('arguments' => $arguments)); $futures[]= $session->executeAsync($statement, $options); } // wait for all statements to complete foreach ($futures as $future) { // we will not wait for each result for more than 5 seconds $future->get(5); }
  33. © 2015 DataStax, All Rights Reserved. Parallel execution 27 $data

    = array( array(41, 'Sam'), array(35, 'Bob') ); $session = $cluster->connect("mykeyspace"); $statement = $session->prepare("UPDATE users SET age = ? WHERE user_name = ?"); $futures = array(); // execute all statements in background foreach ($data as $arguments) { $options = new ExecutionOptions(array('arguments' => $arguments)); $futures[]= $session->executeAsync($statement, $options); } // wait for all statements to complete foreach ($futures as $future) { // we will not wait for each result for more than 5 seconds $future->get(5); }
  34. © 2015 DataStax, All Rights Reserved. Parallel execution 28 $data

    = array( array(41, 'Sam'), array(35, 'Bob') ); $session = $cluster->connect("mykeyspace"); $statement = $session->prepare("UPDATE users SET age = ? WHERE user_name = ?"); $futures = array(); // execute all statements in background foreach ($data as $arguments) { $options = new ExecutionOptions(array('arguments' => $arguments)); $futures[]= $session->executeAsync($statement, $options); } // wait for all statements to complete foreach ($futures as $future) { // we will not wait for each result for more than 5 seconds $future->get(5); }
  35. Load Balancing Routing and Failover

  36. © 2015 DataStax, All Rights Reserved. Application Driver Load Balancing

    30 Application Thread Node Pool Session Pool Pool Pool Application Thread Application Thread Client Cluster Node Node Node Load Balancing Policy
  37. © 2015 DataStax, All Rights Reserved. Application Driver Load Balancing

    30 Application Thread Node Pool Session Pool Pool Pool Application Thread Application Thread Client Cluster Node Node Node Load Balancing Policy
  38. © 2015 DataStax, All Rights Reserved. Application Driver Load Balancing

    30 Application Thread Node Pool Session Pool Pool Pool Application Thread Application Thread Client Cluster Node Node Node Load Balancing Policy
  39. © 2015 DataStax, All Rights Reserved. Datacenter Datacenter DataCenter Aware

    Balancing 31 Node Node Node Client Node Node Node Client Client Client Client Client Local nodes are queried first, if non are available, the request could be sent to a remote node.
  40. © 2015 DataStax, All Rights Reserved. Token Aware Balancing 32

    Route request directly to Replicas Node Node Replica Node Client Replica Replica Uses prepared statement metadata to get the token
  41. © 2015 DataStax, All Rights Reserved. Latency Aware Balancing 33

    Route requests to the fastest nodes Client Tracks response times for each node
  42. Fault Tolerance Failures and Error Handling

  43. © 2015 DataStax, All Rights Reserved. Fault Tolerance 35 Coordinator

    Node Replica Replica Replica Node Business Logic Driver Application
  44. © 2015 DataStax, All Rights Reserved. 36 Coordinator Node Replica

    Replica Replica Node Business Logic Driver Application Invalid Requests Network Timeouts Server Errors Possible Failures
  45. © 2015 DataStax, All Rights Reserved. Application Driver Automatic Retry

    of Server Errors 37 Application Thread Node Pool Session Pool Pool Pool Application Thread Application Thread Client Cluster Node Node Node Load Balancing Policy
  46. © 2015 DataStax, All Rights Reserved. Application Driver Automatic Retry

    of Server Errors 37 Application Thread Node Pool Session Pool Pool Pool Application Thread Application Thread Client Cluster Node Node Node Load Balancing Policy
  47. © 2015 DataStax, All Rights Reserved. Application Driver Automatic Retry

    of Server Errors 37 Application Thread Node Pool Session Pool Pool Pool Application Thread Application Thread Client Cluster Node Node Node Load Balancing Policy
  48. © 2015 DataStax, All Rights Reserved. 38 Coordinator Node Replica

    Replica Replica Node Business Logic Driver Application Unreachable Consistency
  49. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 39 Replica Business Logic Driver Application Read / Write Timeout Error
  50. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 39 Replica Business Logic Driver Application Read / Write Timeout Error
  51. © 2015 DataStax, All Rights Reserved. Coordinator Node Replica Replica

    Node 39 Replica Business Logic Driver Application Read / Write Timeout Error read / write timeout
  52. © 2015 DataStax, All Rights Reserved. 40 Coordinator Node Replica

    Replica Replica Node Business Logic Driver Application Unavailable Error
  53. © 2015 DataStax, All Rights Reserved. 40 Coordinator Node Replica

    Replica Replica Node Business Logic Driver Application Unavailable Error unavailable
  54. © 2015 DataStax, All Rights Reserved. Other • Persistent Sessions

    • Security • Datatypes 41
  55. Questions