Slide 1

Slide 1 text

Cassandra for PHP developers
Introduction to Apache Cassandra
Overview of the DataStax PHP driver
Bulat Shakirzyanov @avalanche123

Slide 2

Slide 2 text

Introduction
Cassandra Overview

Slide 3

Slide 3 text

© 2015 DataStax, All Rights Reserved.
Cassandra Topology
[Diagram: a cluster of two datacenters, each containing nodes and the clients connected to them]

Slide 4

Slide 4 text

Request Coordinator
[Diagram: two datacenters of nodes and clients; one node acts as the coordinator]
Coordinator node: forwards requests to the corresponding replicas

Slide 5

Slide 5 text

Row Replica
[Diagram: two datacenters of nodes and clients, with a coordinator and several replica nodes]
Replica node: stores a slice of the total rows of each keyspace

Slide 6

Slide 6 text

Token Ring
[Diagram: a ring divided by twelve tokens, numbered 1 to 12]

Slide 7

Slide 7 text

Token Ring
Murmur3 Partitioner: tokens range from -2^63 to +2^63 - 1

Slide 8

Slide 8 text

Token Ring
[Diagram: twelve nodes placed on the ring, each owning one token range]
Node ranges: 11…12, 12…1, 1…2, 2…3, 3…4, 4…5, 5…6, 6…7, 7…8, 8…9, 9…10, 10…11
Murmur3 Partitioner: tokens range from -2^63 to +2^63 - 1
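
The range ownership on this slide can be sketched in plain PHP. This is only an illustration, not driver code: the function name ownerOf and the toy tokens 10, 20, …, 120 are assumptions made for readability, and it assumes each node owns the range that ends at its own token.

```php
<?php
/**
 * Returns the token of the node that owns $token on a non-empty ring.
 * Assumption: a node with token T owns the range (previous node's token, T].
 */
function ownerOf(int $token, array $nodeTokens): int
{
    sort($nodeTokens);
    foreach ($nodeTokens as $nodeToken) {
        if ($token <= $nodeToken) {
            return $nodeToken;
        }
    }
    // Past the largest token: wrap around to the first node on the ring.
    return $nodeTokens[0];
}

// Hypothetical twelve-node ring; real Murmur3 tokens span -2^63 .. 2^63 - 1.
$ring = range(10, 120, 10);

echo ownerOf(35, $ring), "\n";  // 40
echo ownerOf(125, $ring), "\n"; // 10 (wraps around the ring)
```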

Slide 9

Slide 9 text

Keyspaces
CREATE KEYSPACE default
WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 3
}

Slide 10

Slide 10 text

C* Data Partitioning
[Diagram: a keyspace with RF = 3 and a row with token(PK) = 1]
Partitioner: gets a token by hashing the partition key of a row (the first part of its primary key)

Slide 11

Slide 11 text

C* Replication Strategy
[Diagram: a keyspace with RF = 3 and a row with token(PK) = 1]
Replication strategy: determines the first replica for the row

Slide 12

Slide 12 text

C* Replication Factor
[Diagram: a keyspace with RF = 3 and a row with token(PK) = 1]
Replication factor: specifies the total number of replicas for each row
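
The last three slides (token, first replica, replication factor) can be put together in a small sketch. It assumes SimpleStrategy-like placement, as in the CREATE KEYSPACE example: the first replica is the node that owns the row's token, and the remaining replicas are the next nodes clockwise on the ring. The function name replicasFor and the toy tokens 1…12 are illustrative only.

```php
<?php
/**
 * Returns the tokens of the nodes holding replicas of a row.
 * Assumes SimpleStrategy-like placement on a non-empty ring.
 */
function replicasFor(int $rowToken, array $nodeTokens, int $replicationFactor): array
{
    sort($nodeTokens);
    $count = count($nodeTokens);

    // Index of the first node whose token is >= the row token (wraps to index 0).
    $first = 0;
    foreach ($nodeTokens as $i => $nodeToken) {
        if ($rowToken <= $nodeToken) {
            $first = $i;
            break;
        }
    }

    // Walk clockwise, never returning more replicas than there are nodes.
    $replicas = [];
    for ($k = 0; $k < min($replicationFactor, $count); $k++) {
        $replicas[] = $nodeTokens[($first + $k) % $count];
    }
    return $replicas;
}

// The slide's example: token(PK) = 1, RF = 3, on a ring with tokens 1..12.
echo implode(', ', replicasFor(1, range(1, 12), 3)), "\n";  // 1, 2, 3
echo implode(', ', replicasFor(12, range(1, 12), 3)), "\n"; // 12, 1, 2
```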

Slide 13

Slide 13 text

Consistency Level
RF = 3, CL = Quorum
[Diagram: an application, a coordinator node and three replicas]

Slide 14

Slide 14 text

Consistency Level
RF = 3, CL = Quorum
[Diagram: an application, a coordinator node and three replicas handling an INSERT]

Slide 18

Slide 18 text

C* Quorum
[Diagram: a keyspace with RF = 3 and a row with token(PK) = 1]
Quorum = floor(RF / 2) + 1
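
A worked version of the quorum formula, as a minimal sketch. The function name quorum is just a label for the slide's floor(RF / 2) + 1.

```php
<?php
// Number of replicas that must respond at consistency level QUORUM.
function quorum(int $replicationFactor): int
{
    return intdiv($replicationFactor, 2) + 1;
}

foreach ([1, 3, 5] as $rf) {
    printf("RF = %d -> quorum = %d\n", $rf, quorum($rf));
}
// RF = 1 -> quorum = 1
// RF = 3 -> quorum = 2
// RF = 5 -> quorum = 3
```

With RF = 3, as on the slide, two of the three replicas have to respond.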

Slide 19

Slide 19 text

DataStax PHP Driver
Smart client for Apache Cassandra

Slide 20

Slide 20 text

Installation
git clone https://github.com/datastax/php-driver.git
cd php-driver
pecl install ext/package.xml

Slide 21

Slide 21 text

Usage
$cluster = Cassandra::cluster()   // connects to localhost by default
    ->build();
$keyspace = 'system';
$session = $cluster->connect($keyspace);   // create session, optionally scoped to a keyspace
$statement = new Cassandra\SimpleStatement(   // also supports prepared and batch statements
    'SELECT keyspace_name, columnfamily_name ' .
    'FROM schema_columnfamilies'
);
$future = $session->executeAsync($statement);   // fully asynchronous and easy parallel execution
$result = $future->get();   // wait for the result, with an optional timeout
foreach ($result as $row) {   // results and rows implement Iterator, Countable, ArrayAccess
    printf("The keyspace %s has a table %s\n", $row['keyspace_name'], $row['columnfamily_name']);
}

Slide 28

Slide 28 text

Asynchronous Execution
IO Reactor and Request Pipelining

Slide 29

Slide 29 text

Asynchronous Core
[Diagram: the application thread runs the business logic; the driver's background thread runs the IO reactor]

Slide 30

Slide 30 text

Request Pipelining
[Diagram: client/server exchange of requests 1, 2 and 3, without and with request pipelining]

Slide 31

Slide 31 text

Parallel execution
$data = array(
    array(41, 'Sam'),
    array(35, 'Bob')
);
$session = $cluster->connect("mykeyspace");
$statement = $session->prepare("UPDATE users SET age = ? WHERE user_name = ?");
$futures = array();
// execute all statements in background
foreach ($data as $arguments) {
    $options = new Cassandra\ExecutionOptions(array('arguments' => $arguments));
    $futures[] = $session->executeAsync($statement, $options);
}
// wait for all statements to complete
foreach ($futures as $future) {
    // we will not wait for each result for more than 5 seconds
    $future->get(5);
}

Slide 35

Slide 35 text

Load Balancing
Routing and Failover

Slide 36

Slide 36 text

Load Balancing
[Diagram: application threads, a driver session with one pool per node, the load balancing policy, and the cluster nodes]

Slide 39

Slide 39 text

DataCenter Aware Balancing
[Diagram: two datacenters, each with nodes and clients]
Local nodes are queried first; if none are available, the request can be sent to a remote node.
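
The "local first, remote as fallback" ordering can be sketched in plain PHP. This is an illustration of the idea, not the driver's implementation: the function name queryPlan, the host addresses and the datacenter names are all made up for the example.

```php
<?php
/**
 * Orders hosts for a request: hosts in the local datacenter first,
 * hosts in remote datacenters afterwards as a fallback.
 * $hosts maps a host address to its datacenter name (both hypothetical).
 */
function queryPlan(array $hosts, string $localDc): array
{
    $local = [];
    $remote = [];
    foreach ($hosts as $address => $dc) {
        if ($dc === $localDc) {
            $local[] = $address;
        } else {
            $remote[] = $address;
        }
    }
    return array_merge($local, $remote);
}

$hosts = [
    '10.0.1.1' => 'dc1',
    '10.0.2.1' => 'dc2',
    '10.0.1.2' => 'dc1',
];

echo implode(', ', queryPlan($hosts, 'dc1')), "\n"; // 10.0.1.1, 10.0.1.2, 10.0.2.1
```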

Slide 40

Slide 40 text

Token Aware Balancing
Route requests directly to replicas
[Diagram: a client, nodes and the three replicas of a row]
Uses prepared statement metadata to get the token

Slide 41

Slide 41 text

Latency Aware Balancing
Route requests to the fastest nodes
Tracks response times for each node

Slide 42

Slide 42 text

Fault Tolerance
Failures and Error Handling

Slide 43

Slide 43 text

Fault Tolerance
[Diagram: the application (business logic and driver), a coordinator node and three replicas]

Slide 44

Slide 44 text

Possible Failures
[Diagram: the application (business logic and driver), a coordinator node and three replicas]
• Invalid Requests
• Network Timeouts
• Server Errors

Slide 45

Slide 45 text

Automatic Retry of Server Errors
[Diagram: application threads, a driver session with one pool per node, the load balancing policy, and the cluster nodes]
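
One way to picture an automatic retry is as a loop over the hosts in the load balancing plan: when a host fails, the request moves on to the next one. The sketch below is plain PHP under stated assumptions, not the driver's code: executeWithFailover and the $send callable are hypothetical, and a RuntimeException stands in for a server error.

```php
<?php
/**
 * Tries each host in the plan until one succeeds.
 * $send is a hypothetical callable that performs the request against one host
 * and throws a RuntimeException when that host fails.
 */
function executeWithFailover(array $plan, callable $send)
{
    $lastError = null;
    foreach ($plan as $host) {
        try {
            return $send($host);
        } catch (RuntimeException $e) {
            $lastError = $e; // remember the failure and move on to the next host
        }
    }
    throw new RuntimeException('All hosts failed', 0, $lastError);
}

// Usage: the first host is down, the second one answers.
$result = executeWithFailover(['node1', 'node2'], function (string $host) {
    if ($host === 'node1') {
        throw new RuntimeException("$host is down");
    }
    return "ok from $host";
});

echo $result, "\n"; // ok from node2
```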

Slide 48

Slide 48 text

Unreachable Consistency
[Diagram: the application (business logic and driver), a coordinator node and three replicas]

Slide 49

Slide 49 text

Read / Write Timeout Error
[Diagram: the application (business logic and driver), a coordinator node and three replicas]

Slide 51

Slide 51 text

Read / Write Timeout Error
[Diagram: the application (business logic and driver), a coordinator node and three replicas, labeled "read / write timeout"]

Slide 52

Slide 52 text

Unavailable Error
[Diagram: the application (business logic and driver), a coordinator node and three replicas]

Slide 53

Slide 53 text

Unavailable Error
[Diagram: the application (business logic and driver), a coordinator node and three replicas, labeled "unavailable"]

Slide 54

Slide 54 text

Other
• Persistent Sessions
• Security
• Datatypes

Slide 55

Slide 55 text

Questions