Cassandra for PHP Developers

For too long, PHP developers have had to access Apache Cassandra through the legacy Thrift interface. With the DataStax PHP Driver, the wait is over: the full power of Apache Cassandra and CQL is accessible directly in your PHP applications.

In this talk we'll go over the Apache Cassandra architecture and get to know the DataStax PHP Driver for Apache Cassandra.

Bulat Shakirzyanov

June 15, 2015
Transcript

  1. Cassandra for PHP developers
    Introduction to Apache Cassandra
    Overview of the DataStax PHP driver
    Bulat Shakirzyanov
    @avalanche123

  2. Introduction
    Cassandra Overview

  3. Cassandra Topology
    © 2015 DataStax, All Rights Reserved.
    [Diagram: one cluster spanning two datacenters, with eight nodes and four clients in total.]

  4. Request Coordinator
    Coordinator node: forwards requests to the corresponding replicas.
    [Diagram: two datacenters with nodes and clients; one of the nodes is marked as the coordinator.]

  5. Row Replica
    Replica node: stores a slice of the total rows of each keyspace.
    [Diagram: two datacenters with nodes and clients; one node is marked as the coordinator and three as replicas.]

  6. Token Ring
    [Diagram: a ring with twelve positions, numbered 1 through 12.]

  7. Token Ring
    Murmur3 Partitioner
    Token range: -2^63 … (+2^63 - 1)

  8. Token Ring
    Murmur3 Partitioner
    Token range: -2^63 … (+2^63 - 1)
    [Diagram: twelve nodes on the ring, each labeled with one range: 11…12, 12…1, 1…2, 2…3, 3…4, 4…5, 5…6, 6…7, 7…8, 8…9, 9…10, 10…11.]
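
The range ownership on this slide can be sketched in a few lines of PHP. This is only an illustration, assuming that a node owns every token from just after the previous node's token up to and including its own, and that tokens past the last node wrap around to the first. The function name `ownerOf` and the three-node ring are hypothetical and not part of the driver.

```php
<?php
// Returns the token of the node that owns $token on a ring.
// Assumption: a node owns the range (previous node's token, its own token].
function ownerOf(int $token, array $nodeTokens): int
{
    if ($nodeTokens === []) {
        throw new InvalidArgumentException('The ring has no nodes');
    }
    sort($nodeTokens);
    foreach ($nodeTokens as $nodeToken) {
        if ($token <= $nodeToken) {
            return $nodeToken;
        }
    }
    // Past the highest token: wrap around to the first node on the ring.
    return $nodeTokens[0];
}

// Hypothetical ring with nodes at positions 3, 6 and 9.
$ring = [3, 6, 9];
foreach ([1, 3, 7, 11] as $token) {
    printf("token %2d -> node at %d\n", $token, ownerOf($token, $ring));
}
```

With nodes at 3, 6 and 9, token 11 lies past the last node and wraps around to the node at 3.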

  9. Keyspaces
    CREATE KEYSPACE default WITH replication = {
      'class': 'SimpleStrategy',
      'replication_factor': 3
    }

  10. Data Partitioning
    Partitioner: gets a token by hashing the partition key of a row (the first component of its primary key).
    [Diagram: a row in a keyspace with RF = 3 maps to token(PK) = 1 on the Cassandra (C*) ring.]

  11. Replication Strategy
    Replication strategy: determines the first replica for the row.
    [Diagram: a row in Keyspace 1 with RF = 3 and token(PK) = 1 on the Cassandra (C*) ring.]

  12. Replication Factor
    Replication factor: specifies the total number of replicas for each row.
    [Diagram: a row in a keyspace with RF = 3 and token(PK) = 1 on the Cassandra (C*) ring.]
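
To make the interplay of first replica and replication factor concrete, here is a minimal sketch of SimpleStrategy-style placement: start at the first replica and walk clockwise around the ring until RF nodes have been collected. The function `replicasFor` and the node names are hypothetical, and the ring is simplified to a plain list in ring order.

```php
<?php
// Picks $rf replicas, starting at the first replica and walking clockwise.
// $ring is a hypothetical list of node names in ring order.
function replicasFor(int $firstReplicaIndex, int $rf, array $ring): array
{
    $count = min($rf, count($ring)); // cannot place more replicas than nodes
    $replicas = [];
    for ($i = 0; $i < $count; $i++) {
        $replicas[] = $ring[($firstReplicaIndex + $i) % count($ring)];
    }
    return $replicas;
}

$ring = ['node1', 'node2', 'node3', 'node4', 'node5', 'node6'];
print_r(replicasFor(0, 3, $ring)); // node1, node2, node3
print_r(replicasFor(4, 3, $ring)); // node5, node6, node1 (wraps around)
```

Topology-aware strategies place replicas differently (for example per datacenter); this sketch covers only the simple clockwise walk.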

  13. Consistency Level
    RF = 3, CL = Quorum
    [Diagram: an application, a coordinator, and a cluster in which three of the nodes are replicas.]

  14. Consistency Level
    RF = 3, CL = Quorum
    [Diagram: the same application, coordinator, and replicas, now with an INSERT request.]

  18. Quorum
    Quorum = floor(RF / 2) + 1
    [Diagram: a row in a keyspace with RF = 3 and token(PK) = 1 on the Cassandra (C*) ring.]
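
The quorum formula is easy to check with a few values. The helper below is a small sketch (the name `quorum` is ours, not the driver's); it also prints how many replicas can be down while a quorum is still reachable.

```php
<?php
// Quorum size for a given replication factor: floor(RF / 2) + 1.
function quorum(int $rf): int
{
    return intdiv($rf, 2) + 1;
}

foreach ([1, 2, 3, 5, 6] as $rf) {
    $q = quorum($rf);
    printf("RF = %d: quorum = %d, tolerates %d replica(s) down\n", $rf, $q, $rf - $q);
}
```

For the RF = 3 used throughout these slides, a quorum is 2 replicas, so one replica can be down without affecting quorum reads or writes.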

  19. DataStax PHP Driver
    Smart client for Apache Cassandra

  20. Installation
    git clone https://github.com/datastax/php-driver.git
    cd php-driver
    pecl install ext/package.xml

  21. Usage
    $cluster = Cassandra::cluster() // connects to localhost by default
        ->build();
    $keyspace = 'system';
    $session = $cluster->connect($keyspace); // create session, optionally scoped to a keyspace
    $statement = new Cassandra\SimpleStatement( // also supports prepared and batch statements
        'SELECT keyspace_name, columnfamily_name ' .
        'FROM schema_columnfamilies'
    );
    $future = $session->executeAsync($statement); // fully asynchronous and easy parallel execution
    $result = $future->get(); // wait for the result, with an optional timeout
    // results and rows implement Iterator, Countable, ArrayAccess
    foreach ($result as $row) {
        printf("The keyspace %s has a table %s\n",
            $row['keyspace_name'],
            $row['columnfamily_name']);
    }

  28. Asynchronous Execution
    IO Reactor and Request Pipelining

  29. Asynchronous Core
    [Diagram labels: Application Thread (Business Logic), Driver, Background Thread (IO Reactor).]

  30. Request Pipelining
    [Diagram: a client and a server exchanging requests 1, 2 and 3, shown once without and once with request pipelining.]
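
A rough way to see why pipelining helps is to compare total wall-clock time under an idealized model. Everything here is an assumption for illustration, not a measurement: each request costs one full round trip, and with pipelining the client writes requests back to back, a fixed gap apart, without waiting for earlier responses. The function names are hypothetical.

```php
<?php
// Without pipelining, each request waits for the previous response.
function totalWithoutPipelining(int $requests, float $roundTripMs): float
{
    return $requests * $roundTripMs;
}

// With pipelining, requests go out $sendGapMs apart and their round trips
// overlap, so only the last request's round trip is fully waited for.
function totalWithPipelining(int $requests, float $roundTripMs, float $sendGapMs): float
{
    if ($requests === 0) {
        return 0.0;
    }
    return $roundTripMs + ($requests - 1) * $sendGapMs;
}

// Three requests, 10 ms round trip, 1 ms between sends (made-up numbers).
printf("without pipelining: %.1f ms\n", totalWithoutPipelining(3, 10.0));
printf("with pipelining:    %.1f ms\n", totalWithPipelining(3, 10.0, 1.0));
```

Under these assumptions the three requests take 30 ms one after another and 12 ms when pipelined.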

  31. Parallel execution
    $data = array(
        array(41, 'Sam'),
        array(35, 'Bob')
    );
    $session = $cluster->connect("mykeyspace");
    $statement = $session->prepare("UPDATE users SET age = ? WHERE user_name = ?");
    $futures = array();
    // execute all statements in background
    foreach ($data as $arguments) {
        // positional arguments are bound to the ? markers in order
        $options = new Cassandra\ExecutionOptions(array('arguments' => $arguments));
        $futures[] = $session->executeAsync($statement, $options);
    }
    // wait for all statements to complete
    foreach ($futures as $future) {
        // we will not wait for each result for more than 5 seconds
        $future->get(5);
    }

  35. Load Balancing
    Routing and Failover

  36. Load Balancing
    [Diagram: three application threads, one session, a load balancing policy, and four pools for the four nodes of the cluster.]

  39. DataCenter Aware Balancing
    Local nodes are queried first; if none are available, the request could be sent to a remote node.
    [Diagram: two datacenters, six nodes, and six clients.]
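
The "local first, remote as fallback" rule can be sketched as a function that orders hosts for a request. This is an illustration only: the host records (`name`, `dc`, `up`) and the function `queryPlan` are hypothetical and do not reflect how the driver represents hosts internally.

```php
<?php
// Builds the order in which hosts would be tried: live hosts in the local
// datacenter first, then live hosts in remote datacenters as a fallback.
// Each host is a hypothetical record: ['name' => ..., 'dc' => ..., 'up' => bool].
function queryPlan(array $hosts, string $localDc): array
{
    $local = [];
    $remote = [];
    foreach ($hosts as $host) {
        if (!$host['up']) {
            continue; // skip hosts known to be down
        }
        if ($host['dc'] === $localDc) {
            $local[] = $host['name'];
        } else {
            $remote[] = $host['name'];
        }
    }
    return array_merge($local, $remote);
}

$hosts = [
    ['name' => 'a1', 'dc' => 'dc1', 'up' => false],
    ['name' => 'b1', 'dc' => 'dc2', 'up' => true],
    ['name' => 'a2', 'dc' => 'dc1', 'up' => true],
];
print_r(queryPlan($hosts, 'dc1')); // a2 first, then b1
```

If every host in `dc1` were down, the plan would contain only remote hosts, which matches the fallback described on the slide.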

  40. Token Aware Balancing
    Route requests directly to replicas.
    Uses prepared statement metadata to get the token.
    [Diagram: a client, three replicas, and three other nodes.]

  41. Latency Aware Balancing
    Route requests to the fastest nodes.
    Tracks response times for each node.
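
As a rough sketch of "track response times, prefer the fastest node", the class below keeps an average latency per host. The class name `LatencyTracker` is hypothetical, and the plain running average is an assumption made for brevity; a real policy would likely weight recent samples more heavily.

```php
<?php
// Minimal per-host latency tracker (illustrative, not the driver's policy).
final class LatencyTracker
{
    private array $totals = [];
    private array $counts = [];

    // Records one observed response time for a host.
    public function record(string $host, float $millis): void
    {
        $this->totals[$host] = ($this->totals[$host] ?? 0.0) + $millis;
        $this->counts[$host] = ($this->counts[$host] ?? 0) + 1;
    }

    // Returns the host with the lowest average latency, or null if none recorded.
    public function fastest(): ?string
    {
        $best = null;
        $bestAvg = INF;
        foreach ($this->totals as $host => $total) {
            $avg = $total / $this->counts[$host];
            if ($avg < $bestAvg) {
                $best = (string) $host;
                $bestAvg = $avg;
            }
        }
        return $best;
    }
}

$tracker = new LatencyTracker();
$tracker->record('node1', 10.0);
$tracker->record('node1', 30.0);
$tracker->record('node2', 15.0);
echo $tracker->fastest(), "\n"; // node2 (average 15 ms vs. 20 ms for node1)
```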

  42. Fault Tolerance
    Failures and Error Handling

  43. Fault Tolerance
    [Diagram: an application (business logic and driver), a coordinator, and a cluster in which three of the nodes are replicas.]

  44. Possible Failures
    • Invalid Requests
    • Network Timeouts
    • Server Errors
    [Diagram: an application (business logic and driver), a coordinator, and a cluster in which three of the nodes are replicas.]

  45. Automatic Retry of Server Errors
    [Diagram: three application threads, one session, a load balancing policy, and four pools for the four nodes of the cluster.]
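
One way to picture automatic retry is a loop over the hosts chosen by the load balancing policy: if one host fails with a server error, the next one is tried. The sketch below is an assumption about the general shape of such logic, not the driver's implementation; `executeWithFailover` and the `$send` callable are hypothetical.

```php
<?php
// Tries each host in the plan until one succeeds.
// $send is a hypothetical callable that delivers the request to one host
// and throws RuntimeException on a server error.
function executeWithFailover(array $plan, callable $send)
{
    $lastError = null;
    foreach ($plan as $host) {
        try {
            return $send($host);
        } catch (RuntimeException $e) {
            $lastError = $e; // remember the failure and try the next host
        }
    }
    throw new RuntimeException('All hosts in the query plan failed', 0, $lastError);
}

$result = executeWithFailover(['node1', 'node2'], function (string $host) {
    if ($host === 'node1') {
        throw new RuntimeException("$host: server error");
    }
    return "handled by $host";
});
echo $result, "\n"; // handled by node2
```

Blindly re-sending is only safe for requests that can be applied more than once, so treat this as a picture of the control flow rather than a retry policy.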

  48. Unreachable Consistency
    [Diagram: an application (business logic and driver), a coordinator, and a cluster in which three of the nodes are replicas.]

  49. Read / Write Timeout Error
    [Diagram: an application (business logic and driver), a coordinator, and a cluster in which three of the nodes are replicas.]

  51. Read / Write Timeout Error
    [Diagram: the same application, coordinator, and replicas, now with a "read / write timeout" label.]

  52. Unavailable Error
    [Diagram: an application (business logic and driver), a coordinator, and a cluster in which three of the nodes are replicas.]

  53. Unavailable Error
    [Diagram: the same application, coordinator, and replicas, now with an "unavailable" label.]
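
The difference between the two errors on the preceding slides can be summarized in a small decision function. This is a simplified sketch with hypothetical names: an unavailable error means the coordinator already knows too few replicas are alive and rejects the request up front, while a timeout means the request was sent but too few replicas answered in time.

```php
<?php
// $required:  replicas the consistency level asks for (2 for QUORUM at RF = 3)
// $alive:     replicas the coordinator believes are up
// $responded: replicas that answered before the timeout
function requestOutcome(int $required, int $alive, int $responded): string
{
    if ($alive < $required) {
        return 'unavailable'; // rejected up front, nothing is sent to replicas
    }
    if ($responded < $required) {
        return 'timeout';     // sent, but too few replicas answered in time
    }
    return 'success';
}

// RF = 3, CL = QUORUM, so two replicas are required.
echo requestOutcome(2, 1, 0), "\n"; // unavailable
echo requestOutcome(2, 3, 1), "\n"; // timeout
echo requestOutcome(2, 3, 2), "\n"; // success
```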

  54. Other
    • Persistent Sessions
    • Security
    • Datatypes