Event Stream Processing In PHP

Ian Barber
February 22, 2013

From #phpuk13

Transcript

  1. EVENT STREAM PROCESSING IN PHP

  2. IAN BARBER
    https://github.com/ianbarber/eep-php
    https://joind.in/8040

  3. http://goo.gl/7IHBZ
    Blatant Plug: If you’re interested
    in this topic, try the (rather
    short) free ebook!

  4. PREDICTION
    We have many tools for analysing data
    and making predictions in buckets
    (RDBMSs, Hadoop, BigQuery)

  5. PREDICTION
    But can we do the same thing with
    data-in-motion, analysing data as it
    is created?

  6. EVENT
    (A, B, C)
    We can process data in the form of events: programmatic
    representations of occurrences in the real world or other
    systems. For our purposes they’re just a tuple, or a list.

  7. EVENT STREAM
    (A, B, C) (A, B, C) (A, B, C)
    XOM 88.52
    XOM 88.56
    XOM 88.57
    An event stream is a series of events of (usually) the same
    type - for example a series of tick prices for a stock.
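
A minimal sketch of the idea above in plain PHP, assuming each event is a two-element tuple and the stream is modelled as a generator. The `tickStream` helper is hypothetical and is not part of eep-php.

```php
<?php
// Hypothetical helper: wrap a list of tick prices as a stream of event tuples.
function tickStream(string $symbol, array $prices): Generator
{
    foreach ($prices as $price) {
        // Each event is just a tuple: (symbol, price).
        yield [$symbol, $price];
    }
}

// Consume the stream one event at a time, as the events "arrive".
foreach (tickStream('XOM', [88.52, 88.56, 88.57]) as [$symbol, $price]) {
    printf("%s %.2f\n", $symbol, $price);
}
```

A generator fits here because it never needs the whole stream in memory, which matters once the stream is unbounded.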

  8. EVENT CLOUD
    (A, B) (A, B) (A, B)
    FTSE 6318
    FTSE 6320
    FTSE 6321
    (C, D) (C, D) (C, D)
    An event cloud is a set of related, but not necessarily
    directly interactive streams. There is usually a non-trivially
    complex relationship between them - e.g. a stock market
    index.

  9. EEP.js
    https://github.com/darach/eep-js
    embedding event processing
    Darach Ennis took the evented, callback based Node.js
    and built the basic structures needed to do event stream
    processing. By focusing on the simplest, most common
    tools, it can be very, very fast!

  10. STREAM OPERATIONS
    We just need a handful of tools to do some really interesting
    things...

  11. STREAM OPERATIONS
    mem
    map
    pipe
    window
    aggregate

  12. http://reactphp.org

  13. EVENT FUNCTIONS
    map, pipe
    In PHP, we can use React to give us flexible, unix style
    pipes, and we have all the power of a general purpose
    language to map values from one form to another.
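
As a rough illustration of map and pipe without any library, the sketch below uses plain PHP generators rather than React streams. The `mapStream` helper is a hypothetical name introduced here; piping is simply one mapped stream feeding the next.

```php
<?php
// Hypothetical helper: lazily apply $fn to every event in a stream.
function mapStream(iterable $events, callable $fn): Generator
{
    foreach ($events as $event) {
        yield $fn($event);
    }
}

// "Pipe" two mapping stages together: trim each line, then upper-case it.
$source  = [" xom 88.52 \n", " xom 88.56 \n"];
$trimmed = mapStream($source, 'trim');
$upper   = mapStream($trimmed, 'strtoupper');

$result = iterator_to_array($upper, false);
print_r($result);
```

Nothing is computed until the final stage is iterated, so each event flows through the whole chain before the next one is read.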

  14. react.php
    require "vendor/autoload.php";

    $loop = React\EventLoop\Factory::create();
    $socket = new React\Socket\Server($loop);

    // Transform stream: upper-cases everything that passes through it.
    class UpperStream extends React\Stream\ThroughStream {
        public function filter($data) {
            return strtoupper($data);
        }
    }

    $filter = new UpperStream();
    $socket->on('connection', function($conn) use ($filter) {
        // Pipe the connection through the filter and back to the client.
        $conn->pipe($filter)->pipe($conn);
    });
    $socket->listen(4000);
    $loop->run();

    With React we can create an event loop, and pipe the connection
    just like a unix pipe. We can use ThroughStream pipes to build
    mapping or transform functions.

  15. WINDOWS
    1 2 3 4 5 6 7 8 9 8 7 6
    Windows allow us to frame a calculation
    across a (potentially infinite!) stream.

  16. TUMBLING WINDOW
    1 2 3 4 5 6 7 8 9 8 7 6
    [1 2 3 4] emit
    [5 6 7 8] emit
    [9 8 7 6] emit
    Tumbling windows accept items up to their size, then emit the
    calculation and open a new window.
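
The tumbling behaviour described above can be sketched in a few lines of plain PHP. This is an illustrative stand-in, not the eep-php implementation: the `TumblingSum` class and its callback-based emit are assumptions made for the example, and the aggregate is hard-wired to a sum.

```php
<?php
// Hypothetical class for illustration; not the eep-php implementation.
final class TumblingSum
{
    private int $size;
    private array $buffer = [];
    /** @var callable */
    private $onEmit;

    public function __construct(int $size, callable $onEmit)
    {
        $this->size = $size;
        $this->onEmit = $onEmit;
    }

    public function enqueue($value): void
    {
        $this->buffer[] = $value;
        if (count($this->buffer) === $this->size) {
            // Window is full: emit the aggregate and open a fresh window.
            ($this->onEmit)(array_sum($this->buffer));
            $this->buffer = [];
        }
    }
}

$sums = [];
$win = new TumblingSum(4, function ($sum) use (&$sums) { $sums[] = $sum; });
foreach ([1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 7, 6] as $v) {
    $win->enqueue($v);
}
print_r($sums); // one sum per window of four: 10, 26, 30
```

Each event belongs to exactly one window, so twelve events with a window size of four produce three emits.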

  17. servermon.php
    require __DIR__.'/../vendor/autoload.php';

    echo "Monitor combined variables\n";

    $all_fn = new React\EEP\Stats\All;
    $all_win = new React\EEP\Window\Tumbling($all_fn, 100);
    $all_win->on('emit', function($vals) {
        if($vals['stdevs']>92 && $vals['mean']>280 &&
           $vals['max']>400) {
            printf("Stddev %.2fms Average %dms Max %dms - ALERT\n",
                $vals['stdevs'], $vals['mean'], $vals['max']);
        }
    });

    $start = microtime(true);
    for($i = 0; $i < 50000; $i++) {
        $var = 275 + rand(-150, 150);
        $all_win->enqueue($var);
    }

    ‘All’ is our aggregate stats function. When the tumbling window
    closes, it emits a value which we listen for.

  18. PERIODIC
    5 MINUTE WALL CLOCK
    1    2    3    4    5    6    7    8    9    8    7    6
    3:40 3:40 3:41 3:42 3:45 3:48 3:49 3:50 3:52 3:53 3:53 3:54
    emit emit
    Periodic windows have a variable number of events per window
    (maybe 0), because they open and close based on time.
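
A minimal sketch of a periodic window, assuming the current time is passed in explicitly so the example is deterministic. The `PeriodicCount` class and the `tick($nowMs)` signature are hypothetical and differ from the eep-php API, which reads the clock itself.

```php
<?php
// Hypothetical class; time is injected so the sketch needs no real clock.
final class PeriodicCount
{
    private int $periodMs;
    private ?int $windowStart = null;
    private int $count = 0;
    /** @var callable */
    private $onEmit;

    public function __construct(int $periodMs, callable $onEmit)
    {
        $this->periodMs = $periodMs;
        $this->onEmit = $onEmit;
    }

    public function enqueue($value): void
    {
        $this->count++;
    }

    // Called regularly with the current time in milliseconds.
    public function tick(int $nowMs): void
    {
        if ($this->windowStart === null) {
            $this->windowStart = $nowMs; // first tick opens the first window
            return;
        }
        if ($nowMs - $this->windowStart >= $this->periodMs) {
            ($this->onEmit)($this->count); // close the window, even if empty
            $this->count = 0;
            $this->windowStart = $nowMs;
        }
    }
}

$counts = [];
$win = new PeriodicCount(5000, function ($c) use (&$counts) { $counts[] = $c; });
$win->tick(0);
foreach ([1000, 2000, 3000] as $t) {
    $win->enqueue('event');
    $win->tick($t);
}
$win->tick(5000);  // first window closes with three events
$win->tick(10000); // second window closes with none
print_r($counts);
```

The second emit carries a count of zero, which is the case a count-based window can never report.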

  19. lowrate.php
    require __DIR__.'/../vendor/autoload.php';

    echo "Simple low rate detector\n";

    $count_fn = new React\EEP\Stats\Count;
    $win = new React\EEP\Window\Periodic($count_fn, 1000 * 5);
    $win->on('emit', function($count) {
        if($count < 50) {
            echo "Alert - Low Rate! - $count\n";
        } else {
            echo "$count :)\n";
        }
    });

    while(true) {
        $win->enqueue(array(300, 4, 5, 10));
        $win->tick();
        usleep(100000 + rand(-20000, 20000));
    }

    This example triggers an alert if the count per 5 seconds is less
    than 50 - this could be a count of orders or sign ups on a site.

  20. SLIDING WINDOW
    1 2 3 4 5 6 7 8 9 8 7 6
    [1 2 3 4] emit
    [2 3 4 5] emit
    [3 4 5 6] emit
    [4 5 6 7] emit
    [5 6 7 8] emit
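
A sliding window can be sketched along the same lines. The `SlidingMean` class below is hypothetical, and it assumes (as the diagram suggests) that nothing is emitted until the window has filled; after that, every new event pushes out the oldest one and triggers an emit.

```php
<?php
// Hypothetical class for illustration; not the eep-php implementation.
final class SlidingMean
{
    private int $size;
    private array $buffer = [];
    /** @var callable */
    private $onEmit;

    public function __construct(int $size, callable $onEmit)
    {
        $this->size = $size;
        $this->onEmit = $onEmit;
    }

    public function enqueue($value): void
    {
        $this->buffer[] = $value;
        if (count($this->buffer) > $this->size) {
            array_shift($this->buffer); // drop the oldest event
        }
        if (count($this->buffer) === $this->size) {
            ($this->onEmit)(array_sum($this->buffer) / $this->size);
        }
    }
}

$means = [];
$win = new SlidingMean(4, function ($mean) use (&$means) { $means[] = $mean; });
foreach ([1, 2, 3, 4, 5, 6, 7, 8] as $v) {
    $win->enqueue($v);
}
print_r($means); // five overlapping windows: 2.5, 3.5, 4.5, 5.5, 6.5
```

Recomputing the sum on every event is fine for a sketch; a real implementation would more likely update a running total incrementally.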

  21. STREAM FUNCTIONS
    mem aggregate
    We don’t just have to use the basic
    statistical functions, but can create
    custom functions to handle other cases.

  22. tick.php
    class TickDetector implements Aggregator {
        private $state, $data, $time;

        public function init() {
            $this->state = 0; $this->data = array();
            $this->time = null;
        }

        public function accumulate($v) {
            switch($this->state) {
                case 0: // Initial state. Always accept the value.
                    $this->data[0] = $v->value; $this->time = $v->at;
                    $this->state = 1;
                    break;
                case 1: // Look for a drop.
                    if($v->value < $this->data[0]) {
                        $this->data[1] = $v->value;
                        $this->state = 2;
                    } else {
                        $this->reset($v);
                    }
                    ...

    Patterns can be matched using a finite state machine - this one
    uses a switch statement and state tracker variable.
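
Since the slide's code is cut off, here is a separate, self-contained sketch of the same switch-based technique. It is not a reconstruction of the elided part of `TickDetector`: the `DipDetector` class, its three states, and the "drop then rise" pattern are all assumptions chosen for illustration.

```php
<?php
// Hypothetical detector: reports a "dip" when a value drops and then rises.
final class DipDetector
{
    private int $state = 0;
    private array $data = [];

    // Returns the matched [start, low, recovery] triple, or null if no match.
    public function accumulate(float $value): ?array
    {
        switch ($this->state) {
            case 0: // Initial state: always accept the value.
                $this->data = [$value];
                $this->state = 1;
                break;
            case 1: // Look for a drop.
                if ($value < $this->data[0]) {
                    $this->data[1] = $value;
                    $this->state = 2;
                } else {
                    $this->data = [$value]; // no drop: restart from here
                }
                break;
            case 2: // Look for a rise.
                if ($value > $this->data[1]) {
                    $match = [$this->data[0], $this->data[1], $value];
                    $this->data = [$value];
                    $this->state = 1;
                    return $match;
                }
                $this->data[1] = $value; // still falling: track the new low
                break;
        }
        return null;
    }
}

$matches = [];
$detector = new DipDetector();
foreach ([10.0, 11.0, 9.0, 8.0, 12.0, 13.0] as $price) {
    $m = $detector->accumulate($price);
    if ($m !== null) {
        $matches[] = $m;
    }
}
print_r($matches);
```

With this input the detector reports a single dip from 11.0 down to 8.0 and back up to 12.0.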

  23. widefinder.php
    class WideFinderFunction implements Aggregator {
        private $re, $keys;

        public function __construct($regex) {
            $this->re = $regex;
        }

        public function init() {
            $this->keys = array();
        }

        public function accumulate($line) {
            preg_match($this->re, $line, $matches);
            if(count($matches) == 0) { return; }
            $key = $matches[1];
            if(!isset($this->keys[$key])) {
                $this->keys[$key] = 1;
            } else {
                $this->keys[$key]++;
            }
        }
    }

    We can process text data - WideFinder updates a hash table of URLs
    as log events arrive and emits the count for each page.
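
To make the counting step concrete, the sketch below applies the same regex-and-hash-table idea to a few made-up log lines. The `countPaths` function, the sample lines, and the pattern are all hypothetical; the slide does not show which regex the original example uses.

```php
<?php
// Hypothetical helper: count the first capture group across matching lines.
function countPaths(iterable $lines, string $regex): array
{
    $counts = [];
    foreach ($lines as $line) {
        if (preg_match($regex, $line, $m) !== 1) {
            continue; // skip lines that do not look like a request
        }
        $counts[$m[1]] = ($counts[$m[1]] ?? 0) + 1;
    }
    return $counts;
}

// Made-up access log lines for illustration.
$lines = [
    '127.0.0.1 - - [22/Feb/2013:10:00:00 +0000] "GET /index.php HTTP/1.1" 200 512',
    '127.0.0.1 - - [22/Feb/2013:10:00:01 +0000] "GET /about.php HTTP/1.1" 200 128',
    '127.0.0.1 - - [22/Feb/2013:10:00:02 +0000] "GET /index.php HTTP/1.1" 200 512',
    'malformed line',
];

$counts = countPaths($lines, '#"GET (\S+) HTTP#');
print_r($counts);
```

Inside a window, the same table would be reset on each `init()` so the counts describe only the events in that window.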

  24. tuplejoin.php (1)
    $fn = new Semijoin();
    $a_win = new React\EEP\Window\Sliding($fn, 100);
    $b_win = new React\EEP\Window\Sliding($fn, 100);

    $counter = 0;
    $match = function($match) use (&$counter) {
        if($match) {
            $counter++;
            printf("%s \n", $match[1]);
        }
    };
    $a_win->on("emit", $match);
    $b_win->on("emit", $match);

    We can look for events between streams using a semijoin - which
    just checks there is a matching value in the window for the other
    stream. This is a custom aggregate function shared between two
    sliding windows.

  25. tuplejoin.php (2)
    class Semijoin implements Aggregator {
        private $stream, $last_value;

        public function __construct() {
            $this->stream = array();
        }

        public function init() {
            $this->stream[0] = array(); $this->stream[1] = array();
            $this->last_value = null;
        }

        public function accumulate($v) {
            $this->last_value = null;
            $key = $v->value[0];
            if(!isset($this->stream[$v->stream][$key])) {
                $this->stream[$v->stream][$key] = 0;
            }
            $this->stream[$v->stream][$key] += 1;
            if(isset($this->stream[$v->stream ^ 1][$key])) {
                $this->last_value = $v->value;
            }
        }
    }

    The join aggregate uses a stream ID on each event, and a hashtable
    of keys.
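
The core of the semijoin is the `^ 1` trick: with stream IDs 0 and 1, XOR-ing with 1 selects the other stream's table. The stripped-down sketch below shows only that lookup. The `SimpleSemijoin` class is hypothetical and, unlike the windowed version, never evicts old keys.

```php
<?php
// Hypothetical, window-less semijoin: remembers every key seen per stream.
final class SimpleSemijoin
{
    private array $seen = [[], []];

    // Returns true when $key has already been seen on the other stream.
    public function accumulate(int $stream, string $key): bool
    {
        $this->seen[$stream][$key] = true;
        return isset($this->seen[$stream ^ 1][$key]); // 0 ^ 1 = 1, 1 ^ 1 = 0
    }
}

$join = new SimpleSemijoin();
$events = [[0, 'alice'], [1, 'bob'], [1, 'alice'], [0, 'carol'], [0, 'bob']];

$matched = [];
foreach ($events as [$stream, $key]) {
    if ($join->accumulate($stream, $key)) {
        $matched[] = $key;
    }
}
print_r($matched);
```

Only 'alice' and 'bob' match, each at the moment the second stream catches up with a key the first has already seen.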

  26. leader.php (1)
    private function correlate($xo, $x_stats, $yo, $y_stats) {
        $acc = 0;
        for($i = 0; $i < $this->sub; $i++) {
            $acc += ($this->buffers[0][($xo + $i) % $this->size]
                        - $x_stats['mean']) *
                    ($this->buffers[1][($yo + $i) % $this->size]
                        - $y_stats['mean']);
        }
        $acc /= $this->sub;
        $stddevs = ($x_stats['stdevs'] * $y_stats['stdevs']);
        return ($stddevs == 0 ? 0 : $acc / $stddevs);
    }

    We can even do complex interactions. This example attempts to
    correlate between several windows over a pair of streams. We run
    the function above against each pair to see if one stream predicts
    the movement of the other.
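
Written out as a formula, the `correlate` function computes the following, where $n$ is `$this->sub`, $N$ is `$this->size`, and $o_x$, $o_y$ are the offsets `$xo` and `$yo` into the two circular buffers. The symbol names are introduced here for readability and do not appear in the source.

```latex
r \;=\; \frac{\dfrac{1}{n}\displaystyle\sum_{i=0}^{n-1}
      \bigl(x_{(o_x+i) \bmod N} - \bar{x}\bigr)
      \bigl(y_{(o_y+i) \bmod N} - \bar{y}\bigr)}
     {\sigma_x \, \sigma_y},
\qquad r = 0 \ \text{if}\ \sigma_x \sigma_y = 0 .
```

This has the shape of a Pearson correlation coefficient. It only equals one if $\bar{x}$, $\bar{y}$, $\sigma_x$ and $\sigma_y$ are population statistics over the same $n$ samples being summed, which is an assumption about what `$x_stats` and `$y_stats` contain.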

  27. leader.php (2)
    $a = array(1, 1, 1, 3, 3, 3, 5, 5, 5, 7, 7, 7, 9, 9, 9, 11, 11, 11);
    $b = array(1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 5, 5, 5, 7, 7, 7);
    for($i = 0; $i < count($a); $i++) {
        $win->enqueue(new React\EEP\Event\Muxed($a[$i], 0));
        $win->enqueue(new React\EEP\Event\Muxed($b[$i], 1));
    }

    $ php examples/leader.php
    Stream 1 leads stream 0 by 9 ticks with a -6.68 correlation
    Stream 0 leads stream 1 by 3 ticks with a 8.82 correlation
    Stream 0 leads stream 1 by 3 ticks with a 7.21 correlation
    Stream 0 leads stream 1 by 3 ticks with a 5.84 correlation
    Stream 0 leads stream 1 by 3 ticks with a 4.57 correlation
    Stream 0 leads stream 1 by 3 ticks with a 4.28 correlation
    Stream 0 leads stream 1 by 3 ticks with a 3.75 correlation
    Stream 0 leads stream 1 by 3 ticks with a 4.32 correlation

  28. STREAM OPERATIONS
    With just some simple examples, we can handle a wide
    range of use cases. These types of windows and functions
    can be implemented in any language or framework!

  29. THANK YOU
    https://github.com/ianbarber/eep-php
    embedding event processing
    https://github.com/darach/eep-js
    https://joind.in/8040
    http://profiles.google.com/ianbarber
    http://twitter.com/ianbarber
