Stream processing made easy with riko

Interactive workshop on streams and stream processing with the Python library riko

Reuben Cummings

September 29, 2016

Transcript

  1. Stream processing made easy with riko
    DevCraft - Nairobi, KE
    Sep 29, 2016
    by Reuben Cummings
    @reubano #DevCraftKE

  2. Who am I?
    Managing Director, Nerevu Development
    Lead organizer of Arusha Coders
    Author of several popular Python packages (riko, meza, pygogo)

  3. Topics & Format
    data, streams, and stream processing
    code samples and interactive exercises
    hands-on (don't be a spectator)

  4. what is data?

  5. Organization
    structured:
    country   capital
    Kenya     Nairobi
    Tanzania  Dodoma
    Rwanda    Kigali
    unstructured:
    "O God of all creation. Bless this our land and nation.
    Justice be our shield..."

  6. Storage: binary (hex dump)
    00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
    00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
    0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408
    00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
    00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
    0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408
    hexadecimal number

  7. Storage: binary (hex dump)
    (hex dump repeated from slide 6)
    1 byte

  8. Storage: binary (hex dump)
    (hex dump repeated from slide 6)
    8 bits

  9. Storage: binary (hex dump)
    (hex dump repeated from slide 6)
    2^8 = 256

  10. Storage: binary (hex dump)
    (hex dump repeated from slide 6)
    0 - 255
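
The byte arithmetic on the Storage slides can be checked in Python. This is only a minimal sketch: the sample word `b0e6` is copied from the hex dump, and the byte order inside the original file is not considered.

```python
# One 4-digit hex word from the dump holds two bytes (two hex digits each).
word = bytes.fromhex('b0e6')

# A byte is 8 bits, so it can take 2 ** 8 distinct values: 0 through 255.
values_per_byte = 2 ** 8

# Iterating over a bytes object yields each byte as an integer.
as_ints = list(word)

# Show each byte as 8 binary digits next to its 2 hexadecimal digits.
for byte in word:
    print(format(byte, '08b'), format(byte, '02x'))
```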

  11. Storage: flat/text
    greeting,loc,rating
    hello,world,3
    good bye,moon,7
    welcome,stars,5
    what's up,sky,2

  12. Organization vs Storage
    Storage: binary, flat/text
    Organization: structured, unstructured
    maasai mara
    hell's gate

  13. [
    {
    "greeting": "hello",
    "location": "world",
    "enthusiasm": 3
    }, {
    "greeting": "good bye",
    "location": "moon",
    "enthusiasm": 7
    }
    ]
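
One way to get from flat text to a list of records like the one above is the standard library's `csv` and `json` modules. A minimal sketch, assuming the comma-separated text from the flat/text Storage slide as input (its column names, `loc` and `rating`, differ from the keys shown on this slide and are kept as they are):

```python
import csv
import io
import json

# The first rows of the flat/text data from the Storage slide.
text = """greeting,loc,rating
hello,world,3
good bye,moon,7
"""

# DictReader uses the first row as field names and yields one dict per row.
reader = csv.DictReader(io.StringIO(text))

# CSV values are always strings, so convert the rating to an integer.
records = [dict(row, rating=int(row['rating'])) for row in reader]

print(json.dumps(records, indent=2))
```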

  14. what are streams?

  15. >>> stream = 'abracadabra'
    >>> stream[0]
    'a'
    >>> stream = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    >>> stream[0]
    1
    >>> stream = ['hello', 'devcraft', 'attendees']
    >>> stream[0]
    'hello'
    >>> stream = [
    ... {'num': 0}, {'num': 1}, {'num': 2}]
    >>> stream[0]
    {'num': 0}

  16. how do you construct streams?

  17. >>> stream = input('--> ')

  18. >>> stream = input('--> ')
    -->

  19. >>> stream = input('--> ')
    --> abracadabra

  20. >>> stream = input('--> ')
    --> abracadabra
    >>> s = 'hello devcraft attendees'
    >>> s.split(' ')
    ['hello', 'devcraft', 'attendees']
    >>> list(range(1, 11))
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    >>> stream
    'abracadabra'
    >>> [{'num': x} for x in range(4)]
    [{'num': 0}, {'num': 1}, {'num': 2}, {'num': 3}]
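
Streams can also be built from a source that is read one item at a time rather than typed in or generated up front. A minimal sketch, assuming a file-like object as the source; `read_items` is a hypothetical helper name, and `io.StringIO` stands in for a real file or network connection.

```python
import io

# A file-like object standing in for a real file or network connection.
source = io.StringIO('hello\ndevcraft\nattendees\n')


def read_items(lines):
    """Yield one stripped item per line, reading only as items are requested."""
    for line in lines:
        yield line.strip()


stream = read_items(source)

# Nothing is read from the source until an item is requested.
first = next(stream)

# Collect whatever is left in the stream.
rest = list(stream)
print(first, rest)
```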

  21. how do you process streams?

  22. >>> ints = range(1, 10)
    >>> doubled = [2 * x for x in ints]
    >>> doubled
    [2, 4, 6, 8, 10, 12, 14, 16, 18]
    >>> big = [x for x in doubled if x > 10]
    >>> big
    [12, 14, 16, 18]
    >>> [x / 3 for x in big]
    [4.0, 4.666666666666667, 5.333333333333333, 6.0]
    >>> (x / 3 for x in big)
    <generator object <genexpr> at 0x103c10830>
    >>> next(x / 3 for x in big)
    4.0
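
The same double, filter, divide steps can be chained lazily so that nothing is computed until a result is requested. A minimal sketch using only the standard library; the unbounded `count(1)` source is an assumption added here to show that the whole stream never has to be held in memory.

```python
from itertools import count, islice

# An unbounded source: 1, 2, 3, ... (it could never be stored in a list).
ints = count(1)

# Each stage is a generator expression, so no value is computed until requested.
doubled = (2 * x for x in ints)
big = (x for x in doubled if x > 10)
thirds = (x / 3 for x in big)

# Pull only the first four results through the pipeline.
result = list(islice(thirds, 4))
print(result)
```

Because every stage is lazy, `thirds` can keep producing values after the first four have been taken.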

  23. RSS feeds (feedly)

  24. aggregators (kayak)

  25. mashups (portwiture)

  26. introducing riko
    github.com/nerevu/riko

  27. let's get some data

  28. Kenya Open Data (opendata.go.ke)

  29. IPython Demo
    bit.ly/riko-demo
    (examples)

  30. IPython Demo
    bit.ly/riko-demo
    (exercises)

  31. number of schools per district

  32. [
    {'BUTERE/MUMIAS': 1},
    {'HOMA BAY': 1},
    {'KIAMBU': 1},
    {'MACHAKOS': 1},
    {'MAKUENI': 1},
    {'MARAGUA': 2},
    {'MBEERE': 1},
    {'MOMBASA': 2},
    {'NAIROBI': 5},
    {'TRANS NZOIA': 1}
    ]
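
A result of this shape can be produced in plain Python with `collections.Counter`. This is a sketch of the counting step only, not the workshop's riko solution; the records and the field names `name` and `district` are hypothetical, and the real dataset may be structured differently.

```python
from collections import Counter

# Hypothetical records; the real dataset's field names may differ.
schools = [
    {'name': 'School A', 'district': 'NAIROBI'},
    {'name': 'School B', 'district': 'MOMBASA'},
    {'name': 'School C', 'district': 'NAIROBI'},
    {'name': 'School D', 'district': 'KIAMBU'},
]

# Count how many records fall in each district.
counts = Counter(school['district'] for school in schools)

# Emit one single-key dict per district, sorted by district name.
result = [{district: n} for district, n in sorted(counts.items())]
print(result)
```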

  33. boarding only students per division

  34. [
    {'ASEGO': Decimal('277')},
    {'BUTERE': Decimal('224')},
    {'DAGORETTI': Decimal('903')},
    {'EMBAKASI': Decimal('138')},
    {'ISLAND': Decimal('14')},
    {'KANDARA': Decimal('74')},
    {'KASIKEU': Decimal('20')},
    {'KIBERA': Decimal('355')},
    {'KIKUYU': Decimal('69')},
    {'KISAUNI': Decimal('424')},
    ...
    ]
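
Totals of this shape combine a filter step with a grouped sum. A plain-Python sketch, not the workshop's riko solution: the records, the field names (`division`, `type`, `students`), and the reading of "boarding only" as a school type are all assumptions made for illustration.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical records; field names and values are illustrative only.
schools = [
    {'division': 'KIBERA', 'type': 'BOARDING ONLY', 'students': '200'},
    {'division': 'KIBERA', 'type': 'DAY ONLY', 'students': '500'},
    {'division': 'KIBERA', 'type': 'BOARDING ONLY', 'students': '155'},
    {'division': 'ISLAND', 'type': 'BOARDING ONLY', 'students': '14'},
]

# Keep only the boarding-only schools (a lazy filter step).
boarding = (s for s in schools if s['type'] == 'BOARDING ONLY')

# Add up the student counts per division; Decimal() starts each total at 0.
totals = defaultdict(Decimal)
for school in boarding:
    totals[school['division']] += Decimal(school['students'])

# Emit one single-key dict per division, sorted by division name.
result = [{division: total} for division, total in sorted(totals.items())]
print(result)
```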

  35. create a stream process with the "joining" example

  36. github.com/reubano/devcraft-workshop

  37. github.com/reubano/riko

  38. Reuben Cummings
    [email protected]
    https://reubano.github.io
    Thanks!
    @reubano #DevCraftKE
