Slide 1

Slide 1 text

Stream processing made easy with riko
DevCraft - Nairobi, KE
Sep 29, 2016
by Reuben Cummings
@reubano #DevCraftKE

Slide 2

Slide 2 text

Who am I?
Managing Director, Nerevu Development
Lead organizer of Arusha Coders
Author of several popular Python packages (riko, meza, pygogo)

Slide 3

Slide 3 text

Topics & Format
data, streams, and stream processing
code samples and interactive exercises
hands-on (don't be a spectator)

Slide 4

Slide 4 text

what is data?

Slide 5

Slide 5 text

Organization

structured:
country    capital
Kenya      Nairobi
Tanzania   Dodoma
Rwanda     Kigali

unstructured:
"O God of all creation. Bless this our land and nation. Justice be our shield..."

Slide 6

Slide 6 text

Storage: binary (hex dump)

00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408

Annotation: hexadecimal number

Slide 7

Slide 7 text

Storage: binary (hex dump)

00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408

Annotation: 1 byte

Slide 8

Slide 8 text

Storage: binary (hex dump)

00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408

Annotation: 8 bits

Slide 9

Slide 9 text

Storage: binary (hex dump)

00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408

Annotation: 2^8 = 256

Slide 10

Slide 10 text

Storage: binary (hex dump)

00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408

Annotation: 0 - 255
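The byte arithmetic on the last few slides (1 byte, 8 bits, 2^8 = 256, values 0 - 255) can be checked in a few lines of Python. This is only a sketch: it reads the first four hex pairs of the dump literally, and the slides do not say whether the dump tool swapped byte order within each 16-bit word.

```python
# One byte is 8 bits, so it can hold 2**8 distinct values.
BITS_PER_BYTE = 8
distinct_values = 2 ** BITS_PER_BYTE  # 256 values, i.e. 0 through 255

# First four hex pairs of the dump's first row, taken literally.
# Assumption: no byte swapping is applied (the slides do not say).
raw = bytes.fromhex("b0e60408")

# Iterating over a bytes object yields one int per byte.
values = list(raw)

print(distinct_values)
print(values)
print(all(0 <= v <= 255 for v in values))
```

Each pair of hex digits is one byte, which is why every value lands in the 0 - 255 range.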

Slide 11

Slide 11 text

Storage: flat/text

greeting,loc,rating
hello,world,3
good bye,moon,7
welcome,stars,5
what's up,sky,2
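A flat text file like this one is easy to turn into a stream of records. The sketch below uses only the standard library; the helper name `read_rows` is made up for illustration and is not part of riko.

```python
import csv
import io

CSV_TEXT = """greeting,loc,rating
hello,world,3
good bye,moon,7
welcome,stars,5
what's up,sky,2
"""


def read_rows(text):
    """Yield one dict per CSV row, converting the rating to an int."""
    for row in csv.DictReader(io.StringIO(text)):
        row["rating"] = int(row["rating"])
        yield row


rows = list(read_rows(CSV_TEXT))
print(rows[0]["greeting"], rows[0]["rating"])
```

Because `read_rows` is a generator, rows are parsed one at a time as they are requested; `list()` is used here only to show the result.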

Slide 12

Slide 12 text

Organization vs Storage
Storage: binary, flat/text
Organization: structured, unstructured
Labels: maasai mara, hell's gate

Slide 13

Slide 13 text

sample json

Slide 14

Slide 14 text

[
  {
    "greeting": "hello",
    "location": "world",
    "enthusiasm": 3
  },
  {
    "greeting": "good bye",
    "location": "moon",
    "enthusiasm": 7
  }
]
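The sample JSON above maps directly onto a Python list of dicts. A minimal sketch with the standard `json` module (the variable names are illustrative only):

```python
import json

JSON_TEXT = """
[
  {"greeting": "hello", "location": "world", "enthusiasm": 3},
  {"greeting": "good bye", "location": "moon", "enthusiasm": 7}
]
"""

# json.loads turns the JSON array into a list of dicts.
records = json.loads(JSON_TEXT)

# Pick the record with the highest "enthusiasm" value.
most_enthusiastic = max(records, key=lambda r: r["enthusiasm"])
print(most_enthusiastic["greeting"])
```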

Slide 15

Slide 15 text

what are streams?

Slide 16

Slide 16 text

>>> stream = 'abracadabra'
>>> stream[0]
'a'
>>> stream = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> stream[0]
1
>>> stream = ['hello', 'devcraft', 'attendees']
>>> stream[0]
'hello'
>>> stream = [
...     {'num': 0}, {'num': 1}, {'num': 2}]
>>> stream[0]
{'num': 0}

Slide 17

Slide 17 text

how do you construct streams?

Slide 18

Slide 18 text

>>> stream = input('--> ')

Slide 19

Slide 19 text

>>> stream = input('--> ')
-->

Slide 20

Slide 20 text

>>> stream = input('--> ')
--> abracadabra

Slide 21

Slide 21 text

>>> stream = input('--> ')
--> abracadabra
>>> stream
'abracadabra'
>>> s = 'hello devcraft attendees'
>>> s.split(' ')
['hello', 'devcraft', 'attendees']
>>> list(range(1, 11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> [{'num': x} for x in range(4)]
[{'num': 0}, {'num': 1}, {'num': 2}, {'num': 3}]
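The examples above build the whole stream up front. A stream can also be constructed lazily with a generator function, so each item is produced only when asked for. This is a sketch; `num_stream` is a made-up name, not something from the slides or from riko.

```python
from itertools import islice


def num_stream(start=0):
    """Yield {'num': n} forever; nothing is computed until requested."""
    n = start
    while True:
        yield {"num": n}
        n += 1


# Take just the first four items of the endless stream.
first_four = list(islice(num_stream(), 4))
print(first_four)
```

The result matches the list comprehension on the slide, but the generator never holds more than one item at a time.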

Slide 22

Slide 22 text

how do you process streams?

Slide 23

Slide 23 text

>>> ints = range(1, 10)
>>> doubled = [2 * x for x in ints]
>>> doubled
[2, 4, 6, 8, 10, 12, 14, 16, 18]
>>> big = [x for x in doubled if x > 10]
>>> big
[12, 14, 16, 18]
>>> [x / 3 for x in big]
[4.0, 4.666666666666667, 5.333333333333333, 6.0]
>>> (x / 3 for x in big)
<generator object <genexpr> at 0x103c10830>
>>> next(x / 3 for x in big)
4.0
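The same three steps (double, filter, divide) can be chained as generators so that no intermediate list is built. A minimal sketch; the function names are invented for this example.

```python
def double(stream):
    """Yield each item multiplied by two."""
    for x in stream:
        yield 2 * x


def keep_big(stream, threshold=10):
    """Yield only the items greater than the threshold."""
    for x in stream:
        if x > threshold:
            yield x


def third(stream):
    """Yield each item divided by three."""
    for x in stream:
        yield x / 3


# Nothing runs until a value is requested from the end of the pipeline.
pipeline = third(keep_big(double(range(1, 10))))
first = next(pipeline)   # pulls values through all three stages
rest = list(pipeline)    # drains whatever is left
print(first, rest)
```

Each stage pulls one item at a time from the stage before it, which is the basic idea behind a stream processing pipeline.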

Slide 24

Slide 24 text

so what!

Slide 25

Slide 25 text

RSS feeds (feedly)

Slide 26

Slide 26 text

aggregators (kayak)

Slide 27

Slide 27 text

mashups (portwiture)

Slide 28

Slide 28 text

introducing riko github.com/nerevu/riko

Slide 29

Slide 29 text

let's get some data

Slide 30

Slide 30 text

Kenya Open Data (opendata.go.ke)

Slide 31

Slide 31 text

API access

Slide 32

Slide 32 text

IPython Demo bit.ly/riko-demo (examples)

Slide 33

Slide 33 text

IPython Demo bit.ly/riko-demo (exercises)

Slide 34

Slide 34 text

exercise #1

Slide 35

Slide 35 text

number of schools per district

Slide 36

Slide 36 text

[
  {'BUTERE/MUMIAS': 1},
  {'HOMA BAY': 1},
  {'KIAMBU': 1},
  {'MACHAKOS': 1},
  {'MAKUENI': 1},
  {'MARAGUA': 2},
  {'MBEERE': 1},
  {'MOMBASA': 2},
  {'NAIROBI': 5},
  {'TRANS NZOIA': 1}
]
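For comparison, here is one way to get counts of this shape in plain Python, without riko. It is a sketch only: the sample records, the `district` field name, and the `count_by` helper are all made up, and the real dataset's columns may differ.

```python
from collections import Counter

# Hypothetical records; the real Kenya Open Data fields may differ.
schools = [
    {"name": "School A", "district": "NAIROBI"},
    {"name": "School B", "district": "MOMBASA"},
    {"name": "School C", "district": "NAIROBI"},
    {"name": "School D", "district": "KIAMBU"},
]


def count_by(stream, field):
    """Count records per value of `field`, sorted by that value."""
    counts = Counter(record[field] for record in stream)
    return [{key: counts[key]} for key in sorted(counts)]


per_district = count_by(schools, "district")
print(per_district)
```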

Slide 37

Slide 37 text

exercise #2

Slide 38

Slide 38 text

boarding only students per division

Slide 39

Slide 39 text

[
  {'ASEGO': Decimal('277')},
  {'BUTERE': Decimal('224')},
  {'DAGORETTI': Decimal('903')},
  {'EMBAKASI': Decimal('138')},
  {'ISLAND': Decimal('14')},
  {'KANDARA': Decimal('74')},
  {'KASIKEU': Decimal('20')},
  {'KIBERA': Decimal('355')},
  {'KIKUYU': Decimal('69')},
  {'KISAUNI': Decimal('424')},
  ...
]
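A filter followed by a grouped sum gives output of this shape. The sketch below is plain Python rather than riko, and everything in it is assumed for illustration: the records, the field names (`division`, `status`, `enrollment`), and the `sum_by` helper.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical records with made-up numbers and field names.
schools = [
    {"division": "NORTH", "status": "BOARDING ONLY", "enrollment": "120"},
    {"division": "NORTH", "status": "DAY ONLY", "enrollment": "500"},
    {"division": "NORTH", "status": "BOARDING ONLY", "enrollment": "80"},
    {"division": "SOUTH", "status": "BOARDING ONLY", "enrollment": "30"},
]


def sum_by(stream, key_field, value_field):
    """Sum `value_field` per `key_field`, sorted by key."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for record in stream:
        totals[record[key_field]] += Decimal(record[value_field])
    return [{key: totals[key]} for key in sorted(totals)]


# Step 1: keep only boarding-only schools (lazy generator expression).
boarding = (s for s in schools if s["status"] == "BOARDING ONLY")

# Step 2: total the enrollment per division.
per_division = sum_by(boarding, "division", "enrollment")
print(per_division)
```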

Slide 40

Slide 40 text

exercise #3

Slide 41

Slide 41 text

create a stream process with the "joining" example
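As a warm-up for this exercise, joining two streams on a shared key can be sketched in plain Python. This is not the "joining" example from the demo notebook and does not use riko; the `join` helper, the records, and the field names are hypothetical.

```python
def join(left, right, key):
    """Inner-join two streams of dicts on a shared key."""
    # Index the right-hand stream by key so lookups are fast.
    index = {}
    for record in right:
        index.setdefault(record[key], []).append(record)

    # Stream through the left side, yielding one merged dict per match.
    for record in left:
        for match in index.get(record[key], []):
            yield {**match, **record}


# Hypothetical data for illustration only.
schools = [
    {"school": "School A", "district": "X"},
    {"school": "School B", "district": "Y"},
    {"school": "School C", "district": "Z"},
]
districts = [
    {"district": "X", "region": "R1"},
    {"district": "Y", "region": "R2"},
]

joined = list(join(schools, districts, "district"))
print(joined)
```

"School C" is dropped because district "Z" has no match on the right-hand side; only the right-hand stream is held in memory, while the left side is processed one record at a time.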

Slide 42

Slide 42 text

github.com/reubano/devcraft-workshop

Slide 43

Slide 43 text

github.com/reubano/riko

Slide 44

Slide 44 text

Thanks!
Reuben Cummings
reubano@gmail.com
https://reubano.github.io
@reubano #DevCraftKE