Slide 1

Slide 1 text

Stream Data Processing with Kinesis and Go at Timehop
Avi Flax
June 2015

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Background
• Whenever a user opens the app, the app notifies the API
• We do 4 things with this data
• (1) Count daily unique user opens
• (2) Record the last opened time for each user
• (3) Update user data
• (4) Archive the app open event
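
As an illustration of task (1), here is a minimal in-memory sketch of counting daily unique opens. The deck does not say how Timehop actually counts uniques; the DailyUniques type, the trimmed AppOpen struct, and the use of UTC days are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// AppOpen is a trimmed, hypothetical stand-in for the real event type.
type AppOpen struct{ Timestamp, UserID int64 }

// DailyUniques maps a UTC day (YYYY-MM-DD) to the set of user IDs seen that day.
// Assumption: days are bucketed in UTC; the deck does not specify this.
type DailyUniques map[string]map[int64]struct{}

// Record adds the event's user to the set for the event's day.
func (d DailyUniques) Record(e AppOpen) {
	day := time.Unix(e.Timestamp, 0).UTC().Format("2006-01-02")
	if d[day] == nil {
		d[day] = map[int64]struct{}{}
	}
	d[day][e.UserID] = struct{}{}
}

// Count returns the number of distinct users seen on the given day.
func (d DailyUniques) Count(day string) int { return len(d[day]) }

func main() {
	u := DailyUniques{}
	u.Record(AppOpen{1433116800, 1}) // 2015-06-01 00:00:00 UTC
	u.Record(AppOpen{1433120400, 1}) // same user, one hour later
	u.Record(AppOpen{1433120400, 2}) // second user, same day
	u.Record(AppOpen{1433203200, 1}) // first user again, next day
	fmt.Println(u.Count("2015-06-01"), u.Count("2015-06-02"))
}
```

Repeated opens by the same user on the same day land in the same set entry, so they are counted once.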

Slide 4

Slide 4 text

Prior System

Slide 5

Slide 5 text

Interlude I: Jay Kreps ❤s Logs
• Jan 2011: Kafka released as FLOSS
• Jul 2011: Kafka entered Apache incubation
• Nov 2012: Kafka graduated incubation
• Dec 2013: Kreps published The Log: What every software engineer should know about real-time data's unifying abstraction
• Sep 2014: published in book form: I ❤ Logs: Event Data, Stream Processing, and Data Integration
• Nov 2014: co-founded Confluent

Slide 6

Slide 6 text

Interlude II: Unified Logs in a Nutshell
• A very specific approach to streaming data transport
• a server providing “an append-only, totally-ordered sequence of records ordered by time”
• Decouples: producers & consumers; transport & processing; consumers & consumers
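
The decoupling described above can be sketched with a toy in-memory log. This is only an illustration of the idea, not how Kafka or Kinesis is implemented; Log, Consumer, and their methods are names invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Log is a toy append-only, totally ordered sequence of records.
type Log struct {
	mu      sync.Mutex
	records [][]byte
}

// Append adds a record to the end of the log and returns its offset.
func (l *Log) Append(record []byte) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.records = append(l.records, record)
	return len(l.records) - 1
}

// ReadFrom returns every record at or after offset, in log order.
func (l *Log) ReadFrom(offset int) [][]byte {
	l.mu.Lock()
	defer l.mu.Unlock()
	if offset < 0 || offset >= len(l.records) {
		return nil
	}
	out := make([][]byte, len(l.records)-offset)
	copy(out, l.records[offset:])
	return out
}

// Consumer tracks its own position, independent of every other consumer.
type Consumer struct {
	Name   string
	offset int
}

// Poll returns the records this consumer has not yet seen and advances
// only this consumer's offset.
func (c *Consumer) Poll(l *Log) [][]byte {
	recs := l.ReadFrom(c.offset)
	c.offset += len(recs)
	return recs
}

func main() {
	events := &Log{}
	events.Append([]byte("open:user=1"))
	events.Append([]byte("open:user=2"))

	counter := &Consumer{Name: "counter"}
	archiver := &Consumer{Name: "archiver"}

	fmt.Println(len(counter.Poll(events))) // counter reads the first two records
	events.Append([]byte("open:user=3"))
	fmt.Println(len(counter.Poll(events)))  // counter reads only the new record
	fmt.Println(len(archiver.Poll(events))) // archiver starts from offset 0 and reads all three
}
```

The producer never learns who reads the log, and each consumer moves at its own pace because its offset is its own state.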

Slide 7

Slide 7 text

Interlude III: AWS Kinesis
…recently Amazon has offered a service that is very very similar to Kafka called Kinesis… I was pretty happy about this.
- Jay Kreps

Slide 8

Slide 8 text

Interlude IV: Go
• An object-oriented systems language with GC

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

The Workers

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Producer / apicard
• We’re using Sendgrid’s go-kinesis library to produce records to Kinesis
• We pretty quickly ran into some issues:
• Kinesis some

Slide 16

Slide 16 text

go-kinesis/batchproducer

type Producer interface {
    Start() error
    Stop() error
    Add(data []byte, partitionKey string) error
    Flush(timeout time.Duration, sendStats bool) (sent int, remaining int, err error)
}

// BatchingKinesisClient is a subset of KinesisClient to ease mocking.
type BatchingKinesisClient interface {
    PutRecords(args *kinesis.RequestArgs) (resp *kinesis.PutRecordsResp, err error)
}

func New(client BatchingKinesisClient, streamName string, config Config) (Producer, error)

Slide 17

Slide 17 text

API Startup

AppOpenBatcher, err = batchproducer.New(ksis, appOpenStreamName, config)
if err != nil {
    golog.Fatal(gologID, "Oh noes!")
}
AppOpenBatcher.Start()

Slide 18

Slide 18 text

App Opens API Resource

func enqueueAppOpenEventForStream(event appopen.AppOpen) {
    avrobytes, err := event.ToAvro()
    if err != nil {
        ...
        return
    }

    partitionKey := strconv.Itoa(time.Unix(event.Timestamp, 0).Second())

    err = conf.AppOpenBatcher.Add(avrobytes, partitionKey)
    if err != nil {
        ...
        return
    }
}
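
The partition key in the function above is the second-of-minute of the event timestamp, so it can take at most 60 distinct values ("0" through "59"). Kinesis hashes the partition key to pick a shard, so this spreads records by arrival second rather than by user. A small sketch that isolates the expression; partitionKeyFor is a name introduced here for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// partitionKeyFor reproduces the partition-key expression from the slide.
// Second() is the same in any time zone whose offset is a whole number of minutes.
func partitionKeyFor(unixSeconds int64) string {
	return strconv.Itoa(time.Unix(unixSeconds, 0).Second())
}

func main() {
	// Walk one hour of timestamps and collect the distinct keys produced.
	start := int64(1433116800) // 2015-06-01 00:00:00 UTC
	seen := map[string]bool{}
	for ts := start; ts < start+3600; ts++ {
		seen[partitionKeyFor(ts)] = true
	}
	fmt.Println("distinct partition keys in one hour:", len(seen))
}
```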

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

Streams Repo
Four notable concepts/packages:
• models
• package kclmultilang
• tasks
• cmd

Slide 21

Slide 21 text

package …/streams/models/appopen

type AppOpen struct {
    Timestamp int64
    UserID    int64
    ...
}

func FromAvro(record []byte) (*AppOpen, error)
func (s AppOpen) ToAvro() ([]byte, error)
func (s AppOpen) Validate() error

Slide 22

Slide 22 text

kclmultilang

type Config struct {
    StreamName string
    WorkerName string
    ...
}

func RunWithSingleProcessor(Config, SingleEventProcessor)

type SingleEventProcessor func(appopen.AppOpen, log.Logger) error

Slide 23

Slide 23 text

Task RecordLastOpen

// RecordLastOpen updates a certain Redis key for each user that
// stores the last time that user opened the Timehop mobile app.
func RecordLastOpen(event appopen.AppOpen, redis redis.Pool, logger log.Logger) error {
    key := fmt.Sprintf("user:%v:checkpoint", event.UserID)
    field := fmt.Sprintf("%v_app_open", event.Platform)
    value := fmt.Sprint(event.Timestamp)
    _, err := redis.HSet(key, field, value)
    return err
}
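
To make the resulting Redis layout concrete, the sketch below runs the same key, field, and value formatting against an in-memory map instead of Redis. The hashSetter interface, memHash type, and the Platform field being a string are assumptions for this example; the slide's redis.Pool HSet also returns a second value that is dropped here.

```go
package main

import "fmt"

// AppOpen is a trimmed stand-in for appopen.AppOpen.
// Assumption: Platform is a string such as "ios"; the slide elides its type.
type AppOpen struct {
	Timestamp int64
	UserID    int64
	Platform  string
}

// hashSetter is a hypothetical minimal interface covering the one call the task makes.
type hashSetter interface {
	HSet(key, field, value string) error
}

// memHash is an in-memory substitute for a Redis hash store.
type memHash map[string]map[string]string

func (m memHash) HSet(key, field, value string) error {
	if m[key] == nil {
		m[key] = map[string]string{}
	}
	m[key][field] = value
	return nil
}

// recordLastOpen uses the same formatting as the task on the slide.
func recordLastOpen(event AppOpen, store hashSetter) error {
	key := fmt.Sprintf("user:%v:checkpoint", event.UserID)
	field := fmt.Sprintf("%v_app_open", event.Platform)
	value := fmt.Sprint(event.Timestamp)
	return store.HSet(key, field, value)
}

func main() {
	store := memHash{}
	event := AppOpen{Timestamp: 1433116800, UserID: 42, Platform: "ios"}
	if err := recordLastOpen(event, store); err != nil {
		fmt.Println("error:", err)
		return
	}
	// One hash per user, one field per platform.
	fmt.Println(store["user:42:checkpoint"]["ios_app_open"])
}
```

Because the write is a plain overwrite of one hash field, replaying the same event leaves the stored value unchanged.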

Slide 24

Slide 24 text

Worker lastopens
cmd/appopen/lastopens/main.go

func main() {
    resultsRedisURL := env.MandatoryVar("RESULTS_REDIS_URL")
    streamName := env.MandatoryVar("STREAM_NAME")
    logFilePath := env.ImportantVar("LOG_PATH", defaultLogFilePath)

    config := kclmultilang.Config{...}

    resultsRedisPool := redis.NewPool(resultsRedisURL, redis.DefaultConfig)

    processor := func(record appopen.AppOpen, logger log.Logger) error {
        return lastopens.RecordLastOpen(record, resultsRedisPool, logger)
    }

    kclmultilang.RunWithSingleProcessor(config, processor)
}

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Lessons Learned, Hints, Tips, and Miscellany
• Didn’t need to write kclmultilang; could have (should have) used Niek Sanders’ gokinesis
• Probably could have delayed adding the batch producer to go-kinesis; could have lived without it
• Deployment is the other 90%

Slide 27

Slide 27 text

Questions, Comments, Suggestions?

Slide 28

Slide 28 text

More Resources
• The Log: an epic software engineering article by Bryan Pendleton
• The three eras of business data processing by Alex Dean
• Loving a Log-Oriented Architecture by Andrew Montalenti
• Stream Processing, Event Sourcing, Reactive, CEP… And Making Sense Of It All by Martin Kleppmann
And a whole bunch more here including books and videos.

Slide 29

Slide 29 text

Bonus: AWS Lambda and Go
• Ruben Fonseca: AWS Lambda Functions in Go

Slide 30

Slide 30 text

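AWS Lambda Go Adapter

// child_process is a Node.js built-in module; prod_config is defined elsewhere.
var child_process = require('child_process');

exports.handler = function(event, context, test_config) {
    var config = test_config || prod_config;
    var options = { env: config.env, input: JSON.stringify(event) };

    // Run the Go binary synchronously, passing the event as JSON on stdin.
    var result = child_process.spawnSync(config.child_path, [], options);

    if (result.status !== 0) {
        return context.fail(new Error(result.stderr.toString()));
    }

    context.succeed();
};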
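
The adapter implies a simple contract for the Go binary: read the event as JSON from stdin, exit with status 0 on success, and on failure write a message to stderr and exit non-zero. A minimal sketch of that child side follows; the handle function and the generic map decoding are assumptions for illustration, since the deck does not show the Go program or the event shape.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// handle decodes one JSON event. The event shape is not given in the deck,
// so this sketch decodes into a generic map.
func handle(input []byte) (map[string]interface{}, error) {
	var event map[string]interface{}
	if err := json.Unmarshal(input, &event); err != nil {
		return nil, fmt.Errorf("decoding event: %v", err)
	}
	return event, nil
}

func main() {
	// The adapter passes JSON.stringify(event) as the child's stdin.
	// io.ReadAll requires Go 1.16 or later.
	input, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
		os.Exit(1)
	}
	event, err := handle(input)
	if err != nil {
		// The adapter turns stderr into the Lambda failure message.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("received event with %d top-level fields\n", len(event))
}
```

Returning normally from main gives exit status 0, which the adapter treats as success.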