Slide 1

Slide 1 text

NoSQL: Not only a fairy tale • Sebastian Cohnen, @tisba, tisba.de • Timo Derstappen, @teemow, adcloud.com • Image: http://en.wikipedia.org/wiki/File:Old_book_-_Timeless_Books.jpg

Slide 2

Slide 2 text

Preface

Slide 3

Slide 3 text

Terms • placement & ads • ad priority

Slide 4

Slide 4 text

System Overview • platform: administrative back office, worker queue, almost no NoSQL • adserver: serving ads, tracking, here be NoSQLs! • diagram labels: platform, adserver, publishing ads & placements, stats & tracking data

Slide 5

Slide 5 text

Once upon a time… …way back in 2008

Slide 6

Slide 6 text

Simple Storage Service

Slide 7

Slide 7 text

Publishing to S3 • gather ad & placement data • add some JavaScript • publish everything to S3

Slide 8

Slide 8 text

Ad Delivery via S3 • user visits a website • deliver JavaScript via CDN • choose and display ads

Slide 9

Slide 9 text

but, • publishing to S3 was rather expensive • no incremental update of denormalized data

Slide 10

Slide 10 text

The relaxed Knight …came along in 2009

Slide 11

Slide 11 text

CouchDB • REST & JavaScript? nice! • M/R Views • Multi-Master setup • diagram: one platform node, three adserver nodes

Slide 12

Slide 12 text

CouchDB only • normalize the data (a bit) • split by update frequency • BUT… n-m relations are hard to model • and persistent, incremental views are rather useless to us

Slide 13

Slide 13 text

:-(

Slide 14

Slide 14 text

CouchDB + node.js • use node.js to assemble data (n-m relation) • cache response using nginx • also cache some data in node.js

Slide 15

Slide 15 text

Request flow • incoming request • nginx cache miss • fetch placement & priorities • process data & fetch ads • send response
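The request flow above can be sketched roughly as follows. This is a minimal illustration, not the original node.js code: the dictionaries stand in for CouchDB documents and for the nginx cache, and every name (PLACEMENTS, PRIORITIES, ADS, handle_request) is an assumption made for the example.

```python
# Hypothetical in-memory stand-ins for the CouchDB documents.
PLACEMENTS = {"p1": {"site": "example.org"}}
PRIORITIES = {"p1": {"ad-1": 10, "ad-2": 30}}  # placement -> ad id -> priority
ADS = {"ad-1": {"title": "Ad one"}, "ad-2": {"title": "Ad two"}}

# Plays the role of the nginx response cache in this sketch.
response_cache = {}


def assemble(placement_id):
    """Fetch placement and priorities, then fetch the ads (the n-m join)."""
    placement = PLACEMENTS[placement_id]
    priorities = PRIORITIES.get(placement_id, {})
    ads = [
        {"id": ad_id, "priority": prio, **ADS[ad_id]}
        # Highest priority first.
        for ad_id, prio in sorted(priorities.items(), key=lambda kv: -kv[1])
        if ad_id in ADS
    ]
    return {"placement": placement_id, "site": placement["site"], "ads": ads}


def handle_request(placement_id):
    """Serve from the cache on a hit; assemble and store on a miss."""
    if placement_id not in response_cache:
        response_cache[placement_id] = assemble(placement_id)
    return response_cache[placement_id]
```

Calling handle_request("p1") twice assembles the response once and serves the second call from the cache.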

Slide 16

Slide 16 text

How to monitor Consistency? • write tracer documents • measure replication delay
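One way to read "tracer documents" is sketched below: a timestamped document is written to the source instance, and the age of the newest tracer visible on a replica gives an estimate of the replication delay. FakeNode is a hypothetical stand-in for a CouchDB instance, and the document id "tracer" and the field written_at are assumptions for illustration.

```python
import time


class FakeNode:
    """Hypothetical stand-in for one CouchDB instance."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, doc):
        self.docs[doc_id] = doc

    def get(self, doc_id):
        return self.docs.get(doc_id)


def write_tracer(master, tracer_id="tracer", clock=time.time):
    """Write a tracer document carrying the current timestamp."""
    doc = {"written_at": clock()}
    master.put(tracer_id, doc)
    return doc


def replication_delay(replica, tracer_id="tracer", clock=time.time):
    """Seconds since the tracer visible on the replica was written.

    Returns None if no tracer has arrived on the replica yet.
    """
    doc = replica.get(tracer_id)
    if doc is None:
        return None
    return clock() - doc["written_at"]
```

In a real setup the tracer would be rewritten periodically, so a steadily growing delay on one replica signals stalled replication.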

Slide 17

Slide 17 text

Achievements • reduced turnaround for publishing priorities by >50% • built a foundation for new features

Slide 18

Slide 18 text

New Feature Requests …ahead in early 2011

Slide 19

Slide 19 text

The Problem • eventually every request is going to be unique • therefore fewer requests can be cached • CouchDB too slow for our needs • caching things within a node.js process was a bad idea too

Slide 20

Slide 20 text

Redis • during a cache warmup phase we pre-fill Redis with placement and ad data • all live requests are served out of Redis • data is updated in the background
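The warm-up pattern above might look like the sketch below. A plain dictionary stands in for Redis so the example runs on its own; the class name AdCache and the key layout "placement:<id>" are assumptions, not the original schema.

```python
class AdCache:
    """Dict-backed stand-in for Redis; the key layout is hypothetical."""

    def __init__(self):
        self.store = {}

    def warm_up(self, placements):
        """Pre-fill the cache before any live traffic is accepted."""
        for placement_id, data in placements.items():
            self.store[f"placement:{placement_id}"] = data

    def serve(self, placement_id):
        """Live path: cache lookup only, never a call to the backing store."""
        return self.store.get(f"placement:{placement_id}")

    def refresh(self, placement_id, data):
        """Background path: overwrite an entry when the source data changes."""
        self.store[f"placement:{placement_id}"] = data
```

The point of the design is that the live path never waits on the slower data source; only warm_up and refresh talk to it.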

Slide 21

Slide 21 text

Scalability …in late 2011

Slide 22

Slide 22 text

How we used CouchDB • >10k updates/h • single source of changes • multi-master replication • append-only • durability • MVCC usage not required

Slide 23

Slide 23 text

Resulting Issues • problems with replication and high load • more instances, more replication, even more load • compaction was a pain too

Slide 24

Slide 24 text

Whose fault? • not only CouchDB’s fault • simply the wrong use case • one source for updates • no need for append-only reliability

Slide 25

Slide 25 text

What now?

Slide 26

Slide 26 text

Back to S3! • with Redis caching in place… • move placement and ad data to S3 • cache warming upfront and background updates work just fine!

Slide 27

Slide 27 text

S3 vs CouchDB • S3 simply fits our needs • no need to implement sync checks or run compaction • fewer moving parts • less state on our application servers

Slide 28

Slide 28 text

Once again, more features …ahead in early 2012

Slide 29

Slide 29 text

Status Quo • first S3-based “adserver” did the ad selection on the client side • to a certain degree this is still the case

Slide 30

Slide 30 text

The Challenge • prepare the systems for real-time bidding • enable the adserver to select ads server-side • do it fast, say within 25ms or less

Slide 31

Slide 31 text

Remember Redis? • we know and trust Redis’ performance • it has sorted sets • we have sets of ads to display for a placement • Eureka!

Slide 32

Slide 32 text

Redis Reloaded! • heavily use sorted sets • create sets of ads we can choose from • create sets of ads which cannot be displayed at all • use ZUNIONSTORE & ZRANGEBYSCORE to precisely select ads
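The slides do not show how the two commands were combined, so the following is only one plausible reading: union the candidate set with the blocked set using a large negative weight, then range over positive scores. The two helper functions mimic the semantics of ZUNIONSTORE (with WEIGHTS and the default SUM aggregate) and ZRANGEBYSCORE in plain Python so the sketch runs without a Redis server; the weight -1000 and all ad ids are assumptions.

```python
def zunionstore(sets_with_weights):
    """Mimic ZUNIONSTORE with WEIGHTS and the default SUM aggregate."""
    result = {}
    for members, weight in sets_with_weights:
        for member, score in members.items():
            result[member] = result.get(member, 0.0) + score * weight
    return result


def zrangebyscore(zset, lo, hi):
    """Mimic ZRANGEBYSCORE: members with lo <= score <= hi, ascending."""
    hits = [(score, member) for member, score in zset.items() if lo <= score <= hi]
    return [member for _, member in sorted(hits)]


def select_ads(candidates, blocked):
    """Return displayable ads ordered by priority, lowest first.

    Assumes priorities are positive and well below 1000, so a blocked ad
    always ends up with a negative score after the weighted union.
    """
    merged = zunionstore([(candidates, 1), (blocked, -1000)])
    return zrangebyscore(merged, 1, float("inf"))


# Hypothetical data: ad id -> priority, and ad id -> 1 for blocked ads.
candidates = {"ad-1": 10, "ad-2": 30, "ad-3": 20}
blocked = {"ad-3": 1}
selected = select_ads(candidates, blocked)
```

Against a real Redis instance the same idea would use the actual ZUNIONSTORE and ZRANGEBYSCORE commands on keys holding these sets.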

Slide 33

Slide 33 text

Redis Reloaded! • Redis became a deeply integrated part of the core business logic • it was very easy to model our needs with Redis • besides enabling new features, we reduced the response payload by >75%

Slide 34

Slide 34 text

Conclusion

Slide 35

Slide 35 text

What worked for us… • try to go as incremental as possible • drivers for architectural decisions: features, quality & performance, scalability

Slide 36

Slide 36 text

The End!

Slide 37

Slide 37 text

The End! • Questions (if time permits) • Visit us at the adcloud booth • Sebastian Cohnen, @tisba, tisba.de • Timo Derstappen, @teemow, adcloud.com