Slide 1

Slide 1 text

Real-time visitor analysis with Couchbase and Elasticsearch Jeroen Reijn | @jreijn | #nosql13 follow the Hippo trail

Slide 2

Slide 2 text

NoSQL Matters 2013
About me
Jeroen Reijn
Software engineer, Hippo
@jreijn
http://blog.jeroenreijn.com

Slide 3

Slide 3 text

About Hippo

Slide 4

Slide 4 text

OneHippo @ Goto Visitor Analysis

Slide 5

Slide 5 text


Slide 6

Slide 6 text


Slide 7

Slide 7 text

Journey-based Targeting

Slide 8

Slide 8 text

How we analyse visitors @ Hippo

Slide 9

Slide 9 text

Registration
• Visitor - entity making HTTP requests
• Collector - records data about a visitor or his behaviour
  Example: location collector (GeoIPCollector)
• Targeting Data - all data about a specific visitor
  Example: IP address is located in Amsterdam
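The registration terms above can be illustrated with a small Python sketch. It is not Hippo's implementation: the `collect` signature, the `register` helper, the in-memory `GEO_TABLE`, and the example IP address are all assumptions made for illustration. Only the name GeoIPCollector comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Visitor:
    """Entity making HTTP requests, identified by a hypothetical id."""
    visitor_id: str
    targeting_data: dict = field(default_factory=dict)


# Hypothetical lookup table standing in for a real GeoIP database.
GEO_TABLE = {"192.0.2.10": "Amsterdam"}


class GeoIPCollector:
    """Records the city a request appears to come from."""
    name = "geo"

    def collect(self, request: dict) -> dict:
        city = GEO_TABLE.get(request["remote_addr"])
        return {"city": city} if city else {}


def register(visitor: Visitor, request: dict, collectors: list) -> None:
    """Run every collector and merge its output into the targeting data."""
    for collector in collectors:
        data = collector.collect(request)
        if data:
            visitor.targeting_data.setdefault(collector.name, {}).update(data)


visitor = Visitor("v-1")
register(visitor, {"remote_addr": "192.0.2.10"}, [GeoIPCollector()])
print(visitor.targeting_data)  # {'geo': {'city': 'Amsterdam'}}
```

Each collector owns one namespace in the targeting data, so a new collector can be added without touching the existing ones.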

Slide 10

Slide 10 text

Matching
• Characteristic - a type of fact about visitors
  Example: "comes from a city", "experiences a type of weather"
• Target Group - the specification of a Characteristic
  Example: "comes from a European city", "comes from Amsterdam"
• Persona - one or more target groups that describe a certain type of visitor
  Example: "Jim, the European urban consumer", "Alice, the Pet owner"
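A minimal Python sketch of how these three terms could relate. The class layout, the `EUROPEAN_CITIES` set, the "rain" target group, and the scoring rule (fraction of matching target groups) are assumptions for illustration; the slides do not describe how Hippo actually scores a persona.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TargetGroup:
    """A concrete specification of a characteristic."""
    name: str
    characteristic: str                  # e.g. "city" or "weather"
    predicate: Callable[[object], bool]  # test applied to the collected value

    def matches(self, targeting_data: dict) -> bool:
        value = targeting_data.get(self.characteristic)
        return value is not None and self.predicate(value)


@dataclass(frozen=True)
class Persona:
    """One or more target groups describing a type of visitor."""
    name: str
    target_groups: tuple

    def score(self, targeting_data: dict) -> float:
        # Assumed rule: the share of target groups the visitor matches.
        hits = sum(g.matches(targeting_data) for g in self.target_groups)
        return hits / len(self.target_groups)


EUROPEAN_CITIES = {"Amsterdam", "Berlin"}  # hypothetical, deliberately tiny
european = TargetGroup("comes from a European city", "city",
                       lambda city: city in EUROPEAN_CITIES)
rainy = TargetGroup("experiences rain", "weather",
                    lambda weather: weather == "rain")
jim = Persona("Jim, the European urban consumer", (european, rainy))

jim_score = jim.score({"city": "Amsterdam", "weather": "sun"})
print(jim_score)  # 0.5
```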

Slide 11

Slide 11 text

What do we store?
• Request log
• Targeting data
• Statistics
  Averages, e.g. how many visitors became which persona
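As an illustration of the statistics item, the following Python sketch computes the share of visitors assigned to each persona. The `assigned` mapping and the `persona_statistics` helper are hypothetical; the slides do not show the actual data layout.

```python
from collections import Counter

# Hypothetical per-visitor results: visitor id -> best-matching persona.
# None means no persona matched.
assigned = {"v-1": "Jim", "v-2": "Alice", "v-3": "Jim", "v-4": None}


def persona_statistics(assignments: dict) -> dict:
    """Return, per persona, the fraction of all visitors assigned to it."""
    counts = Counter(p for p in assignments.values() if p is not None)
    total = len(assignments)
    return {persona: n / total for persona, n in counts.items()}


stats = persona_statistics(assigned)
print(stats)
```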

Slide 12

Slide 12 text

Real-time analysis

Slide 13

Slide 13 text

How about YOU?
• Do you analyse your visitors?
• Do you do it ‘real-time’?

Slide 14

Slide 14 text

Architecture

Slide 15

Slide 15 text

[Diagram labels: RDBMS, Hippo Delivery Tier, Hippo Repository, App server, XML, JSON, (X)HTML]

Slide 16

Slide 16 text

Delivery Tier
[Diagram labels: Request, URL Matching, Fetch content, Compose output, Response]

Slide 17

Slide 17 text

Delivery Tier
[Diagram labels: Request, URL Matching, Collect data, Scoring, Fetch content, Compose output, Response]
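A rough Python sketch of a request passing through these stages. The order of the steps (collect data and scoring before fetching content) is an assumption read from the diagram labels, and `ROUTES`, `CONTENT`, and every function name are hypothetical.

```python
ROUTES = {"/": "home", "/news": "news"}  # hypothetical URL mapping
CONTENT = {                              # hypothetical content variants
    ("home", "default"): "Welcome",
    ("home", "urban"): "Welcome, city dweller",
    ("news", "default"): "Latest news",
}


def match_url(path):
    """URL matching: resolve a request path to a page name."""
    return ROUTES.get(path)


def collect_data(request, profile):
    """Collect data: record what this request reveals about the visitor."""
    profile["city"] = request.get("city")


def score(profile):
    """Scoring: choose the content variant for this visitor."""
    return "urban" if profile.get("city") == "Amsterdam" else "default"


def fetch_content(page, variant):
    """Fetch content: fall back to the default variant if none is targeted."""
    return CONTENT.get((page, variant)) or CONTENT[(page, "default")]


def handle(request, profile):
    """Run one request through all stages and compose the output."""
    page = match_url(request["path"])
    if page is None:
        return "404 Not Found"
    collect_data(request, profile)
    variant = score(profile)
    body = fetch_content(page, variant)
    return f"<html><body>{body}</body></html>"


print(handle({"path": "/", "city": "Amsterdam"}, {}))
```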

Slide 18

Slide 18 text

Scaling

Slide 19

Slide 19 text

Scaling out
[Diagram labels: RDBMS; App server with Hippo Delivery Tier and Hippo Repository (shown twice)]

Slide 20

Slide 20 text

Scaling out
[Diagram labels: RDBMS; App server with Delivery Tier and Repository (shown twice); Targeting Datastore]

Slide 21

Slide 21 text

What kind of storage?

Slide 22

Slide 22 text

Typical Data Access Pattern
[Diagram labels: Writer, Single write, Datastore, Several reads]

Slide 23

Slide 23 text

Analytics Data Access Pattern
[Diagram labels: Writers, Several writes, Datastore, Single read, CMS user]

Slide 24

Slide 24 text

Targeting Data Access Pattern
[Diagram labels: Visitors, Several writes, Several reads, Datastore, Single read, CMS user]

Slide 25

Slide 25 text

Distributed Cache

Slide 26

Slide 26 text

Requirements change!

Slide 27

Slide 27 text

NoSQL?

Slide 28

Slide 28 text

Suitable types
• Key-value store
• Document database
• Column-oriented store

Slide 29

Slide 29 text

Assessment Criteria
• Maturity
• Data model
• Consistency model
• Performance
• Replication
• Caching model
• Query model
• Monitoring
• Scalability
• Reliability
• Support

Slide 30

Slide 30 text

Selection Criteria
• Performance
• Scalability
• Schema flexibility
• Simplicity

Slide 31

Slide 31 text

Couchbase

Slide 32

Slide 32 text

Why Couchbase?
• Drop-in replacement for memcached
• Read/Write-through cache
• High throughput
• Easily scalable
• Schema flexibility
• Low latency

Slide 33

Slide 33 text

Couchbase
• Open Source
• Document-oriented
• Easily scalable
• Consistent high performance
• Apache licensed

Slide 34

Slide 34 text

Performance
• Object managed cache
• Write queue to disk
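The two performance points can be pictured as a memory-first cache with a queue of pending disk writes. The Python sketch below only illustrates that general idea; it does not reproduce Couchbase internals, and the `WriteBehindCache` class and the dict used as a stand-in for disk are assumptions.

```python
from collections import deque


class WriteBehindCache:
    """Serve reads and writes from memory; persist writes later via a queue."""

    def __init__(self, disk: dict):
        self.memory = {}      # managed in-memory cache
        self.queue = deque()  # keys waiting to be written to disk
        self.disk = disk      # stand-in for persistent storage

    def set(self, key, value):
        # The write is acknowledged once it is in memory and queued.
        self.memory[key] = value
        self.queue.append(key)

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        value = self.disk[key]    # cache miss: load from disk
        self.memory[key] = value
        return value

    def flush(self) -> int:
        """Drain the write queue to disk; return the number of writes done."""
        written = 0
        while self.queue:
            key = self.queue.popleft()
            self.disk[key] = self.memory[key]
            written += 1
        return written


disk = {}
cache = WriteBehindCache(disk)
cache.set("visitor::v-1", {"city": "Amsterdam"})
pending_before_flush = dict(disk)  # still empty: nothing persisted yet
flushed = cache.flush()
```

The trade-off in this pattern is that a write is visible to readers before it is durable, which keeps latency low at the cost of a short window in which data exists only in memory.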

Slide 35

Slide 35 text

Easily scalable
• Auto sharding
• Cross cluster replication (XDCR)
• Master-master replication
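Auto sharding can be illustrated with a simplified Python sketch in which every key hashes to one of a fixed number of buckets, and a separate map assigns buckets to nodes. This is loosely modelled on the vBucket idea; the hash function, the bucket count, the round-robin map, and the node names are assumptions, not Couchbase's exact algorithm.

```python
import zlib

NUM_BUCKETS = 1024  # assumed fixed number of logical partitions


def bucket_for(key: str) -> int:
    """Hash a document key to a logical partition."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def build_bucket_map(nodes: list) -> list:
    """Assign partitions to nodes round-robin (hypothetical placement rule)."""
    return [nodes[i % len(nodes)] for i in range(NUM_BUCKETS)]


def node_for(key: str, bucket_map: list) -> str:
    """Find the node that owns a key by looking up its partition."""
    return bucket_map[bucket_for(key)]


two_nodes = build_bucket_map(["node-a", "node-b"])
three_nodes = build_bucket_map(["node-a", "node-b", "node-c"])
owner = node_for("visitor::v-1", two_nodes)
```

Because a key's partition never changes, growing the cluster only requires a new partition-to-node map; clients keep hashing keys the same way.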

Slide 36

Slide 36 text

Flexible data model
• Native JSON support
• Incremental Map Reduce
• Gives power to the developer

Slide 37

Slide 37 text

How we run Couchbase @ Hippo

Slide 38

Slide 38 text

[Diagram labels: Load Balancer, Hippo Delivery Tier, Database cluster, Couchbase cluster]
Couchbase cluster
• Request log data
• Targeting data
• Statistics data

Slide 39

Slide 39 text

Analysis capabilities
• Querying via views
• Secondary indexes via views
• Views based on Map-Reduce
• Limited ad-hoc query capabilities
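The map-reduce idea behind views can be emulated in a few lines of Python. Couchbase views themselves are defined as JavaScript functions on the server, so this is only a conceptual sketch; the sample documents, their field names, and the `run_view` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents of mixed types, as they might sit in one bucket.
docs = [
    {"type": "request", "visitor": "v-1", "city": "Amsterdam"},
    {"type": "request", "visitor": "v-2", "city": "Berlin"},
    {"type": "request", "visitor": "v-3", "city": "Amsterdam"},
    {"type": "statistics", "persona": "Jim"},
]


def map_by_city(doc):
    """Map step: emit (city, 1) for every request document."""
    if doc.get("type") == "request" and "city" in doc:
        yield doc["city"], 1


def reduce_count(values):
    """Reduce step: sum the emitted values for one key."""
    return sum(values)


def run_view(documents, map_fn, reduce_fn):
    """Group emitted values by key, then reduce each group (keys sorted)."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


requests_per_city = run_view(docs, map_by_city, reduce_count)
print(requests_per_city)  # {'Amsterdam': 2, 'Berlin': 1}
```

The view answers exactly the question its map function was written for, which is why ad-hoc queries are limited: a new question needs a new view.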

Slide 40

Slide 40 text

Elasticsearch
• Apache Lucene
• Designed to be distributed
• Schema free
• Apache license
• RESTful API

Slide 41

Slide 41 text

Added value
• Unstructured search
• Structured search
• Faceted search
• Geospatial search
• Combine them all
• All in (near) real-time
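To show what combining these search types might look like, the Python sketch below builds one Elasticsearch request body with a full-text match, a structured term filter, a geo-distance filter, and a terms aggregation for faceting. The field names (`page_title`, `persona`, `location`, `city`) and the `build_query` helper are hypothetical. The body uses the `bool` and `aggs` forms of the query DSL from later Elasticsearch releases; around 2013 the faceting part would have used the older facets API instead.

```python
import json


def build_query(text, persona, lat, lon, distance="50km"):
    """Build a search body that mixes text, structured, geo and facet parts."""
    return {
        "query": {
            "bool": {
                # Unstructured search: full-text match on a text field.
                "must": [{"match": {"page_title": text}}],
                "filter": [
                    # Structured search: exact value on a keyword field.
                    {"term": {"persona": persona}},
                    # Geospatial search: within a radius of a point.
                    {"geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }},
                ],
            }
        },
        # Faceted search: count matching documents per city.
        "aggs": {"by_city": {"terms": {"field": "city"}}},
    }


body = build_query("pricing", "jim", 52.37, 4.89)
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request; the `location` field would need a geo-point mapping for the distance filter to work.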

Slide 42

Slide 42 text

[Diagram labels: Hippo Delivery Tier, Java API, Write, Read, Couchbase Server Cluster, Couchbase Transport plugin, Replication (XDCR), Elasticsearch Server Cluster, Read / Query]

Slide 43

Slide 43 text

What’s Next?

Slide 44

Slide 44 text

Advanced analytics

Slide 45

Slide 45 text

{ Demo }

Slide 46

Slide 46 text

Thanks!
[email protected]
@jreijn
www.onehippo.com