Slide 1

Slide 1 text

Introduction to ArangoDB & Customer Data Platform
Transforming your Customer Data Platform with Open Source Big Data Technology
Presented by: Nguyễn Tấn Triều (Thomas), Founder of BigDataVietnam.org
Email: [email protected]
FB: https://facebook.com/tantrieuf31
Twitter: https://twitter.com/tantrieuf31

Slide 2

Slide 2 text

About myself
● Started BigDataVietnam.org as a knowledge-sharing blog in 2014
● Former Head of Platform at Blueseed Digital
● Former Lead Software Engineer at FPT Telecom
● Former Backend Engineer at Greengar
● Former Backend Developer at FPT Online
Details at https://www.linkedin.com/in/tantrieuf31/

Slide 3

Slide 3 text

AGENDA
1. Why ArangoDB? One database. One query language. Three data models. Endless possibilities.
2. Why Customer Data Platform (CDP)? Introduction to the USPA framework and customer data platforms
3. Which ArangoDB features support building a CDP? Flexibility, scalability and advanced graph queries
4. Demo with case studies

Slide 4

Slide 4 text

WARM UP QUESTION History of database technology in 5 minutes

Slide 5

Slide 5 text

Source: https://witanworld.com/blog/2019/05/23/database-a-general-introduction/

Slide 7

Slide 7 text

Database theory in the 21st century
Source: http://graphdatamodeling.com

Slide 8

Slide 8 text

The 21st century is the age of "Big Data", and Big Data calls for "using the right tool(s) for the job".

Slide 10

Slide 10 text

AGENDA
1. Why ArangoDB? One database. One query language. Three data models. Endless possibilities.
2. Why Customer Data Platform (CDP)? Introduction to the USPA framework and customer data platforms
3. Which ArangoDB features support building a CDP? Flexibility, scalability and advanced graph queries
4. Demo with case studies

Slide 11

Slide 11 text

The "polyglot data persistence" concept
E-commerce data infrastructure example
Source: https://martinfowler.com/bliki/PolyglotPersistence.html

Slide 12

Slide 12 text

What is "polyglot data persistence"?
Philosophy: "use the right tool(s) for the job"
Assumption: specialized products are better suited than generic products
Examples:
● The original RDBMS with a relational data model for financial data (such as checkouts, invoices, refunds and more) and reporting.
● MongoDB with a flexible document data model for the product catalog.
● Cassandra for high-volume use cases such as real-time analytics (using Apache Spark) and user activity logs.
● Riak key-value store for managing shopping carts.
● Redis for managing user sessions and as an in-memory cache for low-latency reads.
● Neo4j graph database for storing recommendations.

Slide 13

Slide 13 text

Issues with polyglot data persistence
● Requires learning, administering and maintaining multiple technologies
● Needs custom scripts and/or application logic for shipping data from one system to the other, or for syncing systems
● Potential atomicity and consistency issues across the different database systems (i.e. no transactions)

Slide 14

Slide 14 text

Multi-model databases to the rescue?

Slide 15

Slide 15 text

One engine. One query language. Multiple models.

Slide 16

Slide 16 text

ArangoDB is a native multi-model database, with support for key-values, documents, graphs and (recently added) search functionality

Slide 17

Slide 17 text

ArangoDB is a native multi-model database
Graph database + Document + Key-Value

Slide 19

Slide 19 text

Databases, collections, documents
On the highest level, data in ArangoDB is organized in "databases" and "collections" (think "schemas" and "tables")
● Collections are used to store "documents" of similar types
● "Documents" are just JSON objects with arbitrary attributes, with optional nesting (sub-objects, sub-arrays)
● There is no fixed schema for documents (any JSON object is valid)

Slide 20

Slide 20 text

Homogeneous documents
In the simplest case, documents in a collection are homogeneous (i.e. same attributes and types).
Example use case: product categories
  { "_key" : "books", "title" : "Books" }
  { "_key" : "cam", "title" : "Camera products" }
  { "_key" : "kitchen", "title" : "Kitchen Appliances" }
  { "_key" : "toys", "title" : "Toys & Games" }
Processing such data in AQL queries is as straightforward as with an SQL query on a relational, fixed-schema table.

Slide 21

Slide 21 text

AQL queries – hello world examples
  // SELECT c.* FROM categories c WHERE c._key IN ...
  FOR c IN categories
    FILTER c._key IN [ 'books', 'kitchen' ]
    RETURN c

  // SELECT c._key, c.title FROM categories c ORDER BY c.title
  FOR c IN categories
    SORT c.title
    RETURN { _key: c._key, title: c.title }
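To make the semantics of the two hello-world queries concrete, here is a plain-Python sketch that mimics FILTER, SORT and RETURN over the category documents from the previous slide. It works on an in-memory list and does not talk to ArangoDB; the function names `filter_by_keys` and `sorted_projection` are illustrative assumptions, not part of any ArangoDB API.

```python
# In-memory stand-in for the "categories" collection (documents from the slide above).
categories = [
    {"_key": "books", "title": "Books"},
    {"_key": "cam", "title": "Camera products"},
    {"_key": "kitchen", "title": "Kitchen Appliances"},
    {"_key": "toys", "title": "Toys & Games"},
]


def filter_by_keys(docs, keys):
    """Roughly: FOR c IN categories FILTER c._key IN keys RETURN c"""
    wanted = set(keys)
    return [doc for doc in docs if doc["_key"] in wanted]


def sorted_projection(docs):
    """Roughly: FOR c IN categories SORT c.title RETURN { _key, title }"""
    ordered = sorted(docs, key=lambda doc: doc["title"])
    # Build new objects with only the projected attributes, as RETURN { ... } does.
    return [{"_key": doc["_key"], "title": doc["title"]} for doc in ordered]


print(filter_by_keys(categories, ["books", "kitchen"]))
print(sorted_projection(categories))
```

The first call keeps whole documents, like `RETURN c`; the second builds a new object per document, like `RETURN { _key: ..., title: ... }`.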

Slide 22

Slide 22 text

Heterogeneous documents example
  {
    "_key" : "A053720452",
    "category" : "books",
    "name" : "Harry Potter and the Cursed Child",
    "author" : "Joanne K. Rowling",
    "isbn" : "978-0-7515-6535-5",
    "published" : 2016
  }
  {
    "_key" : "ZB4061305X34",
    "category" : "toys",
    "name" : "Nerf N-Strike Elite Mega CycloneShock Blaster",
    "upc" : "630509278862",
    "colors" : [ "black", "red" ]
  }
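Because the two product documents do not share all attributes, any code that reads them has to tolerate missing fields. The following plain-Python sketch (not ArangoDB code; the helper names `shared_attributes` and `label` are hypothetical) shows one way to find the attributes both documents share and to read an optional attribute safely.

```python
# The two heterogeneous product documents from the slide above.
products = [
    {
        "_key": "A053720452",
        "category": "books",
        "name": "Harry Potter and the Cursed Child",
        "author": "Joanne K. Rowling",
        "isbn": "978-0-7515-6535-5",
        "published": 2016,
    },
    {
        "_key": "ZB4061305X34",
        "category": "toys",
        "name": "Nerf N-Strike Elite Mega CycloneShock Blaster",
        "upc": "630509278862",
        "colors": ["black", "red"],
    },
]


def shared_attributes(docs):
    """Return the sorted attribute names that every document has."""
    if not docs:
        return []
    common = set(docs[0])
    for doc in docs[1:]:
        common &= set(doc)
    return sorted(common)


def label(doc):
    """Build a display label; "author" is optional, so check before using it."""
    author = doc.get("author")  # None when the attribute is absent
    return f'{doc["name"]} by {author}' if author else doc["name"]


print(shared_attributes(products))
for product in products:
    print(label(product))
```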

Slide 23

Slide 23 text

The graph data model
● ArangoDB also supports the graph data model
● Graph queries can reveal which documents are directly or indirectly connected to which other documents, and via what connections
● Graphs are often used for data exploration, and to understand connections in the data

Slide 24

Slide 24 text

Edges
● In graphs, connections between documents are called "edges"
● In ArangoDB, edges are stored in "edge collections"
● Edges have "_from" and "_to" attributes, which reference the connected vertices
● Edges are always directed (_from -> _to), but can also be traversed in the opposite direction

Slide 25

Slide 25 text

Edge collection example
Let's assume there are some "employees" documents like this:
  { "_key" : "sofia", "_id" : "employees/sofia" }
  { "_key" : "adam", "_id" : "employees/adam" }
  { "_key" : "sarah", "_id" : "employees/sarah" }
  { "_key" : "jon", "_id" : "employees/jon" }
And there is an "isManagerOf" edge collection connecting them:
  { "_from" : "employees/sofia", "_to" : "employees/adam" }
  { "_from" : "employees/sofia", "_to" : "employees/sarah" }
  { "_from" : "employees/sarah", "_to" : "employees/jon" }
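The idea that directed edges can be followed in either direction can be sketched with a small breadth-first traversal over the "isManagerOf" edges above. This is plain Python over an in-memory list, not an AQL traversal; the functions `neighbors` and `traverse` and the `direction` parameter are assumptions made for illustration.

```python
from collections import deque

# The "isManagerOf" edges from the slide above.
edges = [
    {"_from": "employees/sofia", "_to": "employees/adam"},
    {"_from": "employees/sofia", "_to": "employees/sarah"},
    {"_from": "employees/sarah", "_to": "employees/jon"},
]


def neighbors(vertex, direction="outbound"):
    """Follow edges along (_from -> _to) or against (_to -> _from) their direction."""
    if direction == "outbound":
        return [e["_to"] for e in edges if e["_from"] == vertex]
    return [e["_from"] for e in edges if e["_to"] == vertex]


def traverse(start, direction="outbound"):
    """Breadth-first traversal; returns (vertex, depth) pairs, excluding the start."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        vertex, depth = queue.popleft()
        for nxt in neighbors(vertex, direction):
            if nxt not in seen:
                seen.add(nxt)
                found.append((nxt, depth + 1))
                queue.append((nxt, depth + 1))
    return found


# Everyone Sofia manages, directly or indirectly.
print(traverse("employees/sofia", "outbound"))
# Jon's chain of managers, found by walking the same edges backwards.
print(traverse("employees/jon", "inbound"))
```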

Slide 26

Slide 26 text

Graph example, employees graph

Slide 27

Slide 27 text

For in-depth training on ArangoDB, please visit https://www.arangodb.com/arangodb-training-center/first-day/

Slide 28

Slide 28 text

AGENDA
1. Why ArangoDB? One database. One query language. Three data models. Endless possibilities.
2. Why Customer Data Platform (CDP)? Introduction to the USPA framework and customer data platforms
3. Which ArangoDB features support building a CDP? Flexibility, scalability and advanced graph queries
4. Demo with case studies

Slide 29

Slide 29 text

Customer journey analysis, customer need analysis and digital marketing analytics are the top requirements from business.
Source: Gartner, investment in customer analytics (Q04, Q05), excluding "unsure" responses, n=142

Slide 30

Slide 30 text

Customer data is the new gold for business in the 21st century
Source: https://www.amazon.com/Predictive-Marketing-Marketer-Customer-Analytics/dp/1119037360

Slide 33

Slide 33 text

Key feature of a CDP: customer identity resolution as an ID graph
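One common way to think about an ID graph is that identifiers observed together (for example a cookie and a CRM ID in the same session) are linked by an edge, and each connected component is treated as one customer. The sketch below uses a union-find structure in plain Python to group such identifiers. It is an illustration under that assumption, not the method used by USPA or ArangoDB; the identifier strings and the function `resolve_identities` are hypothetical.

```python
def resolve_identities(links):
    """Group identifiers into customers: each connected component is one customer."""
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = {}
    for ident in list(parent):
        groups.setdefault(find(ident), set()).add(ident)
    # Sort for a deterministic result.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical observations: pairs of identifiers seen together.
links = [
    ("cookie:c1", "crm:1001"),
    ("crm:1001", "device:d7"),
    ("cookie:c2", "crm:2002"),
]
print(resolve_identities(links))
```

In a graph database the same grouping could be expressed as a traversal over an edge collection of identifier links; the union-find version is only meant to show the underlying idea.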

Slide 34

Slide 34 text

https://bigdatavietnam.org/2019/09/uspatech-open-source-framework-to-build.html

Slide 36

Slide 36 text

How to apply CDP and Recommender System together

Slide 39

Slide 39 text

Customer Persona Prediction and Journey Map

Slide 40

Slide 40 text

Practical use cases of CDP for E-Commerce
● Augmented/predictive analytics: Increase analytics productivity to focus on better customer experience insights.
● Customer segmentation: Identify, reach and communicate with specific groups of like-minded customers.
● Recommendation engines: Increase retention and conversion by offering highly individualized digital experiences based on historical activity and preferences.
● Customer journey analytics & orchestration: Reach the right customers at the right time and on the right channel to offer the optimal experience and maximize conversion.

Slide 42

Slide 42 text

AGENDA
1. Why ArangoDB? One database. One query language. Three data models. Endless possibilities.
2. Why Customer Data Platform (CDP)? Introduction to the USPA framework and customer data platforms
3. Which ArangoDB features support building a CDP? Flexibility, scalability and advanced graph queries
4. Demo with case studies

Slide 43

Slide 43 text

ArangoDB is the perfect DB for social data analytics

Slide 44

Slide 44 text

In fact, a CDP is also a data lake platform that stores anything about the customer profile

Slide 45

Slide 45 text

Data lake: store everything first, then do analytics later

Slide 46

Slide 46 text

How CDPs tell a story about customers
Source: https://www.computer.org/publications/tech-news/research/customer-data-platform

Slide 47

Slide 47 text

Implicit data is gathered by predictive analytics. It also includes loyalty, length of relationship, purchasing history, and prior responses to marketing campaigns. Explicit data is easier to gather and analyze because it is usually provided to the company directly from the customer.

Slide 48

Slide 48 text

A CDP needs a native multi-model database (graph database + document + key-value)
1. The graph data model can be used for customer journey management
2. The homogeneous document model can be used for customer profiles
3. The flexibility of ArangoDB is the key to scaling the system more easily
4. The scalability of ArangoDB can help business analytics grow faster

Slide 49

Slide 49 text

Perfect technology stack for a Customer Data Platform (CDP)
https://github.com/bigdatavietnam-org/USPA.tech

Slide 50

Slide 50 text

AGENDA
1. Why ArangoDB? One database. One query language. Three data models. Endless possibilities.
2. Why Customer Data Platform (CDP)? Introduction to the USPA framework and customer data platforms
3. Which ArangoDB features support building a CDP? Flexibility, scalability and advanced graph queries
4. Demo with case studies

Slide 51

Slide 51 text

Case study: Social Data Analytics with ArangoDB

Slide 52

Slide 52 text

Example problem: Classify data into 3 segments

Slide 53

Slide 53 text

User Story
1. You need to do social analytics to find key trends
2. You develop a social media crawler to crawl public data from the Facebook API
3. After crawling, you have 988 records in the collection "fb_feeds"
The key task: classify all feeds into 3 segments
1. High value: top trending feeds that have more than 1000 likes
2. At risk: top feeds where users left any ANGRY reaction
3. Sell opportunity: top feeds where users left a LOVE reaction
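The three segment rules from this user story can be sketched in plain Python over a few sample records. The field names `likes` and `reactions`, the sample feeds, and ranking "top" feeds by like count are all assumptions for illustration; the real "fb_feeds" documents may be shaped differently, and this is not the query shown in the talk.

```python
HIGH_VALUE_LIKES = 1000  # threshold from the user story: more than 1000 likes


def classify_feed(feed):
    """Return every segment a feed belongs to (segments may overlap)."""
    reactions = feed.get("reactions", {})
    segments = []
    if feed.get("likes", 0) > HIGH_VALUE_LIKES:
        segments.append("high_value")
    if reactions.get("ANGRY", 0) > 0:
        segments.append("at_risk")
    if reactions.get("LOVE", 0) > 0:
        segments.append("sell_opportunity")
    return segments


def top_feeds(feeds, segment, limit=10):
    """Feeds in a segment, ranked by like count (the ranking key is an assumption)."""
    matching = [f for f in feeds if segment in classify_feed(f)]
    return sorted(matching, key=lambda f: f.get("likes", 0), reverse=True)[:limit]


# Hypothetical sample records standing in for the "fb_feeds" collection.
fb_feeds = [
    {"_key": "f1", "likes": 2500, "reactions": {"LOVE": 40}},
    {"_key": "f2", "likes": 120, "reactions": {"ANGRY": 3}},
    {"_key": "f3", "likes": 1500, "reactions": {"ANGRY": 1, "LOVE": 9}},
    {"_key": "f4", "likes": 10},
]

for segment in ("high_value", "at_risk", "sell_opportunity"):
    print(segment, [f["_key"] for f in top_feeds(fb_feeds, segment)])
```

Returning a list of segments per feed, rather than a single label, keeps the rules independent: a feed with both ANGRY and LOVE reactions shows up in both segments.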

Slide 54

Slide 54 text

High value: top trending feeds that have more than 1000 likes

Slide 55

Slide 55 text

At risk: top feeds where users left any ANGRY reaction

Slide 56

Slide 56 text

Sell opportunity: top feeds where users left a LOVE reaction

Slide 57

Slide 57 text

Case study: Customer Journey Analytics (CJA) with ArangoDB

Slide 58

Slide 58 text

User Story
1. An e-commerce website needs to track all data points in the customer journey.
2. A web developer puts a CDP JavaScript tag into the website to collect data.
3. After a week, they have a data collection for customer journey analytics.
The key task: classify all customer profiles into 3 segments
1. High value: spent more than 100 USD in a week
2. At risk: browsed the website for more than 5 minutes, placed no order, then logged out
3. Cross-sell opportunity: is a student and placed an order for items such as a "textbook"
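As a rough sketch of these three rules, the plain-Python function below classifies a per-customer weekly summary. Every field name (`weekly_spend_usd`, `browsing_seconds`, `ordered_items`, `logged_out`, `is_student`) and the sample profiles are hypothetical; the data actually collected by the CDP tag may look different.

```python
def classify_profile(profile):
    """Return every segment a weekly customer summary falls into."""
    ordered_items = profile.get("ordered_items", [])
    segments = []

    # High value: spent more than 100 USD in the week.
    if profile.get("weekly_spend_usd", 0) > 100:
        segments.append("high_value")

    # At risk: browsed more than 5 minutes, placed no order, then logged out.
    browsed_long = profile.get("browsing_seconds", 0) > 5 * 60
    if browsed_long and not ordered_items and profile.get("logged_out", False):
        segments.append("at_risk")

    # Cross-sell opportunity: a student who ordered a textbook.
    if profile.get("is_student", False) and "textbook" in ordered_items:
        segments.append("cross_sell_opportunity")

    return segments


# Hypothetical weekly summaries for three customers.
profiles = [
    {"_key": "p1", "weekly_spend_usd": 180, "ordered_items": ["camera"]},
    {"_key": "p2", "browsing_seconds": 420, "ordered_items": [], "logged_out": True},
    {"_key": "p3", "weekly_spend_usd": 35, "is_student": True,
     "ordered_items": ["textbook"]},
]

for profile in profiles:
    print(profile["_key"], classify_profile(profile))
```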

Slide 59

Slide 59 text

Key takeaway

Slide 60

Slide 60 text

ArangoDB: One engine. One query language. Multiple models.
For more information: https://www.arangodb.com/arangodb-training-center/first-day/

Slide 61

Slide 61 text

CDP = Data + Smart Decision + Scalable Delivery

Slide 62

Slide 62 text

https://bit.ly/bdvn-cfm https://bit.ly/bdvn-cdp-intro https://bit.ly/bdvn-uspa-intro ArangoDB is the solution

Slide 63

Slide 63 text

Perfect technology stack for a Customer Data Platform (CDP)
https://github.com/bigdatavietnam-org/USPA.tech

Slide 64

Slide 64 text

Thank you!
https://BigDataVietnam.org
facebook.com/bigdatavn
[email protected]