Slide 1

Slide 1 text

Transactional vs. Analytical Processing Christopher Hertel

Slide 2

Slide 2 text

2

Slide 3

Slide 3 text

Let’s talk about data 3 … and decision making

Slide 4

Slide 4 text

What’s the best database? 4 „It depends …“

Slide 5

Slide 5 text

Depends on what? 5 People & Skills Infrastructure Current System Time & Budget Requirements • What kind of data? • What is the use-case?

Slide 6

Slide 6 text

Today’s Questions • What types of data are there? • How do we interact with data? • Why bother at all? And when? 6

Slide 7

Slide 7 text

Christopher Hertel Software Architect in Berlin @el_stoffel 7 Who is that guy?

Slide 8

Slide 8 text

Example Application 8 SymfonyConBot

Slide 9

Slide 9 text

• Telegram Bot “SymfonyConBot” • Features • Access SymfonyCon’s schedule • Lookup current and next talks • Search for talks and speakers • Confirm attendance of a talk • Rate a talk • Build with Symfony 9 Example Application

Slide 10

Slide 10 text

10 Example Application

Slide 11

Slide 11 text

Let’s have a look at the data 11 What types of data are there?

Slide 12

Slide 12 text

12 Example Application live.symfony.com Telegram App User • Id • First name • Last name Talk • Title • Speaker • Description • Slot • Track Slot • Start date • End date Track • Name Rating • Talk • User • Stars Attendance • Talk • User

Slide 13

Slide 13 text

Qualitative vs. Quantitative 13 What types of data are there?

Slide 14

Slide 14 text

14 Example Application live.symfony.com Telegram App User • Id • First name • Last name Talk • Title • Speaker • Description • Slot • Track Slot • Start date • End date Track • Name Rating • Talk • User • Stars Attendance • Talk • User

Slide 15

Slide 15 text

15 Qualitative Data User • Id • First name • Last name Talk • Title • Speaker • Description • Slot • Track Slot • Start date • End date Track • Name Rating • Talk • User • Stars Attendance • Talk • User Nominal Data • Names/Labels • Not ordered • Potentially static Ordinal Data • Ordered • Sequence • No arithmetics • Not numeric • Not measurable • Categorical data • Content Qualitative

Slide 16

Slide 16 text

16 Quantitative Data User • Id • First name • Last name Talk • Title • Speaker • Description • Slot • Track Slot • Start date • End date Track • Name Rating • Talk • User • Stars Attendance • Talk • User Discrete Data • Only integers • Limited • No subdivision • Quantified • Numeric • Measurable • Process Artifacts Quantitative Continuous Data • Processing • Any numeric value • Unlimited Average Rating of Talk

Slide 17

Slide 17 text

Quantitative • Quantified • Numeric • Measurable • Process Artifacts Discrete Data • Only integers • Limited • No subdivision Continuous Data • Processing • Any numeric value • Unlimited Qualitative • Not numeric • Not measurable • Categorical data • Content Nominal Data • Names/Labels • Not ordered • Potentially static Ordinal Data • Ordered • Sequence • No arithmetics Types of Data 17

Slide 18

Slide 18 text

18 Example Application live.symfony.com Telegram App User • Id • First name • Last name Talk • Title • Speaker • Description • Slot • Track Slot • Start date • End date Track • Name Rating • Talk • User • Stars Attendance • Talk • User

Slide 19

Slide 19 text

19 Example Application Interaction Conference User

Slide 20

Slide 20 text

Facts vs. Dimensions 20 What types of data are there?

Slide 21

Slide 21 text

21 Dimensions Interaction Conference User Conference • Basis of business process • Descriptive data • Horizontal tables • Larger share in schema • Initial structure Dimension Tables

Slide 22

Slide 22 text

22 Facts Interaction Conference User Interaction • Outcome of business process • Numbers & foreign keys • Vertical tables • Less often than dimensions • Comes after dimensions Fact Tables

Slide 23

Slide 23 text

Dimension Tables • Basis of business process • Descriptive data • Horizontal tables • Larger share in schema • Initial structure Slots & Talks Dimensions vs. Facts 23 Fact Tables • Outcome of business process • Numbers & foreign keys • Vertical tables • Less often than dimensions • Comes after dimensions Attendance & Ratings

Slide 24

Slide 24 text

24 Example Application live.symfony.com Telegram App User • Id • First name • Last name Talk • Title • Speaker • Description • Slot • Track Slot • Start date • End date Track • Name Rating • Talk • User • Stars Attendance • Talk • User

Slide 25

Slide 25 text

Data Processing 25 How do we interact with data?

Slide 26

Slide 26 text

OLTP vs. OLAP 26

Slide 27

Slide 27 text

OLTP vs. OLAP 27 Online Transactional Processing Online Analytical Processing

Slide 28

Slide 28 text

OLTP vs. OLAP 28 Online Transactional Processing Online Analytical Processing

Slide 29

Slide 29 text

OLTP vs. OLAP 29 Online Transactional Processing Online Analytical Processing

Slide 30

Slide 30 text

OLTP vs. OLAP 30 Transactional vs. Analytical

Slide 31

Slide 31 text

Transactional Processing[OLTP] • Focus on transaction • Operative systems • Validation and consistency • CRUD on single records • Real time • Well defined 31 Table

Slide 32

Slide 32 text

Analytical Processing[OLAP] • Focus on analysis • Reporting systems • Filtering and reducing • Across multiple records • Periodically • Variable 32 Table

Slide 33

Slide 33 text

33 Example Application Interaction Conference User FAKE

Slide 34

Slide 34 text

34 Schedule Management Interaction Conference User • Read & write • Relational data • Validity & consistency • No metrics nor analysis • Real time

Slide 35

Slide 35 text

35 Schedule Querying Interaction Conference User • Only read • Ordinal data • Sequence • Relational

Slide 36

Slide 36 text

36 Schedule Search Interaction Conference User • Only read • Nominal data • Fuzzy • Delay is okay Index

Slide 37

Slide 37 text

37 Attending & Rating a talk Interaction Conference User • Append only • Facts • Discrete data • Real time

Slide 38

Slide 38 text

38 Conference Analysis Interaction Conference User • Read only • Calculations • Filtering • Delay is okay Views

Slide 39

Slide 39 text

Why bother at all? And when? 39

Slide 40

Slide 40 text

Common pitfalls 40 • Monolithic persistence => Everything in the same tables • Missing isolation => Which classes write in this table? • Performance issues => Mixing engine & user land logic Maintaining, Slicing & Scaling Systems

Slide 41

Slide 41 text

Designing the Schema 41

Slide 42

Slide 42 text

Designing the Schema 42

Slide 43

Slide 43 text

43 Flow of Data OLTP OLAP Database INSERT INTO … SELECT X FROM …

Slide 44

Slide 44 text

44 Flow of Data OLTP OLAP Database INSERT INTO … SELECT X FROM … Database Worker

Slide 45

Slide 45 text

45 Flow of Data Database Interaction Conference OLTP WRITE OLTP WRITE OLAP READ OLAP READ

Slide 46

Slide 46 text

46 Flow of Data Conference OLTP WRITE OLAP READ Index Interaction View OLTP WRITE OLAP READ EventBus

Slide 47

Slide 47 text

47 Flow of Data Source, see “Further Reading”

Slide 48

Slide 48 text

48 Example Decisions OLTP OLAP QUALITATIVE QUANTITATIVE ELASTICSEARCH INFLUXDB MEILISEARCH REDIS CLICKHOUSE SUPERSET MONGO DB POSGRESQL MYSQL SQLITE

Slide 49

Slide 49 text

The Takeaways 49

Slide 50

Slide 50 text

The Takeaways • Consider the data structure & role in business processes early on • Focus on data processing while choosing persistence engine • Don’t always use MySQL, sane start though • Decouple data like you do with software components & services • Don’t avoid redundancy by all means • Consider the flow of data, evaluate on change & don’t act too early 50

Slide 51

Slide 51 text

New Features Requests 51 Requirements Type & Structure Maintaining, Slicing & Scaling a System Flow & Processing

Slide 52

Slide 52 text

Back to the start … 52

Slide 53

Slide 53 text

53

Slide 54

Slide 54 text

Redis CLI Application How is that solved? 54 OLTP WRITE

Slide 55

Slide 55 text

Redis CLI Application How is that solved? 55 Web Application Packagist Database PERIODICAL AGGREGATION OLTP WRITE OLAP READ

Slide 56

Slide 56 text

Thank you! 56

Slide 57

Slide 57 text

Questions? 57

Slide 58

Slide 58 text

Further Reading • SymfonyCon Bot Source code on GitHub github.com/chr-hertel/symfonycon-bot • 6 Types of Data in Statistics & Research: Key in Data Science By Silvia Valcheva www.intellspot.com/data-types/ • SQL vs. NoSQL Database: When to Use, How to Choose By Satish Chandra Gupta www.ml4devs.com/articles/datastore-choices-sql-vs-nosql-database/ • Difference between Fact Table and Dimension Table By MKS075 www.geeksforgeeks.org/difference-between-fact-table-and-dimension-table/ • Emerging Architectures for Modern Data Infrastructure By Bornstein, Li & Casado future.com/emerging-architectures-modern-data-infrastructure/ 58