Slide 1

Slide 1 text

CORPORATE DATA PROJECTS: CHALLENGES IN BUILDING A BIG DATA PLATFORM
GUSTAVO ARJONES | GUSTAVO@ARJONES.NET | @ARJONES | LINKEDIN/ARJONES

Slide 2

Slide 2 text

GUSTAVO ARJONES gustavo@arjones.net | @arjones

Slide 3

Slide 3 text

GUSTAVO ARJONES
> MBA in Marketing; Licentiate in Computer Science

Slide 4

Slide 4 text

GUSTAVO ARJONES
> MBA in Marketing; Licentiate in Computer Science
> CTO at Extendeal

Slide 5

Slide 5 text

GUSTAVO ARJONES
> MBA in Marketing; Licentiate in Computer Science
> CTO at Extendeal
> Former CTO/co-founder of Socialmetrix

Slide 7

Slide 7 text

GUSTAVO ARJONES
> MBA in Marketing; Licentiate in Computer Science
> CTO at Extendeal
> Former CTO/co-founder of Socialmetrix
> Data Geek

Slide 8

Slide 8 text

GUSTAVO ARJONES
> MBA in Marketing; Licentiate in Computer Science
> CTO at Extendeal
> Former CTO/co-founder of Socialmetrix
> Data Geek
> Building/implementing Big Data architectures

Slide 9

Slide 9 text

GUSTAVO ARJONES
> MBA in Marketing; Licentiate in Computer Science
> CTO at Extendeal
> Former CTO/co-founder of Socialmetrix
> Data Geek
> Building/implementing Big Data architectures
> Machine Learning, DevOps, etc.

Slide 10

Slide 10 text

IN THE BEGINNING, THERE WAS OLTP

Slide 11

Slide 11 text

DATABASE REPLICA

Slide 12

Slide 12 text

ANALYTICS? OLAP + ETL

Slide 13

Slide 13 text

COMPLEX ETL + UNSTRUCTURED DATA

Slide 14

Slide 14 text

HADOOP

Slide 15

Slide 15 text

SLOW AND HARD TO OPERATE, BUT IT "WORKS"

Slide 16

Slide 16 text

IT GETS MORE COMPLEX

Slide 17

Slide 17 text

ON TOP OF THAT, WE ARE STILL IN BATCH AND WE NEED REAL-TIME SOLUTIONS

Slide 18

Slide 18 text

LAMBDA ARCHITECTURE
BATCH (HADOOP)
STREAM (STORM)
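
To make the batch/stream split concrete, here is a minimal sketch of the Lambda idea in plain Python, not in Hadoop or Storm. It assumes a hypothetical page-view counting use case; the function names and the event shape are illustrative only. The batch layer recomputes a view from the full dataset, the speed layer covers events that arrived after the last batch run, and queries merge both.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute counts from the full, immutable dataset."""
    return Counter(event["page"] for event in master_dataset)


def speed_view(recent_events):
    """Speed layer: count only events not yet absorbed by the last batch run."""
    return Counter(event["page"] for event in recent_events)


def query(page, batch, speed):
    """Serving layer: merge the batch and real-time views at query time."""
    return batch[page] + speed[page]


# Hypothetical data: three events already in the master dataset,
# one event that arrived after the last batch run.
master_dataset = [{"page": "/home"}, {"page": "/home"}, {"page": "/pricing"}]
recent_events = [{"page": "/home"}]

batch = batch_view(master_dataset)
speed = speed_view(recent_events)
print(query("/home", batch, speed))
```

The cost of this design is visible even in the sketch: the same counting logic lives in two places and has to be kept in sync.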

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

TURNING THE DATABASE INSIDE-OUT

Slide 21

Slide 21 text

TURNING THE DATABASE INSIDE-OUT
> Turning the database inside-out with Apache Samza

Slide 22

Slide 22 text

TURNING THE DATABASE INSIDE-OUT
> Turning the database inside-out with Apache Samza
> Apache Kafka, Samza, and the Unix Philosophy of Distributed Data

Slide 23

Slide 23 text

TURNING THE DATABASE INSIDE-OUT
> Turning the database inside-out with Apache Samza
> Apache Kafka, Samza, and the Unix Philosophy of Distributed Data
> The Log: What every software engineer should know about real-time data's unifying abstraction

Slide 24

Slide 24 text

KAPPA ARCHITECTURE http://www.kappa-architecture.com
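
As a rough illustration of the Kappa idea (a single log, with views derived by stream jobs and reprocessing done by replaying the log with new code), here is a small sketch. A Python list stands in for the log, and `run_job`, `clicks_per_user_v1` and `clicks_per_user_v2` are hypothetical names, not part of Kafka or any other framework.

```python
def run_job(log, job, from_offset=0):
    """Build a materialized view by folding a job over the log from an offset."""
    view = {}
    for event in log[from_offset:]:
        job(view, event)
    return view


def clicks_per_user_v1(view, event):
    """First version of the job: count every event per user."""
    view[event["user"]] = view.get(event["user"], 0) + 1


def clicks_per_user_v2(view, event):
    """New version of the job (assumed rule change): ignore bot traffic."""
    if not event.get("bot", False):
        view[event["user"]] = view.get(event["user"], 0) + 1


# Hypothetical append-only log of events.
log = [
    {"user": "ana"},
    {"user": "bot-7", "bot": True},
    {"user": "ana"},
]

view_v1 = run_job(log, clicks_per_user_v1)
view_v2 = run_job(log, clicks_per_user_v2)  # reprocess: same log, new code
print(view_v1)
print(view_v2)
```

There is no separate batch code path here: changing the logic means running the new job over the same log from offset 0 and switching readers to the new view.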

Slide 25

Slide 25 text

DESIGNING DATA-INTENSIVE APPLICATIONS
https://www.safaribooksonline.com/library/view/designing-data-intensive-applications/9781491903063/

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

EVENT SOURCING https://martinfowler.com/eaaDev/EventSourcing.html
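
A minimal sketch of event sourcing, assuming a hypothetical bank-account example: state is never stored directly, only the ordered events are, and the current (or any past) state is obtained by replaying them. The `Event` type and the event names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event names)
    amount: int


def apply(balance, event):
    """Pure function: previous state + one event -> next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, upto=None):
    """Rebuild the balance from the first `upto` events (all if None)."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance


log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]
print(replay(log))          # current state
print(replay(log, upto=2))  # state as of the second event
```

Because the log is the record of what happened, the same events can later feed other consumers, such as analytics or model training, without changing the writer.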

Slide 28

Slide 28 text

TRANSACTIONS? ROLLBACK / COMMIT

Slide 29

Slide 29 text

WRITE-AHEAD LOG
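
The write-ahead log idea can be sketched in a few lines: every change is appended to a durable log before it is applied in memory, so state can be rebuilt after a crash by replaying the log. `TinyStore` is a hypothetical toy, not how any particular database implements its WAL; it ignores checkpoints, torn writes and transactions.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}
        self._recover()

    def _recover(self):
        """Rebuild in-memory state by replaying the log from the start."""
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, encoding="utf-8") as wal:
            for line in wal:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1) Append the intent to the log and force it to disk.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # 2) Only then apply the change to the in-memory state.
        self.data[key] = value


wal_path = os.path.join(tempfile.mkdtemp(), "store.wal")
store = TinyStore(wal_path)
store.put("user:1", "ana")
store.put("user:1", "ana maria")

recovered = TinyStore(wal_path)  # simulates a restart after a crash
print(recovered.data)
```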

Slide 30

Slide 30 text

SOURCE OF TRUTH + ML TRAINING

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

FOSS FRAMEWORKS & TOOLING

Slide 33

Slide 33 text

FOSS FRAMEWORKS & TOOLING
> Apache Beam

Slide 34

Slide 34 text

FOSS FRAMEWORKS & TOOLING
> Apache Beam
> Apache Spark | Apache Flink

Slide 35

Slide 35 text

FOSS FRAMEWORKS & TOOLING
> Apache Beam
> Apache Spark | Apache Flink
> Apache Kafka | Apache Pulsar

Slide 36

Slide 36 text

FOSS FRAMEWORKS & TOOLING
> Apache Beam
> Apache Spark | Apache Flink
> Apache Kafka | Apache Pulsar
> Streamsets | Apache NiFi

Slide 37

Slide 37 text

FOSS FRAMEWORKS & TOOLING
> Apache Beam
> Apache Spark | Apache Flink
> Apache Kafka | Apache Pulsar
> Streamsets | Apache NiFi
> Debezium

Slide 38

Slide 38 text

FOSS FRAMEWORKS & TOOLING
> Apache Beam
> Apache Spark | Apache Flink
> Apache Kafka | Apache Pulsar
> Streamsets | Apache NiFi
> Debezium
> Presto

Slide 39

Slide 39 text

FOSS FRAMEWORKS & TOOLING
> Apache Beam
> Apache Spark | Apache Flink
> Apache Kafka | Apache Pulsar
> Streamsets | Apache NiFi
> Debezium
> Presto
> Apache Airflow

Slide 40

Slide 40 text

ON-PREM VS. CLOUD
APACHE KAFKA, PULSAR VS. AWS KINESIS, AZURE EVENT HUBS, ETC.

Slide 41

Slide 41 text

STREAMING SYSTEMS: THE WHAT, WHERE, WHEN, AND HOW OF LARGE-SCALE DATA PROCESSING
https://www.safaribooksonline.com/library/view/streaming-systems/9781491983867/

Slide 42

Slide 42 text

DO YOU REALLY NEED BIG DATA?
POSTGRES + POSTGIS + TIMESCALE EXTENSION = ❤
Reading suggestions: Reasons to Fall in Love for Postgres | Postgres: Not Your Grandfather’s RDBMS

Slide 43

Slide 43 text

POSTGRES + SOURCE OF TRUTH + ML TRAINING

Slide 44

Slide 44 text

HAPPINESS

Slide 45

Slide 45 text

THANK YOU
GUSTAVO ARJONES | GUSTAVO@ARJONES.NET | @ARJONES | LINKEDIN/ARJONES