Slide 1

Slide 1 text

Messy Data != Messy Code An API built in Symfony, for one of the biggest retailers of Switzerland PHPUK 2020 @michellesanver

Slide 2

Slide 2 text

@michellesanver WIIIIE \o/ “Learn the most by sharing your knowledge with others” - @coderabbi

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

@michellesanver Michelle Sanver Colour & Code addict

Slide 5

Slide 5 text

@michellesanver Accent!?!?!?

Slide 6

Slide 6 text

@michellesanver

Slide 7

Slide 7 text

@michellesanver “We build a product that improves the way that Swiss people do shopping” - Michelle Sanver

Slide 8

Slide 8 text

@michellesanver Disclaimer: I do a lot of “ranting” in this talk. Retailer data is a complex business. We have nothing against our data providers.

Slide 9

Slide 9 text

@michellesanver Talk for everyone Some concepts may be confusing

Slide 10

Slide 10 text

@michellesanver Agenda
– The project: A retailer of Switzerland
– Challenges
– Big API: Solving the serializer bottleneck
– Importing: When your 3rd party data provider “lies” to you
– Mapping: Contain the mess!
– Evolving with Symfony in a long term project

Slide 11

Slide 11 text

@michellesanver The Project

Slide 12

Slide 12 text

@michellesanver The Project It started as a small API

Slide 13

Slide 13 text

@michellesanver Huge Technology Stack

Slide 14

Slide 14 text

@michellesanver API Platform I know you’ll ask: no, we don’t use it

Slide 15

Slide 15 text

DEV: Rae Knowler DEV: Tobias Schultze DEV: Christian Riesen DEV: Thereza Scherrer DEV: Martin Janser DEV: Emanuele Panzeri DEV: Michelle Sanver “Cloud Tamer”: Chregu PO: Timur Erdag PO: Colin Frei SM: Léo Davesne DEV: David Buchmann Team: 8 developers, 2 POs, 1 SM, and… Chregu @michellesanver

Slide 16

Slide 16 text

@michellesanver REST Controllers ElasticSearch MySQL The Data Provider Serializing Importing Mapping Frontend

Slide 17

Slide 17 text

@michellesanver REST Controllers ElasticSearch MySQL The Data Provider Serializing Importing Mapping Frontend

Slide 18

Slide 18 text

@michellesanver Challenges

Slide 19

Slide 19 text

@michellesanver Our API is HUGE Code & Data

Slide 20

Slide 20 text

@michellesanver /src 8.7 MB: 2’067 items

Slide 21

Slide 21 text

@michellesanver /tests 10.3 MB: 1’089 items

Slide 22

Slide 22 text

@michellesanver /config 1.8 MB: 329 items

Slide 23

Slide 23 text

@michellesanver /vendor 163.6 MB: 21’763 items

Slide 24

Slide 24 text

@michellesanver /src /Api /Client /Infrastructure /Migration / … A few loose things that make sense, like Serializer Structure & Naming

Slide 25

Slide 25 text

@michellesanver Importing a lot of data From several sources

Slide 26

Slide 26 text

@michellesanver REST Controllers ElasticSearch MySQL The Data Provider Serializing Importing Mapping Frontend

Slide 27

Slide 27 text

@michellesanver REST Controllers ElasticSearch MySQL The Data Provider Serializing Importing Mapping Frontend

Slide 28

Slide 28 text

@michellesanver REST Controllers ElasticSearch MySQL The Data ProviderS Serializing Importing A LOT Mapping Frontend

Slide 29

Slide 29 text

@michellesanver Importing Data With import commands

Slide 30

Slide 30 text

@michellesanver Storing Original Data Importing into MySQL

Slide 31

Slide 31 text

@michellesanver Importing a lot of data

Slide 32

Slide 32 text

@michellesanver Queues & Workers With Autoscaling

Slide 33

Slide 33 text

@michellesanver Switching to Symfony Messenger

Slide 34

Slide 34 text

@michellesanver

Slide 35

Slide 35 text

@michellesanver

Slide 36

Slide 36 text

@michellesanver Switching to messenger simplified our code A LOT

Slide 37

Slide 37 text

@michellesanver It forced us to use more value objects
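A minimal sketch of what a value object used as a queue message could look like. It is written in Python purely as a language-neutral illustration (the project itself is PHP/Symfony), and `ImportProduct` with its two fields is a hypothetical name, not something taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportProduct:
    """Hypothetical message: immutable, and validated when it is created."""

    product_id: str
    source: str

    def __post_init__(self) -> None:
        # Reject invalid state up front so handlers never see a half-filled message.
        if not self.product_id:
            raise ValueError("product_id must not be empty")
        if not self.source:
            raise ValueError("source must not be empty")


message = ImportProduct(product_id="12345", source="provider-a")
```

Because the object is frozen and compared by value, it can be put on a queue and read back by a worker without anyone mutating it along the way.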

Slide 38

Slide 38 text

@michellesanver From a crazy amount of commands, making bin/console difficult to overview without grep, to… this: bin/console messenger:consume
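The idea of one generic worker replacing many dedicated commands can be sketched roughly as follows. This is a Python illustration of the pattern only, not Symfony Messenger's API; the message classes and the `HANDLERS` table are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportProducts:  # hypothetical message
    source: str


@dataclass(frozen=True)
class ReindexProduct:  # hypothetical message
    product_id: int


log: list[str] = []

# One handler per message type takes the place of one console command per job.
HANDLERS = {
    ImportProducts: lambda m: log.append(f"import from {m.source}"),
    ReindexProduct: lambda m: log.append(f"reindex {m.product_id}"),
}


def consume(queue: deque) -> None:
    """Single generic worker loop: pop a message, dispatch on its type."""
    while queue:
        message = queue.popleft()
        HANDLERS[type(message)](message)


consume(deque([ImportProducts("provider-a"), ReindexProduct(7)]))
```

Adding a new kind of job then means adding a message class and a handler, not another command.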

Slide 39

Slide 39 text

@michellesanver

Slide 40

Slide 40 text

@michellesanver Switching to Messenger was well worth the time

Slide 41

Slide 41 text

@michellesanver And I love that we could give back to the Symfony community with it

Slide 42

Slide 42 text

@michellesanver I can exit the worker with ctrl+c now

Slide 43

Slide 43 text

@michellesanver Consuming “bad” APIs Without crying or becoming an alcoholic

Slide 44

Slide 44 text

@michellesanver Soapish

Slide 45

Slide 45 text

@michellesanver Restful Soap

Slide 46

Slide 46 text

@michellesanver The “Flexible” API

Slide 47

Slide 47 text

@michellesanver Humor Write songs, laugh

Slide 48

Slide 48 text

@michellesanver Thank you By the way, we’re hiring ;) Michelle Sanver [email protected]

Slide 49

Slide 49 text

@michellesanver Pairing is caring When we suffer, we can suffer together

Slide 50

Slide 50 text

@michellesanver We’re super heroes Our consumers are shielded from the “pain” we have

Slide 51

Slide 51 text

@michellesanver Makes me feel good about our API

Slide 52

Slide 52 text

@michellesanver 3rd party data providers Lies. All lies!!

Slide 53

Slide 53 text

@michellesanver Pairing is caring When we suffer, we can suffer together

Slide 54

Slide 54 text

@michellesanver JSON Schema We can scream at them now ;)

Slide 55

Slide 55 text

@michellesanver JSON Schema

Slide 56

Slide 56 text

JSON Schema We can scream at them now ;)

Slide 57

Slide 57 text

@michellesanver

Slide 58

Slide 58 text

JSON Schema We can scream at them now ;)
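A rough sketch of why a schema helps: every violation becomes a concrete message that can be sent back to the provider. The checker below is hand-rolled and covers only `required` and `type`, so it is not a full JSON Schema validator; the schema and field names are assumptions, and Python is used only for illustration (the project is PHP).

```python
# Hypothetical schema for one provider record; only "required" and "type" are checked.
PRODUCT_SCHEMA = {
    "required": ["id", "name", "price"],
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
}

# Python types accepted for each JSON Schema type name used above.
JSON_TYPES = {"integer": (int,), "number": (int, float), "string": (str,)}


def validate(record: dict, schema: dict) -> list[str]:
    """Return human-readable violations; an empty list means the record passed."""
    errors = [
        f"missing required field '{key}'"
        for key in schema["required"]
        if key not in record
    ]
    for key, rule in schema["properties"].items():
        if key not in record:
            continue
        value = record[key]
        # bool is a subclass of int in Python, but a JSON boolean is not a number.
        if isinstance(value, bool) or not isinstance(value, JSON_TYPES[rule["type"]]):
            errors.append(
                f"field '{key}' should be {rule['type']}, got {type(value).__name__}"
            )
    return errors


violations = validate({"id": "42", "name": "Chocolate"}, PRODUCT_SCHEMA)
```

In a real import, a maintained JSON Schema library would do this job; the point here is only the shape of the feedback.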

Slide 59

Slide 59 text

@michellesanver Defensive Programming There’s no such thing as “This won’t happen”

Slide 60

Slide 60 text

@michellesanver Extremely Defensive PHP - Marco Pivetta

Slide 61

Slide 61 text

@michellesanver Switching to several data sources

Slide 62

Slide 62 text

@michellesanver “Decider Service” Ooh, that ID? You’re from API X
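One way such a decider could work, sketched in Python for illustration only (the project is PHP). The slide does not say how an ID reveals its origin, so the prefix convention and the names `SOURCES_BY_PREFIX` and `decide_source` are pure assumptions.

```python
# Assumed convention: each upstream API is recognisable by an ID prefix.
SOURCES_BY_PREFIX = {"A-": "api_a", "B-": "api_b"}


def decide_source(product_id: str) -> str:
    """Return the name of the API that owns this ID."""
    for prefix, source in SOURCES_BY_PREFIX.items():
        if product_id.startswith(prefix):
            return source
    # Defensive: an unknown ID is an error, not a silent default.
    raise LookupError(f"no data source known for ID {product_id!r}")


source = decide_source("B-1001")
```

Keeping this decision in one place means the rest of the import code never has to guess where a record came from.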

Slide 63

Slide 63 text

@michellesanver Importing is “easy” But mapping…

Slide 64

Slide 64 text

@michellesanver Data Quality…

Slide 65

Slide 65 text

@michellesanver • Missing spaces • String instead of int • Array instead of object • Object instead of string • Differently named fields • Required data missing • … And more

Slide 66

Slide 66 text

@michellesanver • Missing spaces • String instead of int • Array instead of object • Object instead of string • Differently named fields • Required data missing • … And more
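Several of the problems listed above can be handled in one defensive clean-up step. The sketch below is in Python for illustration (the project is PHP), and every field name in it (`name`, `productName`, `quantity`, `brand`, `value`) is hypothetical.

```python
def clean_record(raw: dict) -> dict:
    """Normalise one raw provider record (all field names are hypothetical)."""
    # Differently named fields: accept either spelling.
    name = raw.get("name", raw.get("productName"))
    if not isinstance(name, str) or not name.strip():
        # Required data missing: fail loudly instead of storing a broken product.
        raise ValueError("record has no usable name")

    # String instead of int: "3" and 3 both become 3.
    quantity = int(raw.get("quantity", 0))

    # Array instead of object: some payloads wrap a single brand in a list.
    brand = raw.get("brand")
    if isinstance(brand, list):
        brand = brand[0] if brand else None

    # Object instead of string: {"value": "..."} where a plain string was expected.
    if isinstance(brand, dict):
        brand = brand.get("value")

    return {"name": name.strip(), "quantity": quantity, "brand": brand}


cleaned = clean_record(
    {"productName": " Chocolate ", "quantity": "3", "brand": [{"value": "Acme"}]}
)
```

Each branch corresponds to one bullet, so a new kind of mess from a provider becomes one more explicit, testable case.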

Slide 67

Slide 67 text

@michellesanver Mappers From MySQL to ElasticSearch

Slide 68

Slide 68 text

@michellesanver ProductMapper Product on ElasticSearch /products.json

Slide 69

Slide 69 text

@michellesanver ProductMapper Name Category Brand Price Description

Slide 70

Slide 70 text

@michellesanver ProductMapper N B C P D

Slide 71

Slide 71 text

@michellesanver ProductMapper

Slide 72

Slide 72 text

@michellesanver ProductMapper

Slide 73

Slide 73 text

@michellesanver ProductMapper

Slide 74

Slide 74 text

@michellesanver

Slide 75

Slide 75 text

@michellesanver MapperInterface

Slide 76

Slide 76 text

@michellesanver ProductFactory Data From MySQL Clean Data To store in ES

Slide 77

Slide 77 text

@michellesanver Config of all the mappers, in order. 7 depends on 3 4 depends on 1 25 depends on basically everything 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 ProductFactory
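A minimal sketch of the pattern, assuming each mapper handles one concern and the factory runs them in a configured order. It is written in Python for illustration (the project is PHP); `Mapper`, `NameMapper` and `SlugMapper` are hypothetical stand-ins, not the project's actual interface or mappers.

```python
from typing import Protocol


class Mapper(Protocol):
    def map(self, raw: dict, product: dict) -> None:
        """Copy one concern from the raw row into the clean product."""


class NameMapper:
    def map(self, raw: dict, product: dict) -> None:
        product["name"] = raw["name"].strip()


class SlugMapper:
    # Depends on NameMapper having run first.
    def map(self, raw: dict, product: dict) -> None:
        product["slug"] = product["name"].lower().replace(" ", "-")


class ProductFactory:
    def __init__(self, mappers: list[Mapper]) -> None:
        self._mappers = mappers  # the order matters

    def create(self, raw: dict) -> dict:
        product: dict = {}
        for mapper in self._mappers:
            mapper.map(raw, product)
        return product


factory = ProductFactory([NameMapper(), SlugMapper()])
product = factory.create({"name": " Swiss Chocolate "})
```

Swapping the two mappers would raise a `KeyError`, which is exactly the kind of ordering problem a hand-maintained config invites.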

Slide 78

Slide 78 text

@michellesanver

Slide 79

Slide 79 text

@michellesanver Data From MySQL Clean Data To store in ES 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 ProductFactory * 3

Slide 80

Slide 80 text

@michellesanver Mapper Dependencies With a compiler pass!
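The ordering step itself is a topological sort. In Symfony a compiler pass would do this while the container is built; the sketch below shows only the sorting, in Python for illustration. The dependency table reuses the numbers from the earlier slide ("7 depends on 3", "4 depends on 1"), and its layout is an assumption.

```python
from graphlib import TopologicalSorter

# Each mapper declares which mappers it needs to run before it.
DEPENDS_ON = {1: set(), 3: set(), 4: {1}, 7: {3}}


def mapper_order(depends_on: dict) -> list:
    """Return an order in which every mapper runs after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = mapper_order(DEPENDS_ON)
```

A circular dependency makes `static_order()` raise `graphlib.CycleError`, so a broken configuration fails at build time instead of producing a half-mapped product.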

Slide 81

Slide 81 text

@michellesanver

Slide 82

Slide 82 text

@michellesanver

Slide 83

Slide 83 text

@michellesanver

Slide 84

Slide 84 text

@michellesanver

Slide 85

Slide 85 text

@michellesanver Dealing with languages Only when you have to!

Slide 86

Slide 86 text

@michellesanver

Slide 87

Slide 87 text

@michellesanver

Slide 88

Slide 88 text

@michellesanver Ensuring Quality Tests, Logging, Monitoring

Slide 89

Slide 89 text

@michellesanver Log a lot “Debugging you” will love you

Slide 90

Slide 90 text

@michellesanver Monitoring Queues, uptime, etc.

Slide 91

Slide 91 text

@michellesanver Acceptance Tests Test for critical data, often

Slide 92

Slide 92 text

@michellesanver Project Challenges Big API responses

Slide 93

Slide 93 text

@michellesanver

Slide 94

Slide 94 text

@michellesanver Serializing, Versioning & Groups

Slide 95

Slide 95 text

@michellesanver Our needs Versioning & Groups

Slide 96

Slide 96 text

@michellesanver We tried it all And it “sucks”.

Slide 97

Slide 97 text

@michellesanver Plain JSON decode json_decode, json_encode

Slide 98

Slide 98 text

@michellesanver Symfony Serializer It’s cool and all

Slide 99

Slide 99 text

@michellesanver Better Serializer Maybe it’d be better, if we could make it work

Slide 100

Slide 100 text

@michellesanver JMS What 99% of PHP developers need for serialization

Slide 101

Slide 101 text

@michellesanver JMS Annotations

Slide 102

Slide 102 text

@michellesanver JMS Version Support @Until, @Since
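Roughly, version support means each field carries the range of API versions in which it is exposed. The sketch below shows that idea in Python; it is not JMS code, it uses integer versions for simplicity, and the field names and the `FIELD_VERSIONS` table are hypothetical.

```python
# Hypothetical field metadata: the API version range in which each field is exposed.
FIELD_VERSIONS = {
    "name": {},                    # always exposed
    "legacy_price": {"until": 2},  # last exposed in version 2
    "price": {"since": 3},         # introduced in version 3
}


def serialize(data: dict, version: int) -> dict:
    """Keep only the fields that are exposed in the requested API version."""
    result = {}
    for field, rule in FIELD_VERSIONS.items():
        if version < rule.get("since", version):
            continue  # field does not exist yet in this version
        if version > rule.get("until", version):
            continue  # field was already removed in this version
        if field in data:
            result[field] = data[field]
    return result


row = {"name": "Chocolate", "legacy_price": "1.95", "price": 1.95}
v2 = serialize(row, version=2)
v3 = serialize(row, version=3)
```

The same model therefore serves old and new clients, at the cost of checking every field on every call.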

Slide 103

Slide 103 text

@michellesanver JMS Virtual Properties

Slide 104

Slide 104 text

@michellesanver JMS Works like magic with most frameworks, including Symfony which we use

Slide 105

Slide 105 text

@michellesanver JMS

Slide 106

Slide 106 text

@michellesanver JMS Read The Docs ;) https://jmsyst.com/libs/serializer

Slide 107

Slide 107 text

@michellesanver We called “visitProperty” over 60 000 times!!!

Slide 108

Slide 108 text

@michellesanver The Liip/Serializer It’s more of a generator, really.

Slide 109

Slide 109 text

@michellesanver Model Parser SerializerGenerator DeserializerGenerator
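The generator idea can be sketched as: read the model's field list once, emit plain code for it, and run only that code afterwards. This Python sketch illustrates the technique and is not the Liip/Serializer implementation; `generate_serializer` and the `Product` class are hypothetical.

```python
def generate_serializer(class_name: str, fields: list[str]) -> str:
    """Emit source for a function that turns one object into a plain dict."""
    lines = [f"def serialize_{class_name.lower()}(obj):", "    return {"]
    # One literal line per field: no reflection or metadata lookups at runtime.
    lines += [f"        {field!r}: obj.{field}," for field in fields]
    lines.append("    }")
    return "\n".join(lines) + "\n"


class Product:
    def __init__(self, name: str, price: float) -> None:
        self.name = name
        self.price = price


source = generate_serializer("Product", ["name", "price"])
namespace: dict = {}
exec(source, namespace)  # a real generator would write this to a file once, ahead of time
serialized = namespace["serialize_product"](Product("Chocolate", 1.95))
```

The generated function is straight-line code, which is where the saving over a visitor that inspects every property at runtime would come from.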

Slide 110

Slide 110 text

@michellesanver An overall performance gain of 55% over JMS for our use case: 390 ms => 175 ms. CPU and I/O wait both down by ~50%. Memory gain: 21%, 6.5 MB => 5.15 MB.
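For readers who want to check the percentages, they follow from the before and after figures as a relative reduction. A small Python check, where `reduction` is just a helper name chosen here:

```python
def reduction(before: float, after: float) -> float:
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100


time_gain = reduction(390, 175)     # response time in ms
memory_gain = reduction(6.5, 5.15)  # memory in MB
```

Rounded to whole percent, these match the 55% and 21% quoted on the slide.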

Slide 111

Slide 111 text

@michellesanver Curious about our Serializer? https://www.liip.ch/en/blog/fast-serialization-with-liip-serializer
 https://github.com/liip/serializer

Slide 112

Slide 112 text

@michellesanver Project Challenges Communication is hard

Slide 113

Slide 113 text

@michellesanver Feedback Because we have to work together

Slide 114

Slide 114 text

@michellesanver Retrospective Let’s look back and improve

Slide 115

Slide 115 text

@michellesanver Code Reviews Respectfully improving code together

Slide 116

Slide 116 text

@michellesanver Team Events Keeps morale high

Slide 117

Slide 117 text

@michellesanver An amazing customer They listen to us, it’s important

Slide 118

Slide 118 text

@michellesanver Evolving With Symfony In a long term project

Slide 119

Slide 119 text

@michellesanver Prioritise upgrades Upgrade ASAP!

Slide 120

Slide 120 text

@michellesanver Prioritise upgrades Upgrade minor versions

Slide 121

Slide 121 text

@michellesanver Prioritise upgrades Fix deprecation warnings

Slide 122

Slide 122 text

@michellesanver Refactor Often It’s not optional

Slide 123

Slide 123 text

@michellesanver Use new components

Slide 124

Slide 124 text

@michellesanver Contribute to Open Source Feels good to give something back And… Have some control over our tools

Slide 125

Slide 125 text

@michellesanver Tests!! Lots of tests (Do I still need to emphasise this?)

Slide 126

Slide 126 text

@michellesanver An amazing customer

Slide 127

Slide 127 text

@michellesanver Final Words

Slide 128

Slide 128 text

@michellesanver Write dev docs Don’t repeat our mistake

Slide 129

Slide 129 text

@michellesanver Defensive programming Refactoring often Working as a Team

Slide 130

Slide 130 text

@michellesanver Messy Data !== Messy Code

Slide 131

Slide 131 text

@michellesanver Thank you By the way, we’re hiring ;) Michelle Sanver [email protected]