Messy data != Messy code

The role of our API is to consume a lot of data that was not meant for a digital age and to transform it into beautiful output, for one of the biggest retailers in Switzerland. This is a journey of consuming a lot of data and APIs from different sources and in different formats. Some of them made us laugh, some gave us migraines. We built a smooth architecture to consume and output data. I am proud of this architecture, which we seamlessly upgraded along the way to stay on the latest versions, now Symfony 4. I want to share with you how we managed to keep this API up to date for over 5 years and the architecture that we use to make it happen.

Michelle Sanver

February 20, 2020

Transcript

  1. Messy Data != Messy Code
    An API built in Symfony, for one of the biggest retailers of Switzerland
    PHPUK 2020
    @michellesanver

  2. WIIIIE \o/
    “Learn the most by sharing
    Your knowledge with others”
    - @coderabbi

  4. Michelle Sanver
    Colour & Code addict

  5. Accent!?!?!?

  7. “We build a product that
    improves the way that Swiss
    people do shopping”
    - Michelle Sanver

  8. Disclaimer:
    I do a lot of “ranting” in this talk.
    Retailer data is a complex business.
    We have nothing against our data providers.

  9. Talk for everyone
    Some concepts may be confusing

  10. Agenda
    – The project: A retailer of Switzerland
    – Challenges
    – Big API: Solving the serializer bottleneck
    – Importing: When your 3rd party data provider “lies” to you
    – Mapping: Contain the mess!
    – Evolving with Symfony in a long term project

  11. The Project

  12. The Project
    It started as a small API

  13. Huge Technology Stack

  14. API Platform
    I know you’ll ask: no, we don’t use it

  15. Team: 8 developers, 2 POs, 1 SM, and… Chregu
    DEV: Rae Knowler, Tobias Schultze, Christian Riesen, Thereza Scherrer, Martin Janser, Emanuele Panzeri, Michelle Sanver, David Buchmann
    PO: Timur Erdag, Colin Frei
    SM: Léo Davesne
    “Cloud Tamer”: Chregu

  16. REST Controllers
    ElasticSearch
    MySQL
    The Data Provider
    Serializing
    Importing
    Mapping
    Frontend

  18. Challenges

  19. Our API is HUGE
    Code & Data

  20. /src
    8.7 MB: 2’067 items

  21. /tests
    10.3 MB: 1’089 items

  22. /config
    1.8 MB: 329 items

  23. /vendor
    163.6 MB: 21’763 items

  24. Structure & Naming
    /src
    /Api
    /Client
    /Infrastructure
    /Migration
    / … A few loose things that make sense, like Serializer

  25. Importing a lot of data
    From several sources

  28. REST Controllers
    ElasticSearch
    MySQL
    The Data ProviderS
    Serializing
    Importing A LOT
    Mapping
    Frontend

  29. Importing Data
    With import commands

  30. Storing Original Data
    Importing into MySQL

  31. Importing a lot of data

  32. Queues & Workers
    With Autoscaling

  33. Switching to Symfony Messenger

  36. Switching to Messenger simplified our code A LOT

  37. It forced us to use more value objects

  38. From a crazy number of commands, making bin/console
    difficult to overview without grep, to… this:
    bin/console messenger:consume

  40. Switching to Messenger was well worth the time

  41. And I love that we could give back to the Symfony community with it

  42. I can exit the worker with ctrl+c now
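
The point about value objects can be illustrated with a small sketch. It is plain PHP and deliberately does not use the Symfony Messenger API; `ImportProductMessage`, `ImportProductHandler` and `InMemoryBus` are hypothetical names, not classes from the project. The idea it shows: each unit of work becomes a small, self-validating message object, and one handler per message type replaces a dedicated console command.

```php
<?php
// Sketch only: plain PHP, not the Symfony Messenger API.
// All class names below are hypothetical.

final class ImportProductMessage
{
    public function __construct(
        public readonly string $source,
        public readonly string $productId,
    ) {
        // A value object validates itself, so a handler never sees a half-built message.
        if ($source === '' || $productId === '') {
            throw new InvalidArgumentException('source and productId must not be empty');
        }
    }
}

final class ImportProductHandler
{
    /** @var list<string> */
    public array $imported = [];

    public function __invoke(ImportProductMessage $message): void
    {
        // A real handler would fetch and store the product; this one only records it.
        $this->imported[] = $message->source . ':' . $message->productId;
    }
}

final class InMemoryBus
{
    /** @var array<string, callable> message class => handler */
    private array $handlers = [];

    public function register(string $messageClass, callable $handler): void
    {
        $this->handlers[$messageClass] = $handler;
    }

    public function dispatch(object $message): void
    {
        $handler = $this->handlers[$message::class]
            ?? throw new RuntimeException('No handler for ' . $message::class);
        $handler($message);
    }
}

$handler = new ImportProductHandler();
$bus = new InMemoryBus();
$bus->register(ImportProductMessage::class, $handler);
$bus->dispatch(new ImportProductMessage('provider-a', '12345'));
```

In a real setup the bus would hand the message to a queue and a worker would pick it up; the synchronous bus here only keeps the sketch self-contained.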

  43. Consuming “bad” APIs
    Without crying or becoming an alcoholic

  44. Soapish

  45. Restful Soap

  46. The “Flexible” API

  47. Humor
    Write songs, laugh

  49. Pairing is caring
    When we suffer, we can suffer together

  50. We’re superheroes
    Our consumers are shielded
    from the “pain” we have

  51. Makes me feel good
    about our API

  52. 3rd party data providers
    Lies. All lies!!

  53. Pairing is caring
    When we suffer, we can suffer together

  54. JSON Schema
    We can scream at them now ;)
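
A schema turns "your data is wrong" into a concrete list of violations that can be sent back to the provider. The sketch below is a hand-rolled check that understands only the `required` and `type` keywords; it is not a JSON Schema implementation, and a real project would use a full validator library. The schema and the payload are invented for illustration.

```php
<?php
// Sketch only: checks just the "required" and "type" keywords.
// Not a full JSON Schema validator.

/**
 * @return list<string> human-readable violations, empty when the payload passes
 */
function findViolations(array $schema, array $payload): array
{
    $violations = [];

    foreach ($schema['required'] ?? [] as $field) {
        if (!array_key_exists($field, $payload)) {
            $violations[] = "missing required field \"$field\"";
        }
    }

    foreach ($schema['properties'] ?? [] as $field => $rules) {
        if (!array_key_exists($field, $payload) || !isset($rules['type'])) {
            continue;
        }
        $value = $payload[$field];
        // Note: after json_decode(..., true) an empty object and an empty
        // array are both [], so this sketch cannot tell them apart.
        $matches = match ($rules['type']) {
            'string' => is_string($value),
            'integer' => is_int($value),
            'number' => is_int($value) || is_float($value),
            'boolean' => is_bool($value),
            'array' => is_array($value) && array_is_list($value),
            'object' => is_array($value) && !array_is_list($value),
            default => true, // other types are not checked in this sketch
        };
        if (!$matches) {
            $violations[] = "field \"$field\" is not of type {$rules['type']}";
        }
    }

    return $violations;
}

// Hypothetical schema and provider payload.
$schema = json_decode('{
    "type": "object",
    "required": ["id", "name", "price"],
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "price": {"type": "number"}
    }
}', true, 512, JSON_THROW_ON_ERROR);

$payload = json_decode('{"id": "42", "name": "Chocolate"}', true, 512, JSON_THROW_ON_ERROR);

$violations = findViolations($schema, $payload);
```

For this payload the list contains two entries: the missing `price` and the `id` that arrived as a string.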

  59. Defensive Programming
    There’s no such thing as
    “This won’t happen”

  60. Extremely Defensive PHP
    - Marco Pivetta

  61. Switching to several data sources

  62. “Decider Service”
    Ooh, that ID? You’re from API X
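
One way such a decider could look, as a minimal sketch: the transcript does not show the real service, so the class name `SourceDecider`, the ID formats and the source names are all assumptions. It inspects the shape of an ID and returns the data source responsible for it.

```php
<?php
// Sketch only: the ID formats and source names are invented.

final class SourceDecider
{
    /** @param array<string, string> $patterns regex => source name, checked in order */
    public function __construct(private array $patterns)
    {
    }

    public function decide(string $id): string
    {
        foreach ($this->patterns as $pattern => $source) {
            if (preg_match($pattern, $id) === 1) {
                return $source;
            }
        }
        // Defensive: an unknown ID shape is an error, not a silent default.
        throw new InvalidArgumentException("No data source known for ID \"$id\"");
    }
}

$decider = new SourceDecider([
    '/^\d{12}$/' => 'api_x',       // hypothetical: 12-digit numeric IDs
    '/^[A-Z]{2}-\d+$/' => 'api_y', // hypothetical: prefixed IDs such as "AB-123"
]);

$source = $decider->decide('123456789012');
```

Keeping the rules in one place means callers never need to know how many providers exist behind the API.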

  63. Importing is “easy”
    But mapping…

  64. Data Quality…

  65. • Missing spaces
    • String instead of int
    • Array instead of object
    • Object instead of string
    • Differently named fields
    • Required data missing
    • … And more
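
Several of the problems listed above can be handled by small, strict normalising helpers. The following is a sketch under assumptions: the field names (`name`, `title`, `price`, `priceInCents`) and the wrapped `{"value": ...}` shape are invented, and the helper names are hypothetical. Each helper accepts the known bad shapes and fails loudly on anything else.

```php
<?php
// Sketch only: field names and payload shapes are invented for illustration.

function toInt(mixed $value, string $field): int
{
    // "String instead of int": accept "42", reject "42 CHF" or null.
    if (is_int($value)) {
        return $value;
    }
    if (is_string($value) && preg_match('/^-?\d+$/', trim($value)) === 1) {
        return (int) trim($value);
    }
    throw new UnexpectedValueException("Field \"$field\" is not an integer");
}

function toStringValue(mixed $value, string $field): string
{
    // "Object instead of string": unwrap a hypothetical {"value": "..."} shape.
    if (is_array($value) && isset($value['value'])) {
        $value = $value['value'];
    }
    if (!is_string($value) || trim($value) === '') {
        throw new UnexpectedValueException("Field \"$field\" is not a non-empty string");
    }
    return trim($value);
}

function firstPresent(array $row, array $candidates): mixed
{
    // "Differently named fields": try each known spelling in turn.
    foreach ($candidates as $key) {
        if (array_key_exists($key, $row)) {
            return $row[$key];
        }
    }
    // "Required data missing": fail loudly instead of inventing a default.
    throw new UnexpectedValueException('None of [' . implode(', ', $candidates) . '] is present');
}

function normaliseProduct(array $row): array
{
    return [
        'name' => toStringValue(firstPresent($row, ['name', 'title']), 'name'),
        'price' => toInt(firstPresent($row, ['price', 'priceInCents']), 'price'),
    ];
}

$clean = normaliseProduct(['title' => ['value' => ' Chocolate '], 'priceInCents' => '250']);
```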

  67. Mappers
    From MySQL to ElasticSearch

  68. ProductMapper
    Product on ElasticSearch
    /products.json

  69. ProductMapper
    Name
    Category
    Brand
    Price
    Description

  70. ProductMapper
    N B C P D

  71. ProductMapper

  75. MapperInterface

  76. ProductFactory
    Data from MySQL
    Clean data to store in ES
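
The transcript names `MapperInterface` and `ProductFactory` but does not show their code, so the sketch below is a guess at the shape, not the project's implementation. The `map()` signature, the two example mappers and the array-based product are assumptions. It shows the idea of many small mappers, each responsible for one part of the clean product, run in sequence by a factory.

```php
<?php
// Sketch only: the method signature and the example mappers are assumptions.

interface MapperInterface
{
    /**
     * @param array $source  raw row as stored in MySQL
     * @param array $product product built so far by earlier mappers
     * @return array the product with this mapper's fields added
     */
    public function map(array $source, array $product): array;
}

final class NameMapper implements MapperInterface
{
    public function map(array $source, array $product): array
    {
        $product['name'] = trim((string) ($source['name'] ?? ''));
        return $product;
    }
}

final class PriceMapper implements MapperInterface
{
    public function map(array $source, array $product): array
    {
        // Prices arrive as strings in this made-up source; store integer cents.
        $product['price'] = (int) round(((float) ($source['price'] ?? 0)) * 100);
        return $product;
    }
}

final class ProductFactory
{
    /** @param iterable<MapperInterface> $mappers applied in the given order */
    public function __construct(private iterable $mappers)
    {
    }

    public function create(array $source): array
    {
        $product = [];
        foreach ($this->mappers as $mapper) {
            $product = $mapper->map($source, $product);
        }
        return $product;
    }
}

$factory = new ProductFactory([new NameMapper(), new PriceMapper()]);
$product = $factory->create(['name' => '  Chocolate ', 'price' => '2.50']);
```

Because every mapper touches only its own fields, the mess of one provider field stays contained in one class.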

  77. ProductFactory
    Config of all the mappers, in order.
    7 depends on 3
    4 depends on 1
    25 depends on basically everything
    Mappers: 1 to 25

  79. ProductFactory * 3
    Data from MySQL
    Clean data to store in ES
    Mappers: 1 to 25

  80. Mapper Dependencies
    With a compiler pass!
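
Maintaining the mapper order by hand breaks down once "7 depends on 3" and "25 depends on basically everything". Deriving the order from declared dependencies is a topological sort, sketched below in plain PHP. No Symfony container is involved: how a compiler pass would collect the dependency map (service tags or similar) is an assumption and is not shown, and the mapper names are hypothetical.

```php
<?php
// Sketch only: plain PHP, no Symfony container. Mapper names are invented.

/**
 * @param array<string, list<string>> $dependsOn mapper => mappers that must run first
 * @return list<string> mappers in an order that satisfies every dependency
 */
function orderMappers(array $dependsOn): array
{
    $ordered = [];
    $state = []; // mapper => 'visiting' | 'done'

    $visit = function (string $mapper) use (&$visit, &$ordered, &$state, $dependsOn): void {
        if (($state[$mapper] ?? null) === 'done') {
            return;
        }
        if (($state[$mapper] ?? null) === 'visiting') {
            throw new LogicException("Circular mapper dependency involving \"$mapper\"");
        }
        if (!array_key_exists($mapper, $dependsOn)) {
            throw new LogicException("Unknown mapper \"$mapper\"");
        }
        $state[$mapper] = 'visiting';
        foreach ($dependsOn[$mapper] as $dependency) {
            $visit($dependency); // dependencies are emitted before the mapper itself
        }
        $state[$mapper] = 'done';
        $ordered[] = $mapper;
    };

    foreach (array_keys($dependsOn) as $mapper) {
        $visit($mapper);
    }

    return $ordered;
}

$order = orderMappers([
    'price' => ['currency'],
    'name' => [],
    'currency' => [],
    'search_text' => ['name', 'price'],
]);
```

Doing this once at container build time means a circular or unknown dependency fails the build instead of producing wrong data at runtime.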

  85. Dealing with languages
    Only when you have to!

  88. Ensuring Quality
    Tests, Logging, Monitoring

  89. Log a lot
    “Debugging you” will love you

  90. Monitoring
    Queues, uptime, etc.

  91. Acceptance Tests
    Test for critical data, often

  92. Project Challenges
    Big API responses

  94. Serializing, Versioning & Groups

  95. Our needs
    Versioning & Groups

  96. We tried it all
    And it “sucks”.

  97. Plain JSON decode
    json_decode, json_encode

  98. Symfony Serializer
    It’s cool and all

  99. Better Serializer
    Maybe it’d be better, if we could make it work

  100. JMS
    What 99% of PHP developers need for serialization

  101. JMS
    Annotations

  102. JMS
    Version Support
    @Since, @Until
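
Version support means a field can exist only from a given API version, or only up to one. The sketch below is a hand-rolled filter in plain PHP that mimics the idea behind since/until bounds; it is not JMS code, and the field names and version numbers are invented.

```php
<?php
// Sketch only: a hand-rolled filter, not JMS itself.

/**
 * @param array<string, mixed> $data full representation
 * @param array<string, array{since?: string, until?: string}> $rules per-field version bounds
 */
function forVersion(array $data, array $rules, string $version): array
{
    return array_filter(
        $data,
        function (string $field) use ($rules, $version): bool {
            $since = $rules[$field]['since'] ?? null;
            $until = $rules[$field]['until'] ?? null;
            if ($since !== null && version_compare($version, $since, '<')) {
                return false; // field did not exist yet in this version
            }
            if ($until !== null && version_compare($version, $until, '>')) {
                return false; // field was removed after this version
            }
            return true;
        },
        ARRAY_FILTER_USE_KEY
    );
}

$product = ['name' => 'Chocolate', 'price' => 250, 'priceFormatted' => 'CHF 2.50'];
$rules = [
    'price' => ['until' => '1.9'],
    'priceFormatted' => ['since' => '2.0'],
];

$v1 = forVersion($product, $rules, '1.5');
$v2 = forVersion($product, $rules, '2.1');
```

A client on version 1.5 sees `name` and `price`; a client on 2.1 sees `name` and `priceFormatted`, all from one model.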

  103. JMS
    Virtual Properties

  104. JMS
    Works like magic with most frameworks, including Symfony, which we use

  105. JMS

  106. JMS
    Read The Docs ;)
    https://jmsyst.com/libs/serializer

  107. We called “visitProperty”
    over 60 000 times!!!

  108. The Liip/Serializer
    It’s more of a generator, really.

  109. Model
    Parser
    SerializerGenerator
    DeserializerGenerator
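
"More of a generator" can be read as: instead of inspecting metadata for every property on every request, emit plain PHP once and run that. The sketch below illustrates only this general idea. It is not the Liip serializer's API or its generated output; `generateSerializer`, `serialize_product` and the `Product` class are hypothetical.

```php
<?php
// Sketch only: illustrates ahead-of-time code generation in general.
// Not the Liip serializer's real API or output format.

final class Product
{
    public function __construct(
        public string $name,
        public int $price,
    ) {
    }
}

/**
 * Emit the source of a function that turns an object into an array using
 * plain property reads, so no metadata is inspected at runtime.
 *
 * @param list<string> $properties public properties to export
 */
function generateSerializer(string $functionName, string $class, array $properties): string
{
    $lines = [];
    foreach ($properties as $property) {
        $lines[] = sprintf("        %s => \$object->%s,", var_export($property, true), $property);
    }

    return sprintf(
        "<?php\nfunction %s(%s \$object): array\n{\n    return [\n%s\n    ];\n}\n",
        $functionName,
        $class,
        implode("\n", $lines)
    );
}

$source = generateSerializer('serialize_product', Product::class, ['name', 'price']);

// Write the generated code to a temporary file and load it, as a build step might.
$file = tempnam(sys_get_temp_dir(), 'ser');
file_put_contents($file, $source);
require $file;
unlink($file);

$json = json_encode(serialize_product(new Product('Chocolate', 250)));
```

The generated function is ordinary PHP, so it can be read, profiled and cached by the opcode cache like any other file.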

  110. An overall performance gain of 55% over
    JMS for our use-case
    390 ms => 175 ms

    CPU and I/O wait both down by ~50%.
    Memory gain: 21%, 6.5 MB => 5.15 MB
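
The percentages follow from the before and after figures as a relative reduction, (before - after) / before. A short check using only the numbers on the slide:

```php
<?php
// Relative reduction in percent, from the before/after figures on the slide.
function reductionPercent(float $before, float $after): float
{
    return ($before - $after) / $before * 100;
}

$time = reductionPercent(390, 175);    // response time in ms
$memory = reductionPercent(6.5, 5.15); // memory in MB

printf("time: %.1f%%, memory: %.1f%%\n", $time, $memory);
// prints "time: 55.1%, memory: 20.8%"
```

Both results match the rounded 55% and 21% quoted above.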

  111. Curious about our Serializer?
    https://www.liip.ch/en/blog/fast-serialization-with-liip-serializer

    https://github.com/liip/serializer

  112. Project Challenges
    Communication is hard

  113. Feedback
    Because we have to work together

  114. Retrospective
    Let’s look back and improve

  115. Code Reviews
    Respectfully improving code together

  116. Team Events
    Keep morale high

  117. An amazing customer
    They listen to us, it’s important

  118. Evolving With Symfony
    In a long term project

  119. Prioritise upgrades
    Upgrade ASAP!

  120. Prioritise upgrades
    Upgrade minor versions

  121. Prioritise upgrades
    Fix deprecation warnings

  122. Refactor Often
    It’s not optional

  123. Use new components

  124. Contribute to Open Source
    Feel good to give something back
    And… Have some control over our tools

  125. Tests!! Lots of tests
    (Do I still need to emphasise this?)

  126. An amazing customer

  127. Final Words

  128. Write dev docs
    Don’t repeat our mistake

  129. Defensive programming
    Refactoring often
    Working as a Team

  130. Messy Data
    !==
    Messy Code

  131. Thank you
    By the way, we’re hiring ;)
    Michelle Sanver