ISTA 2019 - Migrating data-intensive microservices from Python to Go

For our systems to scale continuously and stay resilient, they need to evolve constantly. In this talk, I’m going to tell the story of how my team migrated a data-intensive microservice from Python to Go. We start with the rationale behind the migration, then go over the Python and Go tech stacks that we use. Last but not least, I’ll share our approach for migrating the service while it was running in production, adding new features along the way, and making sure there were no regressions.

Nikolay Stoitsev

November 15, 2019

Transcript

  2. Migrating data-intensive
    microservices from Python to Go
    Nikolay Stoitsev
    Engineering Manager @ Uber

  3. Early years
    Dispatch API Storage

  4. Early years
    Dispatch API Storage
    Python
    Node.js

  5. Invoice
    Generation
    Service

  6. Background
    Legal document

  7. Background
    Legal document
    Vary by country

  8. Background
    Legal document
    Vary by country
    Vary by business line

  9. Background
    Legal document
    Vary by country
    Vary by business line
    Triggered after every trip or food
    delivery

  10. Sample architecture
    Money
    System
    Cassandra
    Kafka Preprocess Render
    Kafka
    Consumer
    Object Store

  11. More than 30 upstream systems
    Large Scale

  12. More than 30 upstream systems
    More than 100 TBs of data stored
    Large Scale

  13. More than 30 upstream systems
    More than 100 TBs of data stored
    Running on 400 containers
    in multiple DCs
    Large Scale

  14. More than 30 upstream systems
    More than 100 TBs of data stored
    Running on 400 containers
    in multiple DCs
    Running for 5 years
    Large Scale

  15. More than 30 upstream systems
    More than 100 TBs of data stored
    Running on 400 containers
    in multiple DCs
    Running for 5 years
    99.999% availability for last 6 months
    Large Scale

  16. More than 30 upstream systems
    More than 100 TBs of data stored
    Running on 400 containers
    in multiple DCs
    Running for 5 years
    99.999% availability for last 6 months
    Implemented in Python
    Large Scale

  17. Sample architecture
    Money
    System
    Cassandra
    Kafka Preprocess Render
    Kafka
    Consumer
    Object Store
    Web API
    Hive

  18. Building blocks
    http://flask.pocoo.org

  19. Flask Example

  20. Flask Usage

  21. MVCS

  22. MVCS
    Controller
    Mapper
    Service Entities
    External
    Services
    Database

  23. Building blocks
    https://uwsgi-docs.readthedocs.io/

  24. uWSGI
    uwsgi
    python
    python
    python

  25. Building blocks
    http://www.celeryproject.org/

  26. Celery
    celery-worker
    celery-worker
    celery-worker
    kafka
    consumer Redis

  27. “Use the right tool
    for the job”

  28. It hurts velocity at some
    point

  29. What do we need for each language?
    Training / Best practices /
    Documentation / Experts

  30. What do we need for each language?
    Training / Best practices /
    Documentation / Experts
    Project template /
    Bootstrapping

  31. What do we need for each language?
    Training / Best practices /
    Documentation / Experts
    Project template /
    Bootstrapping
    Configuration

  32. What do we need for each language?
    Training / Best practices /
    Documentation / Experts
    Project template /
    Bootstrapping
    Configuration
    Debuggers

  33. What do we need for each language?
    Training / Best practices /
    Documentation / Experts
    Project template /
    Bootstrapping
    Configuration
    Debuggers
    Profilers

  34. What do we need for each language?
    Training / Best practices /
    Documentation / Experts
    Project template /
    Bootstrapping
    Configuration
    Debuggers
    Profilers
    Building, Packaging,
    Deploying

  35. We picked Go and Java

  36. Why Go?

  37. Broad applicability

  38. High performance

  39. Static typing

  40. Has momentum

  41. From Python to Go

  42. EAFP versus LBYL

  43. EAFP versus LBYL

  44. Dependency injection
    https://github.com/uber-go/fx

  45. Cadence instead of Celery
    https://github.com/uber/cadence

  46. Cadence
    Cadence
    DB queue
    Timers
    invoice
    service
    worker
    worker
    worker
    worker

  47. MVCS translates nicely
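
To make the layering concrete, here is a minimal sketch of how the MVCS pieces named earlier (controller, mapper, service, entities, database) could look in Go. It uses plain constructor injection rather than fx, and every type and function name here (`Invoice`, `InvoiceStore`, `memoryStore`, `toResponse`, and so on) is a hypothetical stand-in, not code from the actual service.

```go
package main

import (
	"errors"
	"fmt"
)

// Invoice is the entity passed between layers.
type Invoice struct {
	ID          string
	Country     string
	AmountCents int
}

// InvoiceStore abstracts the database layer so the service can be tested in isolation.
type InvoiceStore interface {
	Get(id string) (Invoice, error)
}

// memoryStore is an in-memory stand-in for a real database client.
type memoryStore map[string]Invoice

func (m memoryStore) Get(id string) (Invoice, error) {
	inv, ok := m[id]
	if !ok {
		return Invoice{}, errors.New("invoice not found: " + id)
	}
	return inv, nil
}

// InvoiceService holds business logic and depends only on the store interface.
type InvoiceService struct{ store InvoiceStore }

func NewInvoiceService(store InvoiceStore) *InvoiceService {
	return &InvoiceService{store: store}
}

func (s *InvoiceService) Find(id string) (Invoice, error) {
	return s.store.Get(id)
}

// InvoiceResponse is the representation returned to API callers.
type InvoiceResponse struct {
	ID    string
	Total string
}

// toResponse is the mapper: it converts an entity into the response shape.
func toResponse(inv Invoice) InvoiceResponse {
	return InvoiceResponse{
		ID:    inv.ID,
		Total: fmt.Sprintf("%d.%02d", inv.AmountCents/100, inv.AmountCents%100),
	}
}

// InvoiceController is the entry point that a web handler would call.
type InvoiceController struct{ service *InvoiceService }

func NewInvoiceController(service *InvoiceService) *InvoiceController {
	return &InvoiceController{service: service}
}

func (c *InvoiceController) GetInvoice(id string) (InvoiceResponse, error) {
	inv, err := c.service.Find(id)
	if err != nil {
		return InvoiceResponse{}, err
	}
	return toResponse(inv), nil
}

func main() {
	// Wire the layers by hand: store -> service -> controller.
	store := memoryStore{"inv-1": {ID: "inv-1", Country: "BG", AmountCents: 1250}}
	controller := NewInvoiceController(NewInvoiceService(store))
	resp, err := controller.GetInvoice("inv-1")
	fmt.Println(resp, err)
}
```

Keeping each layer behind a constructor is also what makes a dependency injection container straightforward to adopt later, since the constructors already declare what each layer needs.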

  48. How to migrate?
    Money
    System
    Invoice
    Generation
    Storage
    Python

  49. Option #1 - Big Bang Rewrite
    Money
    System
    Invoice
    Generation
    Storage
    Python
    Invoice
    Generation
    Go

  50. Option #1 - Big Bang Rewrite
    Money
    System
    Storage
    Invoice
    Generation
    Go

  51. No visibility on regressions

  52. No visibility on performance
    degradation

  53. No visibility on feature
    parity

  54. Option #2 - Do it iteratively

  55. Invoice
    Generation
    Storage
    Kafka

  56. Storage
    Kafka Preprocess Render

  57. Storage
    Kafka Preprocess Render
    Preprocess
    Go

  58. Storage
    Kafka Preprocess Render
    Preprocess
    Go
    Compare

  59. Storage
    Kafka Preprocess Render
    Preprocess
    Go
    Compare
    Toggle
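
One way to picture the Compare and Toggle boxes is a wrapper that runs both preprocess implementations on every event, reports any difference, and uses a flag to decide whose output is passed on. In the talk the legacy step is the Python service; in this sketch both sides are in-process Go functions for brevity, and all names (`Shadow`, `Event`, `Preprocessed`, `OnMismatch`) are hypothetical.

```go
package main

import "fmt"

// Event and Preprocessed are hypothetical input and output shapes.
type Event struct {
	TripID    string
	FareCents int
	TipCents  int
}

type Preprocessed struct {
	TripID     string
	TotalCents int
}

// Preprocessor is the step being migrated.
type Preprocessor func(Event) (Preprocessed, error)

// Shadow runs the legacy and candidate implementations side by side.
type Shadow struct {
	Legacy       Preprocessor
	Candidate    Preprocessor
	UseCandidate bool // the toggle: which result is served downstream
	OnMismatch   func(e Event, legacy, candidate Preprocessed)
}

// Process calls both implementations, reports a mismatch when the outputs
// differ or when exactly one of them failed, and returns the toggled result.
func (s Shadow) Process(e Event) (Preprocessed, error) {
	legacyOut, legacyErr := s.Legacy(e)
	candOut, candErr := s.Candidate(e)

	differs := (legacyErr == nil) != (candErr == nil) || legacyOut != candOut
	if differs && s.OnMismatch != nil {
		s.OnMismatch(e, legacyOut, candOut)
	}
	if s.UseCandidate {
		return candOut, candErr
	}
	return legacyOut, legacyErr
}

func main() {
	legacy := func(e Event) (Preprocessed, error) {
		return Preprocessed{TripID: e.TripID, TotalCents: e.FareCents + e.TipCents}, nil
	}
	// The candidate deliberately forgets the tip so the comparison has something to flag.
	candidate := func(e Event) (Preprocessed, error) {
		return Preprocessed{TripID: e.TripID, TotalCents: e.FareCents}, nil
	}
	s := Shadow{
		Legacy:    legacy,
		Candidate: candidate,
		OnMismatch: func(e Event, l, c Preprocessed) {
			fmt.Printf("mismatch for %s: legacy=%d candidate=%d\n", e.TripID, l.TotalCents, c.TotalCents)
		},
	}
	out, err := s.Process(Event{TripID: "trip-1", FareCents: 1000, TipCents: 200})
	fmt.Println("served:", out.TotalCents, err)
}
```

With the toggle off, the new code only ever runs in the shadow, so a regression shows up as a reported mismatch rather than as a wrong invoice.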

  60. Volume?

  61. m3 DB
    https://www.m3db.io

  62. Tally - stats collection in Go
    https://github.com/uber-go/tally

  63. Tally - stats collection in Go

  64. Measure processing time
    p95, p99

  65. Storage
    Kafka Preprocess Render
    Preprocess
    Go
    Compare
    m3 Grafana

  66. Correctness?

  67. Storage
    Kafka Preprocess Render
    Preprocess
    Go
    Compare
    Kafka ELK

  69. Structured logging

  70. Structured logging

  71. Structured logging

  72. Structured logging

  73. Zap
    https://github.com/uber-go/zap

  74. Zap

  75. ELK

  76. Benefits of iterative approach
    Verify regressions
    Verify performance problems
    Verify feature parity

  77. Lessons learned

  78. Spend time to learn the new
    language

  79. Spend time to read code in
    the new language

  80. Do a rollout plan and stick
    to it

  81. Python can scale and is
    reliable

  82. Q&A
    Thank you!
    Nikolay Stoitsev
