
2017 - Christine Spang - Billions of Emails Nylas

PyBay
August 13, 2017

Description
The open source Nylas Sync Engine provides a RESTful API on top of a powerful email sync platform, making it easy to build messaging into apps. It’s built using Python and gevent and has scaled to sync billions of messages over the lifetime of its deployment. In this talk, we’ll show you how it’s built and what technical challenges we’ve solved along the way.

Abstract
Why a sync engine?

If you’ve ever tried to build anything that works with email, you’ll know that it’s a problem full of twisty corners. The protocols themselves are obtuse and require entire RFCs just to describe how to implement sync with them. If you want your integration to work with everyone’s email, you face implementing several different protocols or flavours of protocols (IMAP with CONDSTORE, IMAP with no CONDSTORE, Gmail IMAP, Exchange Web Services, Exchange ActiveSync, Office365 REST), plus OAuth authentication for different providers. And once you’ve gotten data flowing, you still need to handle parsing email, which involves a complex format known as MIME as well as pretty much every way of encoding non-ASCII text as ASCII that has ever been invented.

We’ve built a platform that layers a sync engine over 30 years of email history and allows developers to read and write to mailboxes and calendars using a modern REST API. It’s not just a simple proxy that makes calls to IMAP or Exchange behind the scenes. To meet our customers’ speed and reliability demands, when a user connects their email account to a developer’s app, our infrastructure syncs a copy of that mailbox and keeps it up-to-date as changes are made from that app or from traditional web, mobile, and desktop email clients. This is a demanding technical challenge and wasn’t easy to build.

How a sync engine?

A semi-monolithic application composed of several services (email sync, API frontend, webhooks, etc.) that all share a common database and a fair amount of code, but run on separate server fleets.

~90k lines of Python, including tests and migrations

MySQL: one sharded database and one global database

Major libraries we use: Flask, gevent, SQLAlchemy, pytest

In production: haproxy, nginx+gunicorn w/gevent pywsgi adapter.

Technical challenges (so far!)

What are the major problems that we’ve solved?

A universal API across providers
  • Philosophy: whenever possible, unify the API across providers
  • We should allow developers to build one integration, not many
  • A few exceptions: Folders vs labels
Transactions, delta streaming & webhooks
  • Capturing mailbox changes & allowing apps to subscribe to them
  • For now: SQLAlchemy events & MySQL are the backbone
Error handling & retries using gevent
  • Wrapping greenlets to implement backoff
  • Saving & aggregating errors
Sharded data store
  • How we split data across multiple MySQL clusters
Performance instrumentation
  • Extensive custom instrumentation built on top of greenlets
  • Available for you to use: nylas-perftools
Load balancing
  • Mail accounts are heterogeneous: different protocols, sizes, rates of new mail receipt…
  • How to distribute load across a fleet of servers & keep them balanced?
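
The "wrapping greenlets to implement backoff" and "saving & aggregating errors" items listed under error handling can be illustrated with a small plain-Python sketch. This is not the Sync Engine's code: `retry_with_backoff`, its parameters, and the injected `sleep` and `errors` arguments are hypothetical names chosen for the example. In a gevent program the wrapped callable would presumably run inside a greenlet, with `gevent.sleep` passed as the sleep function.

```python
import random
import time


def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep, errors=None):
    """Call func() until it succeeds, waiting longer after each failure.

    All names are illustrative. `errors`, if given, is a list that collects
    every exception seen, so failures can be saved and aggregated later.
    """
    attempt = 0
    while True:
        try:
            return func()
        except Exception as exc:
            if errors is not None:
                errors.append(exc)
            attempt += 1
            if attempt > max_retries:
                raise  # give up and surface the last error
            # Exponential backoff capped at max_delay, with jitter so that
            # many failing accounts do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

Passing `sleep` in as a parameter keeps the wrapper independent of the concurrency library and makes it easy to exercise without actually waiting.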
The future

mypy, Python 3, Kafka, more flexible MySQL clusters, and beyond!

Bio
Christine went to MIT, dropped out of an operating systems graduate program to be an early engineer at Ksplice, and most recently cofounded Nylas, a startup building an email platform. When she's not building rock-solid infrastructure for the Internet or speaking around the world at conferences like DebConf and PyCon, rumour has it she can be found on cliff walls, remote trails, and dance floors. She lives in Oakland, California.


Transcript

  1. What we’re going to talk about today • What does Nylas do & why did we build a sync engine? • Technical Architecture & Stack • Technical Challenges • What’s next?
  3. Why? Works pretty OK when... • You know what the email you’re parsing looks like • You’re only working with a single email provider => Highly constrained!
  4. Why? Authentication complexity • No standard for email address => provider settings • OAuth2 or password authentication • Error messages are not standardized
  5. Why? Protocol complexity • IMAP is a TCP protocol & has many server implementations • extensions, Gmail labels, server-dependent errors • Exchange ActiveSync (WBXML) • Exchange Web Services (SOAP) • Is there even a library for that in $LANG? Is it any good?
  6. Why? Parsing complexity • Many specs • Messages are encoded on clients & sometimes clients violate the specs • MIME, base64, 7bit & 8bit, quoted-printable, plaintext or HTML, folded & encoded-words headers, attachments...
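
To give a feel for the parsing layers named on this slide, here is a minimal sketch using only Python's standard `email` package. It decodes an encoded-word header and a quoted-printable body from a hand-written sample message. The sample message and the helper names `decode_header_value` and `decode_text_body` are made up for illustration; real mail is far messier, and this is not how the Sync Engine itself parses messages.

```python
from email import message_from_string
from email.header import decode_header, make_header

# Hypothetical sample: an encoded-word Subject and a quoted-printable body.
RAW_MESSAGE = (
    "Subject: =?iso-8859-1?q?Caf=E9_menu?=\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Caf=C3=A9 au lait\n"
)


def decode_header_value(raw_value):
    """Turn an RFC 2047 encoded-word header into a normal string."""
    return str(make_header(decode_header(raw_value)))


def decode_text_body(msg):
    """Undo the transfer encoding, then decode bytes with the declared charset."""
    payload = msg.get_payload(decode=True)
    charset = msg.get_content_charset() or "ascii"
    # errors="replace" because clients sometimes declare the wrong charset.
    return payload.decode(charset, errors="replace")


msg = message_from_string(RAW_MESSAGE)
subject = decode_header_value(msg["Subject"])
body = decode_text_body(msg)
```

Note that the header and the body use different charsets and different encodings in the same message, which is legal and common.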
  7. Why? Sending complexity • For IMAP servers at least, you have to use a different protocol to send email (SMTP) • Sometimes integrated w/IMAP, sometimes not (no way to find out but try it & see) • Exchange servers rewrite input & can mangle
  8. Why? Integrating gets harder when... • You have to parse & filter many, non-specific emails • You need compatibility with many different email providers ugh!
  9. The Nylas Sync Engine & API: A Modern REST API for Email, Contacts, & Calendar
  10. What we’re going to talk about today • What does Nylas do & why did we build a sync engine? • Technical Architecture & Stack • Technical Challenges • What’s next?
  11. Tech Stack • ~80,000 lines of Python 2.7 • Flask, gevent, SQLAlchemy, pytest • HAproxy -> nginx -> gunicorn (w/gevent-pywsgi) • MySQL (mostly primary-replica clusters we manage on EC2) • ProxySQL • Ansible
  12. Tech Stack “Let's say every company gets about three innovation tokens. You can spend these however you want, but the supply is fixed for a long while.” - @mcfunley http://mcfunley.com/choose-boring-technology
  13. Architecture Two possible strategies: • Store minimal data & proxy requests to upstream providers • Mirror contents of mailboxes & serve most requests directly
  14. Architecture Two possible strategies: • Store minimal data & proxy requests to upstream providers • Mirror contents of mailboxes & serve most requests directly Reliability & Speed!
  15. Architecture: A semi-monolithic application [Diagram: clients, haproxy, API fleet, sync fleet, ProxySQL instances, Redis, one global DB, several sharded DBs]
  16. What we’re going to talk about today • What does Nylas do & why did we build a sync engine? • Technical Architecture & Stack • Technical Challenges • What’s next?
  17. API Philosophy Our clients should build one integration, not many. That means we must build a unified API that is consistent across email providers.
  18. Database Sharding • People have a lot of email. One of the first scaling challenges we had to solve was data storage. • Our primary data store is sharded using MySQL autoincrements on primary keys. • https://www.nylas.com/blog/growing-up-with-mysql/
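
One way "sharded using MySQL autoincrements on primary keys" can work is to start each shard's auto-increment counter in its own disjoint range, so that a row ID alone identifies the shard it lives on. The sketch below assumes the shard number sits in the bits above bit 48. That bit width and the names `SHARD_SHIFT`, `first_id_for_shard`, and `shard_for_id` are assumptions made for illustration, not details stated in the talk; the linked blog post describes the actual scheme.

```python
SHARD_SHIFT = 48  # assumption: the low 48 bits are the per-shard row counter


def first_id_for_shard(shard_id):
    """Starting AUTO_INCREMENT value to configure on a shard's tables."""
    return (shard_id << SHARD_SHIFT) + 1


def shard_for_id(row_id):
    """Recover the shard number from any primary key, with no lookup table."""
    return row_id >> SHARD_SHIFT
```

With this layout, routing a request needs only the ID in the URL: `shard_for_id(some_id)` picks the database to query.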
  19. MySQL Transaction Log • We record changes to mailboxes in a table as we sync them • Translates document store => changesets for easier sync • Powers webhooks, streaming API, internal services
  20. Architecture: MySQL transaction log [Diagram: sync fleet, webhooks fleet, ProxySQL instances, sharded DBs with transaction tables]
  21. MySQL Transaction Log Why MySQL?? • It’s technically not the right tool for the job • … but it was one less thing to set up, maintain, learn • Can write entries in same SQL transaction & guarantee atomicity
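
The atomicity point can be shown with a small, self-contained sketch. It uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table layout and the names `open_db`, `insert_message`, and `poll_changes` are hypothetical, not the Sync Engine's schema. The idea is that the data row and its log entry are written in one SQL transaction, so a consumer never sees a log entry without the change, or a change without its log entry.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a data table and a log table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT);
        CREATE TABLE transaction_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            object_type TEXT, object_id INTEGER, command TEXT);
    """)
    return conn


def insert_message(conn, subject, fail=False):
    """Write a message and its log entry atomically."""
    # `with conn` commits if the block succeeds and rolls back on an exception.
    with conn:
        cur = conn.execute(
            "INSERT INTO message (subject) VALUES (?)", (subject,))
        message_id = cur.lastrowid
        if fail:
            raise RuntimeError("simulated crash mid-transaction")
        conn.execute(
            "INSERT INTO transaction_log (object_type, object_id, command) "
            "VALUES ('message', ?, 'insert')", (message_id,))
    return message_id


def poll_changes(conn, last_seen_id):
    """Return log entries newer than the consumer's cursor, oldest first."""
    return conn.execute(
        "SELECT id, object_type, object_id, command FROM transaction_log "
        "WHERE id > ? ORDER BY id", (last_seen_id,)).fetchall()
```

A consumer such as a webhook sender would remember the highest log `id` it has processed and call `poll_changes` with it repeatedly.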
  22. MySQL Transaction Log The Future • With MySQL, all clients must poll to get updates • Excessive locking, DB load … expensive • The right tool in 2017 is probably Kafka • Starting early experimentation now!
  23. Architecture: Sync Fleet Avoid the GIL: Use multiple processes on multicore machines! [Diagram: three EC2 instances, each running sync processes sync-1 to sync-4]
  24. Sync Processes • Gevent to sync multiple accounts on a single process • ~100 accounts per process • Minimizes overhead from open sockets to IMAP providers
  25. Architecture: Sync Process [Diagram: Sync Service containing Gmail Sync and Exchange Sync; labels include All Mail, Trash, Inbox, Folder 1, Calendar, Contacts]
  26. Sync Load Balancing • Mailboxes are heterogeneous (different providers, different protocols, different sizes & rate of new mail receipt…) • Can’t easily predict how “expensive” it will be to sync • Measure time spent active in greenlets for each account & run manual load balances
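
Given a measured cost per account, such as seconds spent active in greenlets, one simple way to compute a balanced assignment is a greedy heuristic: take accounts from most to least expensive and give each to the currently least-loaded process. The talk only says that load balances are run manually from these measurements, so the algorithm below and the name `rebalance` are assumptions for illustration.

```python
import heapq


def rebalance(active_seconds, num_processes):
    """Assign accounts to processes, heaviest first, to even out total load.

    active_seconds: dict mapping account id -> measured active time.
    Returns a dict mapping process index -> list of account ids.
    """
    # Min-heap of (current load, process index); the lightest process is on top.
    heap = [(0.0, p) for p in range(num_processes)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processes)}
    # Sort by cost descending, breaking ties by account id for determinism.
    ordered = sorted(active_seconds.items(), key=lambda kv: (-kv[1], kv[0]))
    for account, cost in ordered:
        load, process = heapq.heappop(heap)
        assignment[process].append(account)
        heapq.heappush(heap, (load + cost, process))
    return assignment
```

Because past cost does not reliably predict future cost, a result like this is best treated as a starting point to re-measure and adjust, which fits the manual process described on the slide.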
  27. Greenlet Instrumentation • 3+ greenlets per account • ~100 accounts per process • 16 processes per machine • Dozens of machines
  28. Greenlet Instrumentation • Greenlet scheduling is cooperative • Watch out for non-cooperative behaviour! • Excessive parallelism can cause delays in greenlet execution • Separate thread on each sync process that serves stack samples for generating flame graphs for profiling • https://github.com/nylas/nylas-perftools
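
The "separate thread that collects stack samples" idea can be sketched with the standard library alone. The class below periodically inspects another thread's current stack via `sys._current_frames()` and counts how often each call path is seen, in the semicolon-separated "folded" form that flame graph tools commonly accept. `StackSampler` and its methods are hypothetical names for this example and are not the nylas-perftools API. Since greenlets share one OS thread, sampling that thread shows whichever greenlet happens to be running at that instant.

```python
import collections
import sys
import threading
import traceback


class StackSampler(threading.Thread):
    """Background thread that samples the stack of one target thread."""

    def __init__(self, target_thread_id, interval=0.005):
        super().__init__(daemon=True)
        self.target_thread_id = target_thread_id
        self.interval = interval
        self.counts = collections.Counter()
        self._stop_event = threading.Event()

    def sample_once(self):
        """Record the target thread's current call path, outermost first."""
        frame = sys._current_frames().get(self.target_thread_id)
        if frame is None:
            return
        stack = traceback.extract_stack(frame)
        self.counts[";".join(f.name for f in stack)] += 1

    def run(self):
        # wait() returns False on timeout, True once stop() sets the event.
        while not self._stop_event.wait(self.interval):
            self.sample_once()

    def stop(self):
        self._stop_event.set()
        self.join()

    def folded(self):
        """One 'path count' line per distinct stack."""
        return "\n".join(
            "%s %d" % (path, n) for path, n in sorted(self.counts.items()))
```

Typical use would be `sampler = StackSampler(threading.get_ident()); sampler.start()`, then running the workload and calling `sampler.stop()` before reading `sampler.folded()`.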
  29. What we’re going to talk about today • What does the Nylas Sync Engine do & why did we build it? • Technical Architecture & Stack • Technical Challenges • What’s next?
  30. What’s next? • Full mypy coverage • Python 3 • Kafka event backbone • Better load balancing, enhanced webhooks, contacts & calendar features, observability, infosec...
  31. Thank you! • Nylas team, past & present • Mailgun team, Menno Smits, & other authors of libraries we depend on • Python core contribs https://nylas.com https://github.com/nylas/sync-engine