Flask At Scale

Do you think that because Flask is a micro-framework, it must only be good for small, toy-like web applications? Well, not at all! In this tutorial I am going to show you a few patterns and best practices that can take your Flask application to the next level.

This presentation was given as a tutorial at PyCon 2016.

Miguel Grinberg

May 28, 2016

Transcript

  1. About Me
    • Full-Stack Engineer at
    • O'Reilly's Flask Web Development
    • The Flask Mega-Tutorial
    • blog.miguelgrinberg.com
    • A bunch of open source packages
  2. Some Initial Thoughts
    • Can Flask Scale? Wrong question!
    • Flask is not at the center of the world, and that is a good thing.
    • Change is unavoidable, so better make it part of your workflow.
    • The best Flask boilerplate/starter project is...
  3. The Ultimate Flask Boilerplate ;-)
        from flask import Flask

        app = Flask(__name__)

        @app.route('/')
        def hello():
            return 'Hello World!'

        if __name__ == '__main__':
            app.run()
  4. Slack? Nope, It's Flack! v0.1 (Try it yourself: bit.ly/flackchat)
    • Lame attempt at a chat service
    • Flask API Backend
      ◦ User registration: POST request to /api/users
      ◦ Token request: POST request to /api/tokens (basic auth required)
      ◦ Get users: GET request to /api/users?updated_since=t (token optional)
      ◦ Get messages: GET request to /api/messages?updated_since=t (token optional)
      ◦ Post message: POST request to /api/messages (token required)
      ◦ Messages are written in markdown. Links are scraped and expanded.
      ◦ Unit test suite with code coverage and code linting.
    • Backbone JavaScript Client (Backbone??? Are we in 2013 or something?)
  5. Slack? Nope, It's Flack! v0.1 (Try it yourself: bit.ly/flackchat)
        flack/
        ├── flack.py
        ├── templates/
        |   └── index.html
        ├── static/
        |   └── client-side js and css files
        ├── tests.py
        └── requirements.txt
  6. How to Work with the Code
    • Git repository: https://github.com/miguelgrinberg/flack
    • Incremental versions are tagged: v0.1, v0.2, etc.
    • Some commands to get you started:
      ◦ git checkout <version-tag> ← gets a specific version
      ◦ pip install -r requirements.txt ← installs dependencies
      ◦ python flack.py ← runs webserver (early versions)
    • To start the client: visit http://<ip-address>:5000 in your browser
  7. What's Wrong with Flack v0.1?
    • Development
      ◦ The whole backend is in a single, huge Python module.
      ◦ Unit tests use a couple of hacks to configure the application properly.
      ◦ The only way to apply configuration settings is via environment variables or by editing code.
    • Production
      ◦ There is no production web server strategy.
      ◦ Messages are rendered synchronously while the request is being processed.
      ◦ Clients have to poll the API very frequently to provide a "real-time" feel.
  8. Refactoring Utility Functions v0.2
    • Auxiliary functions that perform self-contained tasks can be easily moved to separate module(s).
        flack.py
            from utils import timestamp

            timestamp()

        utils.py
            def timestamp():
                pass
  9. Refactoring Utility Functions v0.2
        flack/
        ├── flack.py
        ├── utils.py
        ├── templates/
        |   └── index.html
        ├── static/
        |   └── client-side js and css files
        ├── tests.py
        └── requirements.txt
  10. Refactoring Database Models v0.3
    • Two modules that import symbols from each other are a recipe for disaster. This breaks horribly, but probably not how you think it does:
        flack.py
            from models import User

            db = SQLAlchemy(app)

            def new_user():
                u = User()

        models.py
            from flack import db

            class User(db.Model):
                pass
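The failure on this slide can be reproduced without Flask. The sketch below is an illustration only: it writes two stand-in modules to a temporary directory (with `db = object()` as an assumed replacement for `SQLAlchemy(app)`) and runs `python flack.py` in a subprocess. The script runs once as `__main__`, and `models.py` then imports it a second time under the name `flack`; that second copy asks for `User` while `models` is still only half initialized.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-ins for the two modules on the slide. `db = object()` replaces
# SQLAlchemy(app) so the demo has no third-party dependencies.
FLACK_PY = "from models import User\ndb = object()\n"
MODELS_PY = "from flack import db\n\nclass User:\n    pass\n"


def run_flack_as_script():
    """Write both modules to a temp directory and run `python flack.py`."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "flack.py").write_text(FLACK_PY)
        Path(tmp, "models.py").write_text(MODELS_PY)
        return subprocess.run(
            [sys.executable, "flack.py"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )


result = run_flack_as_script()
print("exit code:", result.returncode)
# The last line of the traceback names the import that failed.
print(result.stderr.strip().splitlines()[-1])
```

The traceback ends in an `ImportError` raised by the second execution of `flack.py`, not by the first, which is the surprising part the slide alludes to.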
  11. Refactoring Database Models v0.3
    • Solution #1: move imports down on the application side.
    • Solution #2: Deal with __main__ issues as best as possible.
        flack.py
            db = SQLAlchemy(app)

            from models import User

            def new_user():
                u = User()

        models.py
            try:
                from __main__ import db
            except ImportError:
                from flack import db

            class User(db.Model):
                pass
  12. Refactoring Database Models v0.3
        flack/
        ├── flack.py
        ├── models.py
        ├── utils.py
        ├── templates/
        |   └── index.html
        ├── static/
        |   └── client-side js and css files
        ├── tests.py
        └── requirements.txt
  13. Creating an Application Package v0.4
    • Avoids the issues with __main__
    • Code, templates and static files all move together inside the package.
    • The application package can export just the symbols that are needed outside (app and db).
    • A more robust start-up script can be built (Flask-Script, click, etc.).
    • The start-up script can include maintenance operations:
      ◦ manage.py runserver ← Runs the Flask development web server
      ◦ manage.py shell ← Starts a Python console with a Flask app context
      ◦ manage.py createdb ← Creates the application's database
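As a rough idea of what such a start-up script looks like, here is a minimal sketch that dispatches the three commands named above. It deliberately uses `argparse` from the standard library rather than Flask-Script or click, and the command bodies are placeholders that only return a description; a real `manage.py` would import the application package and do the actual work.

```python
import argparse


def runserver():
    # Placeholder: a real script would start the Flask development server.
    return "running the development web server"


def shell():
    # Placeholder: a real script would open a console inside an app context.
    return "starting a Python console with an app context"


def createdb():
    # Placeholder: a real script would create the database tables.
    return "creating the application's database"


COMMANDS = {"runserver": runserver, "shell": shell, "createdb": createdb}


def main(argv=None):
    """Parse `manage.py <command>` and run the matching function."""
    parser = argparse.ArgumentParser(prog="manage.py")
    parser.add_argument("command", choices=sorted(COMMANDS))
    args = parser.parse_args(argv)
    return COMMANDS[args.command]()


# Equivalent to running `python manage.py createdb` from a shell.
print(main(["createdb"]))
```

Keeping every maintenance task behind one entry point means new commands (such as the `test` and `lint` commands added later in the talk) only need one more entry in the dispatch table.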
  14. Creating an Application Package v0.4
        flack/
        ├── flack/
        |   ├── __init__.py
        |   ├── flack.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── tests.py
        ├── manage.py ← runserver, shell and createdb commands available here
        └── requirements.txt
  15. Refactoring API Authentication v0.5
    • This is similar to how the models were moved.
    • Circular dependencies are handled by putting the imports after the database and models are initialized.
  16. Refactoring API Authentication v0.5
        flack/
        ├── flack/
        |   ├── __init__.py
        |   ├── flack.py
        |   ├── auth.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── tests.py
        ├── manage.py
        └── requirements.txt
  17. Refactoring Tests v0.6
    • Moving tests to a package helps keep growing test suites organized.
    • The manage.py launcher script can be extended even more:
      ◦ manage.py test ← launches tests
      ◦ manage.py lint ← runs code linter
  18. Refactoring Tests v0.6
        flack/
        ├── flack/
        |   ├── __init__.py
        |   ├── flack.py
        |   ├── auth.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── manage.py ← test and lint commands added here
        ├── tests/
        |   ├── __init__.py
        |   └── tests.py
        └── requirements.txt
  19. Refactoring Configuration v0.7
    • Putting the configuration in its own module helps organize different configuration sets (development, production, testing).
    • The desired configuration is given in the FLACK_CONFIG environment variable.
    • Getting unit tests to run on a different database becomes a bit less hacky.
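A configuration module of this kind is often written as a small class hierarchy plus a lookup by name. The sketch below is one plausible shape, not the contents of the repository's `config.py`: the class names, the `get_config()` helper, the `"development"` fallback and the database URIs are all assumptions made for illustration. Only the `FLACK_CONFIG` variable name comes from the slide.

```python
import os


class Config:
    """Settings shared by every configuration set."""
    DEBUG = False
    TESTING = False


class DevelopmentConfig(Config):
    DEBUG = True
    SQLALCHEMY_DATABASE_URI = "sqlite:///dev.sqlite"  # hypothetical file name


class ProductionConfig(Config):
    SQLALCHEMY_DATABASE_URI = "sqlite:///prod.sqlite"  # hypothetical file name


class TestingConfig(Config):
    TESTING = True
    # "sqlite://" is SQLAlchemy's in-memory database, so tests never touch
    # the development or production data.
    SQLALCHEMY_DATABASE_URI = "sqlite://"


config = {
    "development": DevelopmentConfig,
    "production": ProductionConfig,
    "testing": TestingConfig,
}


def get_config(name=None):
    """Return the config class for `name`, falling back to FLACK_CONFIG."""
    name = name or os.environ.get("FLACK_CONFIG", "development")
    return config[name]


print(get_config("testing").SQLALCHEMY_DATABASE_URI)
```

A test suite can ask for `get_config("testing")` explicitly, while a deployment only has to export `FLACK_CONFIG=production`.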
  20. Refactoring Configuration v0.7
        flack/
        ├── flack/
        |   ├── __init__.py
        |   ├── flack.py
        |   ├── auth.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── config.py
        ├── manage.py
        ├── tests/
        |   ├── __init__.py
        |   └── tests.py
        └── requirements.txt
  21. Creating an API Blueprint v0.8
    • Refactoring the API endpoints into a blueprint helps modularize the application. But, there are more cyclic dependencies to sort out.
        flack/flack.py
            app = Flask(__name__)
            db = SQLAlchemy(app)

            from .api import api as api_blueprint
            app.register_blueprint(api_blueprint, url_prefix='/api')

        flack/api.py
            from .flack import db

            api = Blueprint('api', __name__)

            @api.route('/users', methods=['POST'])
            def new_user():
                pass
  22. Creating an API Blueprint v0.8
        flack/
        ├── flack/
        |   ├── __init__.py
        |   ├── flack.py ← blueprint is initialized here
        |   ├── auth.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── api.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── config.py
        ├── manage.py
        ├── tests/
        |   ├── __init__.py
        |   └── tests.py
        └── requirements.txt
  23. Refactoring Request Stats v0.9
    • The code that reports request stats can easily be moved to a separate module. Its configuration can be added to the application's config object.
        flack/flack.py
            app = Flask(__name__)

            from . import stats

        flack/stats.py
            from .flack import app

            request_stats = []

            def requests_per_second():
                return len(request_stats) / app.config['REQUEST_STATS_WINDOW']
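The slide does not show how `request_stats` is filled or trimmed. One way to make the division above meaningful is to keep only the timestamps that fall inside the window, as in this standalone sketch. The `record_request()` helper, the 15-second window value and the injectable `now` argument are assumptions for illustration, and a plain constant stands in for `app.config['REQUEST_STATS_WINDOW']`.

```python
import time
from collections import deque

REQUEST_STATS_WINDOW = 15  # seconds; hypothetical value

request_stats = deque()  # timestamps of recent requests, oldest first


def record_request(now=None):
    """Remember when a request arrived (call once per request)."""
    request_stats.append(time.time() if now is None else now)


def requests_per_second(now=None):
    """Average request rate over the last REQUEST_STATS_WINDOW seconds."""
    now = time.time() if now is None else now
    # Drop timestamps that have aged out of the window.
    while request_stats and request_stats[0] < now - REQUEST_STATS_WINDOW:
        request_stats.popleft()
    return len(request_stats) / REQUEST_STATS_WINDOW


# 30 requests arriving at the same instant average out to 2 per second
# over a 15 second window, and are forgotten once the window has passed.
for _ in range(30):
    record_request(now=100.0)
print(requests_per_second(now=100.0))
print(requests_per_second(now=200.0))
```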
  24. Refactoring Request Stats v0.9
        flack/
        ├── flack/
        |   ├── __init__.py
        |   ├── flack.py
        |   ├── auth.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── api.py
        |   ├── stats.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── config.py
        ├── manage.py
        ├── tests/
        |   ├── __init__.py
        |   └── tests.py
        └── requirements.txt
  25. Using an Application Factory Function v0.10
    • Sometimes it is desirable to work with more than one application.
    • Best example: unit tests that need applications with different configurations.
  26. Using an Application Factory Function v0.10
    • Flask extensions can use an app-specific initialization inside the factory function via the init_app() method.
        flack/__init__.py
            db = SQLAlchemy()

            def create_app(config_name=None):
                app = Flask(__name__)
                app.config.from_object(config[config_name])
                db.init_app(app)
                # ...
                return app
  27. Using an Application Factory Function v0.10
    • Not having a global app means a number of things need to change:
      ◦ The app.route decorator cannot be used, so all endpoints need to be moved to blueprints.
      ◦ Any references to app (such as app.config[...]) need to be removed.
      ◦ Use the current_app context variable to access the application.
      ◦ Manually push the app context when working outside of a request (such as in a background thread).
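The changes listed above can be seen together in a small, self-contained sketch. It is not the Flack code: the `main` blueprint, the `/window` route, the `stats_window` parameter and the `read_window_in_thread()` helper are made up for illustration. It assumes only that Flask is installed, and shows a blueprint route reading its settings through `current_app`, two applications built from one factory, and an app context pushed by hand inside a background thread.

```python
import threading

from flask import Blueprint, Flask, current_app

main = Blueprint("main", __name__)


@main.route("/window")
def window():
    # current_app resolves to whichever application handles this request.
    return str(current_app.config["REQUEST_STATS_WINDOW"])


def create_app(stats_window=15):
    """Application factory: every call returns an independent app."""
    app = Flask(__name__)
    app.config["REQUEST_STATS_WINDOW"] = stats_window
    app.register_blueprint(main)
    return app


def read_window_in_thread(app):
    """Read a setting from a background thread."""
    result = []

    def worker():
        # No request is active in this thread, so current_app is unbound
        # until an application context is pushed explicitly.
        with app.app_context():
            result.append(current_app.config["REQUEST_STATS_WINDOW"])

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result[0]


app_a = create_app(stats_window=15)
app_b = create_app(stats_window=60)
print(app_a.test_client().get("/window").get_data(as_text=True))  # 15
print(read_window_in_thread(app_b))  # 60
```

Because the blueprint never names a specific application, the same route code serves both `app_a` and `app_b`, which is what lets a test suite create applications with different configurations.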
  28. Using an Application Factory Function v0.10
        flack/
        ├── flack/
        |   ├── __init__.py ← application factory function is here
        |   ├── flack.py ← endpoints that serve client application moved to main blueprint; app context used in thread
        |   ├── auth.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── api.py
        |   ├── stats.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── config.py
        ├── manage.py
        ├── tests/
        |   ├── __init__.py
        |   └── tests.py
        └── requirements.txt
  29. Creating an API Package v0.11
    • Replacing the API module with a package leaves more space for growth by having a module per resource.
        flack/api/__init__.py
            from flask import Blueprint

            api = Blueprint('api', __name__)

            from . import tokens, users, messages

        flack/api/tokens.py
            from . import api

            @api.route('/tokens', methods=['POST'])
            def new_token():
                pass
  30. Creating an API Package v0.11
        flack/
        ├── flack/
        |   ├── __init__.py
        |   ├── flack.py
        |   ├── auth.py
        |   ├── models.py
        |   ├── utils.py
        |   ├── api/
        |   |   ├── __init__.py
        |   |   ├── tokens.py
        |   |   ├── messages.py
        |   |   └── users.py
        |   ├── stats.py
        |   ├── templates/
        |   |   └── index.html
        |   └── static/
        |       └── client-side js and css files
        ├── config.py
        ├── manage.py
        ├── tests/
        |   ├── __init__.py
        |   └── tests.py
        └── requirements.txt
  31. What's Next?
    • Refactoring as shown can go on as the application continues to evolve
    • Examples:
      ◦ models.py can become a package, with a module per model inside.
      ◦ The api package can have sub-packages with different API versions.
      ◦ The client side application can be moved into a separate project.
  32. Scaling Web Servers
    • Multiple threads
      ◦ Limited use of multiple CPUs due to the GIL.
      ◦ Application might need to synchronize access to shared resources.
    • Multiple processes
      ◦ Great way to take advantage of multiple CPUs.
      ◦ Synchronization problems are less common than with threads.
    • Green threads/coroutines (eventlet, gevent)
      ◦ Extremely lightweight; hundreds/thousands of threads have small impact.
      ◦ Cooperative multitasking makes synchronization much easier to manage.
      ◦ Non-blocking I/O and threading functions.
      ◦ I/O and threading functions in the standard library are incompatible.
  33. Using Production Web Servers v0.12
    • Gunicorn
      ◦ Written in Python, fairly robust, easy to use.
      ◦ Supports multiple processes, and eventlet or gevent green threads.
      ◦ Limited load balancer
    • uWSGI
      ◦ Written in C, very fast, extensive and somewhat hard to configure.
      ◦ Supports multiple threads, multiple processes and gevent green threads.
    • Nginx
      ◦ Written in C, very fast.
      ◦ Ideal to serve static files in production, bypassing Python and Flask.
      ◦ Great as reverse proxy and load balancer in front of gunicorn/uwsgi servers.
  34. Bottlenecks: I/O-Bound vs. CPU-Bound
    • I/O Bottlenecks
      ◦ Flack example: scraping of links included in posts.
      ◦ Solutions
        ▪ Concurrent request handlers through multiple threads, processes or green threads.
        ▪ Make I/O heavy requests asynchronous.
    • CPU Bottlenecks
      ◦ Flack example: markdown rendering of posts.
      ◦ Solutions
        ▪ Make CPU intensive requests asynchronous and offload the CPU heavy tasks to auxiliary threads or processes to keep the server unblocked.
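To see why concurrency helps with an I/O bottleneck, the sketch below simulates link scraping with `time.sleep()` (an assumed stand-in for a network request; `scrape_link()` and the example URLs are hypothetical) and runs the calls in a thread pool. Sleeping, like waiting on a socket, releases the GIL, so the waits overlap instead of adding up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scrape_link(url, delay=0.2):
    # Stand-in for a network request: the thread just waits, as it would
    # while a remote server prepares its response.
    time.sleep(delay)
    return f"title of {url}"


def scrape_all(urls, max_workers=8):
    """Scrape all links concurrently, preserving the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape_link, urls))


urls = [f"http://example.com/{i}" for i in range(5)]
start = time.perf_counter()
titles = scrape_all(urls)
elapsed = time.perf_counter() - start

# Run one after another, five 0.2 s waits would take about a second.
print(f"{len(titles)} links in {elapsed:.2f}s")
```

The same trick does not help the CPU-bound case: five markdown renders in five threads still compete for the GIL, which is why the slide recommends offloading that work to auxiliary threads or processes and answering asynchronously.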
  35. Asynchronous HTTP Requests
    • The request should start the actual task in the background and return.
    • The status code in the response should be 202 (Accepted).
    • The Location header should include a URL where the client can ask for status for the asynchronous task.
    • Requests sent to the status URL should continue to return 202 while the background task is still in progress. The response body can include progress updates if desired.
    • After the background task is finished, the status URL should return the response from the task, as it would have been returned by a synchronous version of the request.
  36. Asynchronous Flask Requests v0.13
    • The simplest approach is to run lengthy tasks in a background thread.
    • An awesome decorator can be built to do this transparently for Flask.
        synchronous...
            @api.route('/messages', methods=['POST'])
            @token_auth.login_required
            def new_message():
                # ...

        asynchronous!!!
            @api.route('/messages', methods=['POST'])
            @token_auth.login_required
            @async
            def new_message():
                # ...
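A much simplified version of such a decorator, following the 202/Location protocol from the previous slide, could look like the sketch below. This is not the decorator from the Flack repository: it is named `async_task` because `async` became a reserved word in Python 3.7, the in-memory `tasks` dictionary and the `/status/<task_id>` route are assumptions, and the wrapped function must not touch the `request` object because it runs outside the request context. It also does not handle tasks that raise exceptions.

```python
import threading
import time
import uuid
from functools import wraps

from flask import Flask, jsonify, url_for

app = Flask(__name__)

# task id -> {"done": bool, "result": object}. In-memory and per-process,
# so this only works with a single server process.
tasks = {}


def async_task(f):
    """Run the view in a background thread and answer 202 right away."""
    @wraps(f)
    def wrapper(*args, **kwargs):
        task_id = uuid.uuid4().hex
        tasks[task_id] = {"done": False, "result": None}

        def run():
            tasks[task_id]["result"] = f(*args, **kwargs)
            tasks[task_id]["done"] = True

        threading.Thread(target=run).start()
        location = url_for("task_status", task_id=task_id)
        return "", 202, {"Location": location}

    return wrapper


@app.route("/status/<task_id>")
def task_status(task_id):
    task = tasks.get(task_id)
    if task is None:
        return "", 404
    if not task["done"]:
        # Still running: keep answering 202 with the same status URL.
        return "", 202, {"Location": url_for("task_status", task_id=task_id)}
    # Finished: return what the synchronous view would have returned.
    return jsonify(task["result"])


@app.route("/messages", methods=["POST"])
@async_task
def new_message():
    time.sleep(0.1)  # stand-in for slow rendering or link scraping
    return {"id": 1, "html": "<p>hello</p>"}
```

A client posts to `/messages`, reads the `Location` header from the 202 response, and polls that URL until the status code changes to 200.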
  37. Celery Workers v0.14
    • Sometimes it is desirable to have a fixed pool of workers dedicated to running asynchronous tasks.
    • Celery runs a pool of worker processes that listen for tasks provided by the main process. The processes communicate through a message queue (Redis, RabbitMQ, etc.).
    • The async decorator can be modified to send tasks to Celery. No code changes to the application required!
    • To start the celery worker processes, use ./manage.py celery
  38. Scaling with nginx and Celery
    [Diagram. Labels: client; nginx (https → http); server (x5); celery client (x5); msg queue; celery worker (x5); database]
  39. Battling Request/Response "Churn"
    • With REST, clients are forced to poll to stay updated, adding extra load.
    • Switching to a "server-push" model can help.
      ◦ Option #1: Streaming
      ◦ Option #2: Long-polling
      ◦ Option #3: WebSocket
      ◦ Option #4: Socket.IO (long-polling + WebSocket)
  40. Socket.IO Server v0.15
    • Server-push with Socket.IO
        Server (Python)
            def push_model(model):
                socketio.emit('updated_model', {
                    'class': model.__class__.__name__,
                    'model': model.to_dict()
                })

        Client (JavaScript)
            socket.on('updated_model', function(data) {
                if (data['class'] == 'User') {
                    updateUser(data.model);
                } else if (data['class'] == 'Message') {
                    updateMessage(data.model);
                }
            });
  41. Socket.IO Server v0.15
    • Clients can push to the server too!
        Client (JavaScript)
            socket.emit('post_message', {source: args.message}, token)

        Server (Python)
            @socketio.on('post_message')
            def on_post_message(data, token):
                verify_token(token, add_to_session=True)
                msg = Message.create(data)
                # ... write message to the database
                push_model(msg)
  42. Socket.IO Server v0.15
    • No need to poll to find disconnected users!
    • To identify the user we use the Flask user session.
        Server (Python)
            @socketio.on('disconnect')
            def on_disconnect():
                nickname = session.get('nickname')
                if nickname:
                    user = User.query.filter_by(nickname=nickname).first()
                    user.online = False
                    # ... write user to the database
                    push_model(user)
  43. Socket.IO + Celery v0.16
    • Like request handlers, Socket.IO event handlers cannot be CPU heavy.
    • Celery saves the day again!
        Socket.IO event handler
            @socketio.on('post_message')
            def on_post_message(data, token):
                verify_token(token)
                if g.current_user:
                    post_message.apply_async(
                        args=(g.current_user.id, data))

        Celery task
            @celery.task
            def post_message(user_id, data):
                from .wsgi_aux import app
                with app.app_context():
                    # query.get() already returns the user (or None)
                    u = User.query.get(user_id)
                    msg = Message.create(data, u)
                    # ... write message to the database
                    push_model(msg)
                    if msg.expand_links():
                        push_model(msg)
  44. Scaling with nginx, Celery and Flask-SocketIO
    [Diagram. Labels: client; nginx (https → http) (wss → ws); server (x5); celery client (x5); msg queue; celery worker (x5); socket.io (x6); socket.io* (x5); database]