
Building a Data-Driven Web App that Everyone Can Use

Presented at PyCon APAC 2018

You're a data scientist with a machine learning model that you want to show everyone. Do you give your users your Python scripts and tell them to run "python mycoolmodel.py"? Is there a better alternative? How about a web app? The speaker will show you how Flask can be the best fit *pun intended* for this case.

Galuh Sahid

June 01, 2018

Transcript

  1. If you want to build one yourself, there are so

    many things that you have to learn
  2. If you want to build one yourself, there are so

    many things that you have to learn (Myth)
  3. + +

  4. About our web app • Allows users to input their

    own data • Displays data from an outside source (e.g. a third-party API) • Displays the prediction result of a model we've trained previously • Displays a dynamic graph based on the user input
  5. About our web app • Allows users to input their

    own data • Displays data from an outside source (e.g. a third-party API) • Displays the prediction result of a model we've trained previously • Displays a dynamic graph based on the user input. Works as a prototype, demo, or simple minimum viable product (MVP); making a scalable web app is a whole different story!
  6. from flask import Flask app = Flask(__name__) @app.route('/') def index():

    return "Hello, world!" app.py $ export FLASK_APP=app.py $ flask run * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) * Restarting with stat Flask
  7. from flask import Flask app = Flask(__name__) @app.route('/') def index():

    return "Hello, world!" @app.route('/result') def get_result(): result = 100000 return str(result) app.py
  8. from flask import Flask, render_template app = Flask(__name__) @app.route('/') def

    index(): return render_template("index.html") @app.route('/result') def get_result(): result = 100000 return str(result) app.py <h1>Hello, world!</h1> templates/index.html
  9. from flask import Flask, render_template app = Flask(__name__) @app.route('/') def

    index(): return render_template("index.html") @app.route('/result') def get_result(): result = 100000 return render_template("result.html", result=result) app.py <h1>{{ result }}</h1> templates/result.html
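
Slides 6 to 9 can be condensed into a single file. The following is a minimal sketch rather than the speaker's exact code: it assumes a current Flask release on Python 3, and it uses `render_template_string` with an inline template (a stand-in for `templates/result.html`) so that it runs without a `templates/` directory. The test client is only there to exercise the route without starting a server.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Inline stand-in for templates/result.html so the sketch fits in one file.
RESULT_TEMPLATE = "<h1>{{ result }}</h1>"


@app.route("/")
def index():
    return "<h1>Hello, world!</h1>"


@app.route("/result")
def get_result():
    result = 100000  # placeholder value, as on the slides
    # Keyword arguments become variables inside the template.
    return render_template_string(RESULT_TEMPLATE, result=result)


# Exercise the route in-process instead of running `flask run`.
client = app.test_client()
result_html = client.get("/result").get_data(as_text=True)
```

In a real project the template would live in `templates/result.html` and the view would call `render_template("result.html", result=result)`, as on slide 9.
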
  10. <h1>Campaign name:</h1> <form action="/result" method="GET"> <input name="campaign" type="text" required> </form>

    templates/index.html Getting user input 127.0.0.1:5000/result?campaign=some-campaign
  11. from flask import Flask, render_template, request ... @app.route('/result') def get_result():

    campaign = request.args.get('campaign', None) return render_template("result.html", campaign=campaign) app.py Getting user input <h1>{{ campaign }}</h1> templates/result.html 127.0.0.1:5000/result?campaign=some-campaign query string
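
A small sketch of how the query string reaches the view, assuming Flask on Python 3. The inline template stands in for `templates/result.html`, and the 400 response for a missing parameter is an assumption; the slides do not say what should happen in that case.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline stand-in for templates/result.html.
RESULT_TEMPLATE = "<h1>{{ campaign }}</h1>"


@app.route("/result")
def get_result():
    # request.args holds the parsed query string (?campaign=...).
    campaign = request.args.get("campaign")
    if not campaign:
        # Assumption: reject requests that omit the parameter.
        return "Missing 'campaign' query parameter", 400
    return render_template_string(RESULT_TEMPLATE, campaign=campaign)


client = app.test_client()
ok_response = client.get("/result?campaign=some-campaign")
missing_response = client.get("/result")
```

Because Jinja2 autoescaping is on for templates rendered this way, user input containing HTML is escaped rather than injected into the page.
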
  12. import json import urllib ... @app.route('/result') def get_result(): campaign =

    request.args.get('campaign', None) base_url = 'http://example.com/api/v1/campaigns?id=' url = "{}{}".format(base_url, campaign) response = urllib.urlopen(url) data = json.loads(response.read()) return render_template("result.html", data=data) app.py Getting some data
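
The slide uses Python 2's `urllib.urlopen`. A rough Python 3 equivalent, as a sketch: `build_campaign_url` and `fetch_campaign` are hypothetical helper names, the `example.com` URL is the placeholder from the slide, and the `opener` parameter is an assumption added so the function can be tried without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder endpoint taken from the slide; not a real API.
BASE_URL = "http://example.com/api/v1/campaigns"


def build_campaign_url(campaign_id, base_url=BASE_URL):
    """Return the API URL with the campaign id as an encoded query parameter."""
    return "{}?{}".format(base_url, urlencode({"id": campaign_id}))


def fetch_campaign(campaign_id, opener=urlopen):
    """Fetch and decode the JSON document for one campaign."""
    with opener(build_campaign_url(campaign_id)) as response:
        # In Python 3, read() returns bytes, so decode before parsing.
        return json.loads(response.read().decode("utf-8"))
```

Using `urlencode` rather than string concatenation keeps campaign names with spaces or special characters from producing a malformed URL.
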
  13. What to do with our model? • We only want

    to use one model with the best accuracy/smallest error
  14. What to do with our model? • We only want

    to use one model with the best accuracy/smallest error • We don't want to retrain our model every time a new request hits our web application
  15. Make it persistent! What to do with our model? •

    pickle: built-in Python module to serialize and de-serialize a Python object structure • joblib: a library that provides utilities for pipelining Python jobs • Or we can use each library's specific method (libsvm: svm_save_model and svm_load_model, TensorFlow: tf.train.Saver() class)
  16. Making our model persistent from sklearn.ensemble import RandomForestRegressor from sklearn.externals

    import joblib rf = RandomForestRegressor(n_estimators=300) rf.fit(X_train, y_train) joblib.dump(rf, 'model.sav') Your original Python script/Jupyter Notebook
  17. import json import pandas as pd from sklearn.externals import joblib

    def get_prediction(data): df_data = pd.DataFrame([data]).astype(float) model = joblib.load("model.sav") predicted_amount = model.predict(df_data)[0] target_amount = data["target_amount"] if (target_amount > predicted_amount): is_funded = False else: is_funded = True prediction = {"amount": predicted_amount, "is_funded": is_funded} return json.dumps(prediction) model.py Making our model persistent
  18. from model import get_prediction ... @app.route('/result') def get_result(): ... data

    = json.loads(response.read()) prediction = json.loads(get_prediction(data)) return render_template("result.html", data=data, prediction=prediction) app.py Making our model persistent
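
On slide 17 the model file is loaded inside `get_prediction`, so it is read from disk on every request. One possible variation, sketched below, loads it once when the module is imported. To stay runnable without scikit-learn, the sketch uses `pickle` (one of the options on slide 15) and a plain dict of linear coefficients as a hypothetical stand-in for the trained estimator; the coefficient values are made up, and `predict_amount` is an illustrative helper.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained estimator: made-up linear coefficients.
trained_model = {
    "intercept": 1000.0,
    "weights": {"number_of_fb_shares": 50.0, "number_of_images": 200.0},
}

MODEL_PATH = os.path.join(tempfile.mkdtemp(), "model.sav")

# Training script / notebook side: persist the model once.
with open(MODEL_PATH, "wb") as f:
    pickle.dump(trained_model, f)

# Web app side: load once at import time, not once per request.
# Only unpickle files you trust; unpickling can execute arbitrary code.
with open(MODEL_PATH, "rb") as f:
    MODEL = pickle.load(f)


def predict_amount(model, data):
    """Apply the stand-in linear model to one campaign's features."""
    return model["intercept"] + sum(
        weight * float(data[name]) for name, weight in model["weights"].items()
    )


def get_prediction(data):
    """Return the predicted amount and whether it reaches the target."""
    predicted_amount = predict_amount(MODEL, data)
    # Same rule as slide 17: not funded when the target exceeds the prediction.
    is_funded = predicted_amount >= float(data["target_amount"])
    return {"amount": predicted_amount, "is_funded": is_funded}
```

This version returns a dict directly; the slides serialize to JSON in `model.py` and parse it again in `app.py`, which is only needed if the two live in separate processes.
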
  19. Some pitfalls • Make sure you're loading trusted data •

    Saving a model using a particular version of a library and loading it using another version might give unexpected results
  20. Some pitfalls • Make sure you're loading trusted data •

    Saving a model using a particular version of a library and loading it using another version might give unexpected results. So what to do? Keep a record of: • Training data • Source code that generates the model • Version of the library used • Dependencies used • Cross-validation score obtained
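
One way to act on this checklist is to write a small metadata record next to the saved model. The sketch below is an illustration only: `build_model_metadata`, `installed_version`, and the field names are hypothetical, and it assumes Python 3.8+ for `importlib.metadata`.

```python
import hashlib
import json
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def build_model_metadata(training_data_bytes, packages, cv_score):
    """Collect the facts needed to reproduce or debug a saved model."""
    return {
        "python_version": platform.python_version(),
        "library_versions": {name: installed_version(name) for name in packages},
        # A hash identifies the exact training data without storing it here.
        "training_data_sha256": hashlib.sha256(training_data_bytes).hexdigest(),
        "cv_score": cv_score,
    }


def save_model_metadata(path, model_metadata):
    """Write the record as JSON, e.g. alongside model.sav."""
    with open(path, "w") as f:
        json.dump(model_metadata, f, indent=2)
```

A call such as `build_model_metadata(open("train.csv", "rb").read(), ["scikit-learn", "pandas"], 0.87)` would then capture the library versions in use at training time; the file name and score here are placeholders.
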
  21. import matplotlib.pyplot as plt y = [1, 2, 3, 4,

    5] x = [0, 2, 1, 3, 4] plt.plot(x, y) Your original Python script/Jupyter Notebook Data visualization
  22. Data visualization import matplotlib.pyplot as plt import StringIO import base64

    import json def get_plot_url(fb_shares): y = [1, 2, 3, 4, fb_shares] x = [0, 2, 1, 3, 4] plt.plot(x, y) img = StringIO.StringIO() plt.savefig(img, format='png') img.seek(0) plot_url = base64.b64encode(img.getvalue()) return json.dumps({'plot_url': plot_url}) graph.py
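
The slide's code targets Python 2 (`StringIO`). A Python 3 sketch of the same idea follows, assuming matplotlib is installed: the figure is written to an in-memory `io.BytesIO` buffer and base64-encoded. The `Agg` backend and the `.decode("ascii")` step are assumptions added so that the function works on a server without a display and returns a `str` that can be placed in a template.

```python
import base64
import io

import matplotlib

matplotlib.use("Agg")  # headless backend: no display needed on a server
import matplotlib.pyplot as plt


def get_plot_url(fb_shares):
    """Render the plot to PNG in memory and return it base64-encoded."""
    x = [0, 2, 1, 3, 4]  # sample values from the slide
    y = [1, 2, 3, 4, fb_shares]
    fig, ax = plt.subplots()
    ax.plot(x, y)
    buffer = io.BytesIO()  # PNG data is bytes, so BytesIO rather than StringIO
    fig.savefig(buffer, format="png")
    plt.close(fig)  # free the figure; otherwise figures pile up across requests
    # b64encode returns bytes in Python 3; decode to get a str for the template.
    return base64.b64encode(buffer.getvalue()).decode("ascii")
```

The returned string can be embedded in an `<img>` tag as a `data:image/png;base64,...` URI, which avoids writing image files to disk.
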
  23. <head> <title>Campaign Success Estimator</title> </head> <body> <h1>{{ data["id"] }}</h1> <h2>Statistics</h2>

    <ul> <li><strong>Story word count:</strong> {{ data["story_word_count"] }}</li> <li><strong>Number of images:</strong> {{ data["number_of_images"] }}</li> <li><strong>Number of videos:</strong> {{ data["number_of_videos"] }}</li> <li><strong>Number of Facebook shares:</strong> {{ data["number_of_fb_shares"] }}</li> <li><strong>Target amount:</strong> {{ data["target_amount"] }}</li> </ul> <h2>Prediction</h2> <strong>Predicted amount:</strong> {{ prediction["amount"] }} </body> templates/result.html Putting everything together
  24. <head> <title>Campaign Success Estimator</title> </head> <body> ... <img src="data:image/png;base64, {{

    graph['plot_url'] }}" /> </body> templates/result.html Putting everything together
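
The slides do not show the view passing `graph` to the template. A minimal sketch of how the pieces might be wired together, assuming Flask on Python 3: `fetch_campaign`, `get_prediction`, and `get_plot_url` below are stubs returning hard-coded placeholder values (including a fake base64 string), standing in for the API call, the model, and the plotting code.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Shortened inline stand-in for templates/result.html.
RESULT_TEMPLATE = """<h1>{{ data["id"] }}</h1>
<strong>Predicted amount:</strong> {{ prediction["amount"] }}
<img src="data:image/png;base64,{{ graph['plot_url'] }}" />"""


def fetch_campaign(campaign):
    # Stub: a real app would call the third-party API here.
    return {"id": campaign, "number_of_fb_shares": 12, "target_amount": 5000}


def get_prediction(data):
    # Stub: a real app would call the persisted model here.
    return {"amount": 7500, "is_funded": True}


def get_plot_url(fb_shares):
    # Stub: placeholder base64 text, not a real PNG.
    return {"plot_url": "aGVsbG8="}


@app.route("/result")
def get_result():
    campaign = request.args.get("campaign")
    data = fetch_campaign(campaign)
    prediction = get_prediction(data)
    graph = get_plot_url(data["number_of_fb_shares"])
    # Each keyword argument becomes a variable available in the template.
    return render_template_string(
        RESULT_TEMPLATE, data=data, prediction=prediction, graph=graph
    )


body = app.test_client().get("/result?campaign=abc").get_data(as_text=True)
```
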
  25. Flask's templating engine: Jinja2 <title>{% block title %}{% endblock %}</title>

    <ul> {% for user in users %} <li><a href="{{ user.url }}">{{ user.username }} </a></li> {% endfor %} </ul>
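
The two Jinja2 features on this slide, blocks and `for` loops, can be tried outside Flask with the `jinja2` package directly. This is a sketch under assumptions: the template names `base.html` and `users.html` and the sample user are made up, and `DictLoader` is used only to keep the templates in one file.

```python
from jinja2 import DictLoader, Environment

templates = {
    # Base layout: defines an empty "title" block and loops over users.
    "base.html": (
        "<title>{% block title %}{% endblock %}</title>\n"
        "<ul>{% for user in users %}"
        '<li><a href="{{ user.url }}">{{ user.username }}</a></li>'
        "{% endfor %}</ul>"
    ),
    # Child template: inherits the layout and fills in the title block.
    "users.html": "{% extends 'base.html' %}{% block title %}Users{% endblock %}",
}

env = Environment(loader=DictLoader(templates), autoescape=True)

html = env.get_template("users.html").render(
    users=[{"url": "/users/ada", "username": "ada"}]
)
```

In a Flask app the same templates would sit in the `templates/` directory and be rendered with `render_template`.
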
  26. Conditionals <head> <title>Campaign Success Estimator</title> <style> html { font-family: "Arial"

    } .prediction { margin: 10px; } .prediction .funded { color: #2ecc71; } .prediction .not-funded { color: #e74c3c; } </style> </head> ... templates/result.html
  27. Conditionals ... <div class="prediction"> <strong>Predicted amount: </strong> <span class="prediction {{'funded'

    if prediction['is_funded'] else 'not-funded'}}"> {{ prediction["amount"] }} </span> </div> ... templates/result.html <span class="prediction funded"> if is_funded is True <span class="prediction not-funded"> if is_funded is False
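
The inline conditional can be checked in isolation with the `jinja2` package. A minimal sketch, where `render_prediction` is a hypothetical helper and the amounts are sample values:

```python
from jinja2 import Template

# Same inline if/else expression as in the slide's result.html.
PREDICTION_TEMPLATE = Template(
    "<span class=\"prediction {{ 'funded' if prediction['is_funded'] else 'not-funded' }}\">"
    "{{ prediction['amount'] }}</span>"
)


def render_prediction(prediction):
    """Render the prediction span with a CSS class chosen by is_funded."""
    return PREDICTION_TEMPLATE.render(prediction=prediction)


funded_html = render_prediction({"amount": 1900, "is_funded": True})
not_funded_html = render_prediction({"amount": 1200, "is_funded": False})
```
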
  28. Custom filters ... <li><strong>Target amount:</strong> {{ data["target_amount"]| format_currency }}</li> </ul>

    <h2>Prediction</h2> <div class="prediction"> <strong>Predicted amount: </strong> <span class="prediction {{'funded' if prediction['is_funded'] else 'not-funded'}}"> {{ prediction["amount"]|format_currency }} </span> </div> ... templates/result.html
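
The transcript does not include the definition of `format_currency`. One way such a filter could be registered in Flask is sketched below; the output format (thousands separators, no decimals, no currency symbol) is purely an assumption.

```python
from flask import Flask

app = Flask(__name__)


@app.template_filter("format_currency")
def format_currency(value):
    """Format a number with thousands separators and no decimals (assumed format)."""
    return "{:,.0f}".format(float(value))


# Render a one-line template through the app's Jinja2 environment to try the filter.
template = app.jinja_env.from_string("{{ amount|format_currency }}")
rendered = template.render(amount=1234567.8)
```

Once registered, the filter is available in every template rendered by the app, so `{{ data["target_amount"]|format_currency }}` works without further imports.
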
  29. What's next? • Deploy it and share it with the

    world: • Heroku • Google App Engine • And many other options • Add more functionality: • Flask-Admin • Flask-Login • ... and so on
  30. What's next? • Make it more interactive • react-flask •

    react-redux-flask • flask-vuejs • flask + d3.js • Make it more scalable