Building a Data-Driven Web App that Everyone Can Use

Presented at PyCon APAC 2018

You're a data scientist with a machine learning model that you want to show everyone. Do you give your users your Python scripts and tell them to run "python mycoolmodel.py"? Is there a better alternative, such as a web app? The speaker shows how Flask can be the best fit (pun intended) for this case.

Galuh Sahid

June 01, 2018

Transcript

  1. Building a Data-Driven Web App That Everyone Can Use (galuh.me | @galuhsahid)
  2. What is a data-driven web application?

  3. Let's start with a problem

  4. None
  5. None
  6. None
  7. None
  8. $ git clone
     $ jupyter notebook
     $ python mycoolmodel.py

  9. $ git clone
     $ jupyter notebook
     $ python mycoolmodel.py

  10. Web applications are super cool

  11. It takes a lot of people to build one

  12. It takes a lot of people to build one. (Myth)

  13. If you want to build one yourself, there are so many things that you have to learn
  14. If you want to build one yourself, there are so many things that you have to learn. (Myth)
  15. You can build one yourself. (Fact)

  16. You can build one yourself (Fact) ... and it doesn't have to take forever
  17. + +

  18. About our web app

    • Allows users to input their own data
    • Displays data from an outside source (e.g. a third-party API)
    • Displays the prediction result of a model we've trained previously
    • Displays a dynamic graph based on the user input
  19. About our web app

    • Allows users to input their own data
    • Displays data from an outside source (e.g. a third-party API)
    • Displays the prediction result of a model we've trained previously
    • Displays a dynamic graph based on the user input

    Works as a prototype, demo, or simple minimum viable product (MVP). Making a scalable web app is a whole different story!
  20. From something like this: To something like this:

  21. Flask

    app.py:

        from flask import Flask

        app = Flask(__name__)

        @app.route('/')
        def index():
            return "Hello, world!"

    Running it:

        $ export FLASK_APP=app.py
        $ flask run
         * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
         * Restarting with stat
  22. app.py:

        from flask import Flask

        app = Flask(__name__)

        @app.route('/')
        def index():
            return "Hello, world!"
  23. app.py:

        from flask import Flask

        app = Flask(__name__)

        @app.route('/')
        def index():
            return "Hello, world!"

        @app.route('/result')
        def get_result():
            result = 100000
            return str(result)
  24. app.py:

        from flask import Flask, render_template

        app = Flask(__name__)

        @app.route('/')
        def index():
            return render_template("index.html")

        @app.route('/result')
        def get_result():
            result = 100000
            return str(result)

    templates/index.html:

        <h1>Hello, world!</h1>
  25. app.py:

        from flask import Flask, render_template

        app = Flask(__name__)

        @app.route('/')
        def index():
            return render_template("index.html")

        @app.route('/result')
        def get_result():
            result = 100000
            # Pass the value to the template as the variable "result"
            return render_template("result.html", result=result)

    templates/result.html:

        <h1>{{ result }}</h1>
  26. Getting user input

  27. Getting user input

    templates/index.html:

        <h1>Campaign name:</h1>
        <form action="/result" method="GET">
          <input name="campaign" type="text" required>
        </form>

    Resulting URL: 127.0.0.1:5000/result?campaign=some-campaign
  28. Getting user input

    app.py:

        from flask import Flask, render_template, request
        ...
        @app.route('/result')
        def get_result():
            # Read "campaign" from the query string (None if it is absent)
            campaign = request.args.get('campaign', None)
            return render_template("result.html", campaign=campaign)

    templates/result.html:

        <h1>{{ campaign }}</h1>

    URL: 127.0.0.1:5000/result?campaign=some-campaign (the part after "?" is the query string)
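One way to check the query-string handling without opening a browser is Flask's built-in test client. The sketch below is a simplified variant of the route above, not the speaker's code: it returns plain text instead of rendering result.html, and the 400 response for a missing parameter is an assumption added for illustration.

```python
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)


@app.route('/result')
def get_result():
    # request.args holds the parsed query string; None if the key is absent
    campaign = request.args.get('campaign', None)
    if campaign is None:
        # Hypothetical error handling, not part of the original slides
        return "Missing 'campaign' parameter", 400
    # Escape user input before echoing it back in an HTML response
    return "Campaign: {}".format(escape(campaign))


# The test client sends requests to the app without starting a server
client = app.test_client()
ok = client.get('/result?campaign=some-campaign')
missing = client.get('/result')

print(ok.status_code, ok.get_data(as_text=True))
print(missing.status_code)
```

In the real app, render_template escapes template variables automatically for .html templates, so the explicit escape call is only needed here because the sketch builds the response by hand.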
  29. Getting some data

  30. Getting some data example.com/api/v1/campaigns?name=some-campaign

  31. Getting some data

    app.py:

        import json
        from urllib.parse import urlencode
        from urllib.request import urlopen  # Python 3 home of urlopen
        ...
        @app.route('/result')
        def get_result():
            campaign = request.args.get('campaign', None)
            base_url = 'http://example.com/api/v1/campaigns'
            # Build the query string from the previous slide (?name=...)
            url = "{}?{}".format(base_url, urlencode({'name': campaign}))
            response = urlopen(url)
            data = json.loads(response.read())
            return render_template("result.html", data=data)
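Because the campaign name comes from user input, it may contain spaces or characters such as "&" that would break a URL built by plain string concatenation. A minimal sketch of building the URL with the standard library follows; the endpoint is the placeholder from the slides and build_campaign_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder endpoint taken from the slides; not a real API
BASE_URL = 'http://example.com/api/v1/campaigns'


def build_campaign_url(name):
    """Return the API URL for a campaign, percent-encoding the name."""
    return "{}?{}".format(BASE_URL, urlencode({'name': name}))


print(build_campaign_url('some-campaign'))
print(build_campaign_url('save the turtles & reefs'))
```

urlencode turns spaces into "+" and "&" into "%26", so the second name still arrives at the API as a single parameter value.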
  32. Getting some data

    result.html: {{ data }}

  33. What to do with our model?

  34. What to do with our model?

  35. What to do with our model?

    • We only want to use one model with the best accuracy/smallest error
  36. What to do with our model?

    • We only want to use one model with the best accuracy/smallest error
    • We don't want to retrain our model every time a new request hits our web application
  37. What to do with our model? Make it persistent!

    • pickle: built-in Python module to serialize and de-serialize a Python object structure
    • joblib: a library that provides utilities for pipelining Python jobs
    • Or we can use each library's specific method (libsvm: svm_save_model and svm_load_model; TensorFlow: the tf.train.Saver() class)
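The pickle option can be sketched with the standard library alone. The "model" below is a hypothetical stand-in (a dict of parameters for y = slope * x + intercept) rather than a real trained estimator, and the file is written to a temporary directory so the example leaves nothing behind.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: parameters of y = slope * x + intercept
trained_model = {"slope": 2.0, "intercept": 1.0}


def predict(model, x):
    """Apply the stored parameters to a single input value."""
    return model["slope"] * x + model["intercept"]


# Serialize the fitted parameters once, right after training
model_path = os.path.join(tempfile.mkdtemp(), "model.sav")
with open(model_path, "wb") as f:
    pickle.dump(trained_model, f)

# Later, for example in the web app, load them back instead of retraining
with open(model_path, "rb") as f:
    restored_model = pickle.load(f)

print(predict(restored_model, 3))
```

The same dump-once, load-later pattern applies to a scikit-learn estimator saved with joblib.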
  38. Making our model persistent

    Your original Python script/Jupyter Notebook:

        import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
        from sklearn.ensemble import RandomForestRegressor

        rf = RandomForestRegressor(n_estimators=300)
        rf.fit(X_train, y_train)
        joblib.dump(rf, 'model.sav')
  39. Making our model persistent

    model.py:

        import json

        import joblib
        import pandas as pd

        def get_prediction(data):
            df_data = pd.DataFrame([data]).astype(float)
            model = joblib.load("model.sav")
            # Convert to built-in types so json.dumps can serialize them
            predicted_amount = float(model.predict(df_data)[0])
            target_amount = float(data["target_amount"])
            # Funded when the predicted amount reaches the target
            is_funded = predicted_amount >= target_amount
            prediction = {"amount": predicted_amount, "is_funded": is_funded}
            return json.dumps(prediction)
  40. Making our model persistent

    app.py:

        from model import get_prediction
        ...
        @app.route('/result')
        def get_result():
            ...
            data = json.loads(response.read())
            prediction = json.loads(get_prediction(data))
            return render_template("result.html", data=data, prediction=prediction)
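In model.py the saved model is read from disk inside get_prediction, so the file is loaded again on every request. One possible refinement, which the slides do not show, is to cache the loaded object. The sketch below uses functools.lru_cache with pickle and a made-up stand-in model that always predicts the same amount; load_model and the stored value are assumptions for illustration.

```python
import functools
import os
import pickle
import tempfile

# Write a stand-in "model" to disk so the sketch is self-contained
MODEL_PATH = os.path.join(tempfile.mkdtemp(), "model.sav")
with open(MODEL_PATH, "wb") as f:
    pickle.dump({"predicted_amount": 150000.0}, f)


@functools.lru_cache(maxsize=1)
def load_model():
    """Read the model from disk on the first call; reuse it afterwards."""
    with open(MODEL_PATH, "rb") as f:
        return pickle.load(f)


def get_prediction(data):
    model = load_model()
    predicted_amount = model["predicted_amount"]
    is_funded = predicted_amount >= data["target_amount"]
    return {"amount": predicted_amount, "is_funded": is_funded}


first = get_prediction({"target_amount": 100000})
second = get_prediction({"target_amount": 200000})

# cache_info() shows the file was read once and reused once
print(first, second, load_model.cache_info())
```

Loading the model into a module-level variable at import time would achieve the same effect; the cache simply defers the load until the first request.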
  41. Making our model persistent

    result.html: {{ data }} {{ prediction }}

    The rendered prediction includes 'is_funded': False
  42. Some pitfalls...

  43. Some pitfalls • Make sure you're loading trusted data

  44. Some pitfalls

    • Make sure you're loading trusted data
    • Saving a model using a particular version of a library and loading it using another version might give unexpected results
  45. Some pitfalls

    • Make sure you're loading trusted data
    • Saving a model using a particular version of a library and loading it using another version might give unexpected results

    So what to do? Keep your:

    • Training data
    • Source code that generates the model
    • Version of the library used
    • Dependencies used
    • Cross-validation score obtained
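One lightweight way to keep that information next to the saved model is a small JSON file written at training time. This is only a sketch under stated assumptions: save_model_metadata, the field names, and every value passed in are hypothetical placeholders, not something the talk prescribes.

```python
import json
import os
import platform
import tempfile


def save_model_metadata(path, library_versions, cv_score, training_data, source_commit):
    """Write a JSON file describing how a saved model was produced."""
    metadata = {
        "python_version": platform.python_version(),
        "library_versions": library_versions,
        "cross_validation_score": cv_score,
        "training_data": training_data,
        "source_commit": source_commit,
    }
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2, sort_keys=True)
    return metadata


# All values below are made-up placeholders for illustration
metadata_path = os.path.join(tempfile.mkdtemp(), "model.sav.json")
saved = save_model_metadata(
    metadata_path,
    library_versions={"scikit-learn": "x.y.z"},
    cv_score=0.0,
    training_data="data/train.csv",
    source_commit="<commit hash>",
)

# Reading the file back recovers the same record
with open(metadata_path) as f:
    loaded = json.load(f)

print(loaded["library_versions"])
```

Before loading a model in the web app, the recorded versions can be compared with the installed ones to catch a mismatch early.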
  46. Data visualization

  47. Data visualization

    Your original Python script/Jupyter Notebook:

        import matplotlib.pyplot as plt

        y = [1, 2, 3, 4, 5]
        x = [0, 2, 1, 3, 4]
        plt.plot(x, y)
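  48. Data visualization

    graph.py:

        import base64
        import io
        import json

        import matplotlib
        matplotlib.use('Agg')  # render without a display, as on a server
        import matplotlib.pyplot as plt

        def get_plot_url(fb_shares):
            y = [1, 2, 3, 4, fb_shares]
            x = [0, 2, 1, 3, 4]
            plt.figure()  # start a fresh figure for every call
            plt.plot(x, y)
            img = io.BytesIO()  # Python 3 replacement for StringIO.StringIO
            plt.savefig(img, format='png')
            plt.close()
            # b64encode returns bytes; decode so the value is JSON-serializable
            plot_url = base64.b64encode(img.getvalue()).decode('ascii')
            return json.dumps({'plot_url': plot_url})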
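The base64 step is what lets the image travel inside the HTML page instead of being served as a separate file. A minimal standard-library sketch of that encoding follows; to_data_uri is a hypothetical helper, and the eight bytes used as input are just the PNG file signature standing in for a full rendered plot.

```python
import base64

# First 8 bytes of every PNG file; stands in for a full rendered plot
png_bytes = b"\x89PNG\r\n\x1a\n"


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data URI usable in an <img> src attribute."""
    # b64encode returns bytes in Python 3, so decode to get a str
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:{};base64,{}".format(mime_type, encoded)


uri = to_data_uri(png_bytes)
print(uri)
```

Decoding the part after the comma with base64.b64decode gives back the original bytes, which is what the browser does when it displays the image.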
  49. Data visualization

    result.html: {{ data }} {{ prediction }} {{ graph }}
  50. Putting everything together

  51. Putting everything together

    templates/result.html:

        <head>
          <title>Campaign Success Estimator</title>
        </head>
        <body>
          <h1>{{ data["id"] }}</h1>
          <h2>Statistics</h2>
          <ul>
            <li><strong>Story word count:</strong> {{ data["story_word_count"] }}</li>
            <li><strong>Number of images:</strong> {{ data["number_of_images"] }}</li>
            <li><strong>Number of videos:</strong> {{ data["number_of_videos"] }}</li>
            <li><strong>Number of Facebook shares:</strong> {{ data["number_of_fb_shares"] }}</li>
            <li><strong>Target amount:</strong> {{ data["target_amount"] }}</li>
          </ul>
          <h2>Prediction</h2>
          <strong>Predicted amount:</strong> {{ prediction["amount"] }}
        </body>
  52. Putting everything together

  53. Putting everything together

    templates/result.html:

        <head>
          <title>Campaign Success Estimator</title>
        </head>
        <body>
          ...
          <img src="data:image/png;base64,{{ graph['plot_url'] }}" />
        </body>
  54. Putting everything together

  55. Flask's templating engine: Jinja2

        <title>{% block title %}{% endblock %}</title>
        <ul>
        {% for user in users %}
          <li><a href="{{ user.url }}">{{ user.username }}</a></li>
        {% endfor %}
        </ul>
  56. Jinja2 examples jinja.pocoo.org/docs/2.10/

  57. Conditionals

    templates/result.html:

        <head>
          <title>Campaign Success Estimator</title>
          <style>
            html { font-family: "Arial"; }
            .prediction { margin: 10px; }
            .prediction .funded { color: #2ecc71; }
            .prediction .not-funded { color: #e74c3c; }
          </style>
        </head>
        ...
  58. Conditionals

    templates/result.html:

        ...
        <div class="prediction">
          <strong>Predicted amount: </strong>
          <span class="prediction {{ 'funded' if prediction['is_funded'] else 'not-funded' }}">
            {{ prediction["amount"] }}
          </span>
        </div>
        ...

    Renders <span class="prediction funded"> if prediction['is_funded'] is True,
    and <span class="prediction not-funded"> if it is False.
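The inline if expression can be tried outside Flask by rendering a template string with Jinja2 directly. The sketch below is a cut-down version of the span from the slide; the amounts are made-up values.

```python
from jinja2 import Template

# Same inline conditional as on the slide, reduced to the <span> element
template = Template(
    '<span class="prediction '
    '{{ "funded" if prediction["is_funded"] else "not-funded" }}">'
    '{{ prediction["amount"] }}</span>'
)

funded_html = template.render(prediction={"amount": 120000, "is_funded": True})
not_funded_html = template.render(prediction={"amount": 80000, "is_funded": False})

print(funded_html)
print(not_funded_html)
```

Only the class name changes between the two renderings, so the colour is controlled entirely by the CSS rules for .funded and .not-funded.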
  59. Conditionals

  60. Conditionals

  61. Custom filters

    app.py:

        ...
        @app.template_filter('format_currency')
        def format_currency(value):
            # Format a number as rupiah with thousands separators
            value = int(value)
            return "Rp{:,}".format(value)
        ...
  62. Custom filters

    templates/result.html:

        ...
          <li><strong>Target amount:</strong> {{ data["target_amount"]|format_currency }}</li>
        </ul>
        <h2>Prediction</h2>
        <div class="prediction">
          <strong>Predicted amount: </strong>
          <span class="prediction {{ 'funded' if prediction['is_funded'] else 'not-funded' }}">
            {{ prediction["amount"]|format_currency }}
          </span>
        </div>
        ...
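To see the filter in action without the full app, a template string can be rendered inside an application context. This is a minimal sketch; the amount is a made-up value and the one-line template stands in for result.html.

```python
from flask import Flask, render_template_string

app = Flask(__name__)


@app.template_filter('format_currency')
def format_currency(value):
    # Drop decimals and add thousands separators, prefixed with "Rp" (rupiah)
    return "Rp{:,}".format(int(value))


# render_template_string needs an application context outside a request
with app.app_context():
    rendered = render_template_string(
        "{{ amount|format_currency }}", amount=1500000.75
    )

print(rendered)
```

Because the decorator leaves format_currency as an ordinary function, it can also be called and unit-tested directly, without rendering a template.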
  63. Custom filters

  64. What's next?

    • Deploy it and share it with the world:
        • Heroku
        • Google App Engine
        • And many other options
    • Add more functionality:
        • Flask Admin
        • Flask Login
        • ... and so on
  65. What's next?

    • Make it more interactive
        • react-flask
        • react-redux-flask
        • flask-vuejs
        • flask + d3.js
    • Make it more scalable
  66. Examples

  67. Campaign Success Predictor (Flask)

    Source code: https://github.com/galuhsahid/campaign-success-predictor
    Paper: https://ieeexplore.ieee.org/document/8355046/

  68. Campaign Success Predictor (Flask)

    Paper: https://ieeexplore.ieee.org/document/8355046/
    Source code: https://github.com/galuhsahid/campaign-success-predictor

  69. Indonesian Word Embedding (http://indonesian-word-embedding.herokuapp.com)

    Built with Flask + Vue.js
    Source code: https://github.com/galuhsahid/indonesian-word-embedding

  70. Resources • Flask documentation • Jinja2 documentation

  71. That's it. Thanks!