Challenges developing meuParlamento.pt

Presented at HostelWorld

This presentation describes the motivation behind the project and the challenges of developing it.

Projeto meuParlamento.pt

July 19, 2019

Transcript

  1. Nuno Moniz (INESC TEC / Universidade do Porto)
     Arian Pasquali (INESC TEC)
     Tomás Amaro (HostelWorld)
  2. First election / Last election
     In Portugal, abstention levels in legislative elections went from 8% in 1975 to 44% in 2015.
  3. Quizzers
     During election periods we observe the rise of quizzes to find out more about your political orientation. They might be useful in some cases, but there are some serious issues related to both bias and privacy.
  4. Our goals
     • Increase engagement of citizens in politics
     • Explore new ways to interact with the Parliament
     • Allow citizens to simulate the Parliament on their smartphones
     • Encourage healthy political debate
     • Keep its usage anonymous
  5. 1. Privacy and anonymity are the most important values.
     No registration needed. We do not store and do not share how users vote.
  6. 3. Always open source and open data.
     We encourage users to contribute.
  7. 4. Always beta.
     Fork it and contribute. Always open for contributions.
  9. meuParlamento.pt
     • Vote on real legislative proposals
     • See which political parties vote like you
     • Explore the biographies of MPs
     • Share results with your friends
  10. How it works
      With three simple gestures:
      • Against: swipe left if you are against
      • In favor: swipe right if you approve
      • Abstention: swipe up
      Users can skip if they prefer not to vote.
  11. Results
      After voting on 10 proposals you get to know:
      • Which political parties voted the same way you did;
      • Detailed information about each proposal:
        • Original full-text PDF;
        • MPs’ biographies;
        • Related news;
      • Share results with your friends.
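The slides do not say how the party comparison is computed. A minimal sketch, assuming each proposal comes with the recorded vote of every party, is to count for each party the share of proposals on which it voted like the user. The function name, vote labels, and data shapes here are hypothetical. Nothing in it needs a server, which would fit the requirement of not storing how users vote.

```python
from collections import Counter

# Hypothetical vote labels; the slides only name the three options.
IN_FAVOR, AGAINST, ABSTENTION = "in_favor", "against", "abstention"


def party_agreement(user_votes, party_votes):
    """Return {party: fraction of proposals where the party voted like the user}.

    user_votes:  {proposal_id: vote}
    party_votes: {proposal_id: {party: vote}}
    Proposals the user skipped are simply absent from user_votes.
    """
    matches, totals = Counter(), Counter()
    for proposal_id, user_vote in user_votes.items():
        for party, party_vote in party_votes.get(proposal_id, {}).items():
            totals[party] += 1
            if party_vote == user_vote:
                matches[party] += 1
    return {party: matches[party] / totals[party] for party in totals}


# Illustrative data only: two proposals, two made-up parties.
user = {"p1": IN_FAVOR, "p2": AGAINST}
parties = {
    "p1": {"A": IN_FAVOR, "B": AGAINST},
    "p2": {"A": AGAINST, "B": AGAINST},
}
print(party_agreement(user, parties))  # {'A': 1.0, 'B': 0.5}
```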
  12. Privacy requirements
      • The system should not store any information about how users vote;
      • No registration or sign-up needed;
      • Anonymous usage;
      • Open source project.
  13. Overview
      • Sources: parlamento.pt, arquivo.pt
      • Data processing pipeline (Luigi as pipeline framework): web scraping, refinement, text summarization, etc.
      • Storage: MongoDB, about 3,000 proposals
      • HTTP API (AWS Chalice). Endpoint: provide 10 random proposals to vote on
      • Mobile app: React Native
  14. Mobile app - Version 1
      Big mess with massive chunks of code
      • Most of the logic was under App.js
      • 3 components
      • Modals to show extra info like “News” and “Authors”
      • No navigation
  17. Mobile app - Version 2
      The beta, almost a release candidate
      • Separation into smaller components
      • Navigation implemented with React Navigation
      • Screen/component composition
      • Notification service
      • Updated version of React Native
      • New proposal button
      • Extra context on proposal cards
      • A lot of bug fixing […a lot]
  18. Developing mobile UI
      Cross-platform development using React Native
      • Support Android and iOS simultaneously
  19. Exploratory data analysis
      Imbalanced distributions
      • We want to provide a fair random method. It is not as easy as it sounds:
        • The number of proposals per political party is imbalanced
        • We need to take into account majority and opposition in the Parliament for each proposal debate.
      Figure: proposals aggregated by political party
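The slides state the fairness problem but not the method that was adopted. One possible approach, sketched here purely as an assumption, is to pick a party uniformly at random on each draw and then pick one of that party's remaining proposals, so a party with many proposals does not dominate the sample. The function name and the "id" and "party" keys are hypothetical, and the sketch does not address the majority versus opposition concern.

```python
import random
from collections import defaultdict


def balanced_sample(proposals, k=10, seed=None):
    """Draw up to k distinct proposals, giving each party an equal chance per draw.

    proposals: list of dicts with hypothetical keys "id" and "party".
    """
    rng = random.Random(seed)
    by_party = defaultdict(list)
    for proposal in proposals:
        by_party[proposal["party"]].append(proposal)

    sample = []
    while by_party and len(sample) < k:
        party = rng.choice(sorted(by_party))  # uniform over parties, not proposals
        bucket = by_party[party]
        sample.append(bucket.pop(rng.randrange(len(bucket))))
        if not bucket:  # party exhausted: stop drawing from it
            del by_party[party]
    return sample


# Illustrative data: party "A" has nine times as many proposals as party "B".
demo = [{"id": i, "party": "A" if i < 90 else "B"} for i in range(100)]
picked = balanced_sample(demo, k=10, seed=1)
print(sorted(p["party"] for p in picked))
```

With plain uniform sampling over this data, about nine in ten picks would come from party "A"; here each draw is split evenly between the parties that still have proposals left.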
  20. Data processing pipeline
      From plain Python scripts to task management
      • Tasks
        • Web scraping proposals from the parlamento.pt website
        • Download each proposal’s PDF
        • Parse the PDF, extracting textual content
        • Summarize PDF content
        • Analyze text readability
        • Index into a MongoDB collection
      • Why Luigi *?
        • Manages task dependencies
        • Task execution monitor
      * https://github.com/spotify/luigi
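To illustrate what "manages task dependencies" means, here is a minimal sketch that uses Python's standard `graphlib` module rather than Luigi itself. It only shows the idea of running each task after the tasks it depends on. The task names and the dependency edges are assumptions based on the task list above, not the project's actual pipeline definition.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (names are illustrative).
PIPELINE = {
    "scrape_proposals": set(),
    "download_pdf": {"scrape_proposals"},
    "parse_pdf": {"download_pdf"},
    "summarize": {"parse_pdf"},
    "analyze_readability": {"parse_pdf"},
    "index_into_mongodb": {"summarize", "analyze_readability"},
}


def execution_order(pipeline):
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(pipeline).static_order())


order = execution_order(PIPELINE)
print(order)
```

A framework such as Luigi adds more on top of this ordering, for example skipping tasks whose output already exists and monitoring execution, which plain scripts do not provide.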
  21. Backend requirements
      meuParlamento API: HTTP API, JSON
      • Provide these simple endpoints
        1. Random proposals
        2. Proposals’ related news
        3. Proposals’ authors
      • Easy to deploy
      • Monitoring
      • Cheap to scale
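As a rough sketch of the first endpoint, the function below builds a JSON body with 10 proposals drawn uniformly at random from an in-memory list. The function name, the "proposals" key, and the record fields are hypothetical; the slides do not show the real response format, and uniform sampling is only the simplest baseline.

```python
import json
import random


def random_proposals_response(proposals, k=10, rng=random):
    """Build a JSON body with up to k distinct proposals chosen uniformly."""
    chosen = rng.sample(proposals, min(k, len(proposals)))
    return json.dumps({"proposals": chosen}, ensure_ascii=False)


# Illustrative catalogue of about the size mentioned in the talk.
catalogue = [{"id": i, "title": f"Proposal {i}"} for i in range(3000)]
body = json.loads(random_proposals_response(catalogue))
print(len(body["proposals"]))  # 10
```

Keeping the selection logic in a plain function like this makes it independent of the web framework, which matters in a backend that moved from a static file to Flask and then to a serverless setup.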
  22. Backend API - Version 1
      Plain JSON file
      • Our first version was a prototype with a simple JSON file on a remote server
      • Quick and dirty
      • Update using FTP
      • All logic in the client UI
  23. Backend API - Version 2
      Flask-based web API
      • Flask as Python web framework
      • Data manipulation using Pandas
        • Load JSON file in memory
        • Use DataFrame data structure
        • In-memory SQL support
        • Support sampling select
      • Deployment on Heroku
        • Pros
          • Easy to deploy using GitHub integration
        • Cons
          • Free tier wasn’t reliable for production
          • Too expensive to scale

  24. Backend API - Version 2.1
      Flask-based API
      • Flask as Python web framework
      • Data manipulation using Pandas
      • MongoDB as database
      • Code deployment on Heroku
      • Pros
        • It is a database :)
        • Better query support
      • Cons
        • Bottleneck: Heroku’s free tier
        • Too expensive to scale for our pockets
  25. Backend API - Version 3
      Serverless architecture, AWS Lambda
      • From Flask to the serverless framework AWS Chalice
      • Pros
        • Minimal infrastructure management;
        • Load balancing and fault tolerance by default;
        • Cheaper to scale if necessary;
        • Easier to deploy;
        • Supports different stages (e.g. dev, test, prod, …)
        • Supports API versioning
        • Better monitoring and metrics
        • Advanced log analytics
        • Alerts (e.g. define latency threshold)
      • Cons
        • Bottleneck is now MongoDB (up to 200 simultaneous connections)
  26. Open question
      Remove MongoDB and go back to pandas?
      • Bottleneck is now MongoDB.
      • Scaling cloud MongoDB can be expensive.
      • Back to pandas?
      • Idea
        • Pipeline: save results to an Amazon S3 bucket
        • Backend: load the file from the S3 bucket using pandas
      AWS Lambda
  27. Backend API - one last thing
      Related news
      • Search related news using arquivo.pt
      • Store results into an Apache Solr index
      • Free text search
      • The problem
        • Recall versus precision
        • Hard to find precise results
      • Solution
        • Provide links for the main news outlets
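The recall versus precision trade-off mentioned for related news can be made concrete with a small worked example. The document identifiers below are made up for illustration: a broad free-text query retrieves every relevant article but buries them among unrelated ones, while a narrow query returns only relevant articles but misses half of them.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


relevant_news = [1, 2, 3, 4]  # articles that are truly about the proposal

broad = precision_recall(retrieved=range(1, 21), relevant=relevant_news)
narrow = precision_recall(retrieved=[1, 2], relevant=relevant_news)
print(broad)   # (0.2, 1.0)
print(narrow)  # (1.0, 0.5)
```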
  28. Next steps
      • Support session debates
      • Support new proposals in real time
      • Support different public institutions (e.g. European Parliament, city council, etc.)
      • Support other countries