Challenges developing meuParlamento.pt

Presented at HostelWorld

This presentation describes the motivation for the project and the challenges of developing it.

Projeto meuParlamento.pt

July 19, 2019

Transcript

  1. meuParlamento.pt: Become a member of the Portuguese Parliament

  2. Nuno Moniz (INESC TEC / Universidade do Porto)
    Arian Pasquali (INESC TEC)
    Tomás Amaro (HostelWorld)
  3. 69.27% abstention in the last elections in Portugal

  4. In Portugal, abstention levels in legislative elections went from 8% in 1975 (first election) to 44% in 2015 (last election)
  5. Quizzers
    During election periods we observe the rise of quizzes to find out more about your political orientation. They might be useful in some cases, but there are some serious issues related to both bias and privacy.
  6. Motivation

  7. Our goals
    • Increase engagement of citizens in politics
    • Explore new ways to interact with the Parliament
    • Allow citizens to simulate the Parliament on their smartphones
    • Encourage healthy political debate
    • Keep its usage anonymous
  8. Our solution ▹ http://meuparlamento.pt

  9. Manifesto

  10. (1) Privacy and anonymity are the most important values. No registration needed. We do not store and do not share how users vote.
  11. (2) This is a non-profit project. The project will always be free.
  12. (3) Always open source and open data. We encourage users to contribute.
  13. (4) Always beta. Fork it and contribute. Always open for contributions.
  14. 14

  15. meuParlamento.pt
    • Vote on real legislative proposals
    • See which political parties vote like you
    • Explore the biographies of MPs
    • Share results with your friends
  16. How it works: with three simple gestures
    • Against: swipe left if you are against
    • In favor: swipe right if you approve
    • Abstention: swipe up; users can skip if they prefer not to vote
  17. Results
    After voting on 10 proposals you get to know:
    • Which political parties voted the same way you did;
    • Detailed information about each proposal:
      • Original full-text PDF;
      • MPs' biographies;
      • Related news;
    • Share results with your friends.
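The "which political parties voted the same way you did" result can be illustrated with a small agreement score. This is only a minimal sketch, not the app's actual code: the function name, the vote labels and the data shapes are assumptions made for illustration, and everything is kept in memory, in line with the requirement not to store how users vote.

```python
from collections import Counter

# Hypothetical vote labels, one per gesture (assumed, not taken from the app).
FOR, AGAINST, ABSTAIN = "for", "against", "abstain"


def party_agreement(user_votes, party_votes):
    """Return {party: share of proposals where the party voted like the user}.

    user_votes:  {proposal_id: vote}
    party_votes: {proposal_id: {party: vote}}
    Proposals missing from user_votes are ignored.
    """
    matches = Counter()
    totals = Counter()
    for proposal_id, vote in user_votes.items():
        for party, party_vote in party_votes.get(proposal_id, {}).items():
            totals[party] += 1
            if party_vote == vote:
                matches[party] += 1
    return {party: matches[party] / totals[party] for party in totals}


# Usage with made-up data: two proposals, two parties.
user = {"p1": FOR, "p2": AGAINST}
parties = {
    "p1": {"A": FOR, "B": AGAINST},
    "p2": {"A": AGAINST, "B": AGAINST},
}
result = party_agreement(user, parties)  # → {"A": 1.0, "B": 0.5}
```

Sorting the returned dictionary by value gives a ranking of parties from most to least aligned with the user.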
  18. Android Support: available at Google Play Store
  19. iPhone Support: available on the Apple App Store

  20. Privacy requirements
    • The system should not store any information about how users vote;
    • No registration or sign-up needed;
    • Anonymous usage;
    • Open source project.
  21. Overview
    • Sources: arquivo.pt, parlamento.pt
    • Data processing pipeline (Luigi as pipeline framework): web scraping, refinement, text summarization, etc.
    • Storage: MongoDB, about 3,000 proposals
    • HTTP API (AWS Chalice). Endpoint: provide 10 random proposals to vote on
    • Mobile app: React Native
  22. Challenges Mobile App

  23. Mobile App - Version 1
    Big mess with massive chunks of code
    • Most of the logic was under App.js
    • 3 components
    • Modals to show extra info like “News” and “Authors”
    • No navigation
  24. 24

  25. 25

  26. Mobile App - Version 2
    The beta, almost release candidate
    • Separation into smaller components
    • Navigation implemented with React Navigation
    • Screen/Component composition
    • Notification Service
    • Updated version of React Native
    • New proposal button
    • Extra context on proposal cards
    • A lot of bug fixing […a lot]
  27. Live preview

  28. PanResponder
    Mood destroyer

  29. Proposal Context
    A “Curious” solution

  30. Proposal Context
    A “Curious” solution

  31. Proposal Context
    A “Curious” solution

  32. Challenges Mobile User Interface

  33. Developing mobile UI
    Cross-platform development using React Native
    • Support Android and iOS simultaneously
  34. Challenges Exploratory data analysis

  35. Exploratory data analysis
    Imbalanced distributions
    • We want to provide a fair random method.
    It is not as easy as it sounds:
    • The number of proposals per political party is imbalanced
    • We need to take into account majority and opposition in the Parliament for each proposal debate.
    [Figure: proposals aggregated by political party]
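One way to picture the imbalance problem: a plain uniform sample over all proposals mostly returns proposals from the parties that submit the most. The sketch below is one possible workaround, not the method used by the project. It first picks a party uniformly at random and then a proposal from that party. The `party` field and the function name are assumptions, and the sketch ignores the majority/opposition aspect mentioned on the slide.

```python
import random
from collections import defaultdict


def balanced_sample(proposals, n, rng=None):
    """Pick up to n proposals, choosing the party first and the proposal second.

    proposals: list of dicts with a (hypothetical) "party" key.
    Each proposal is returned at most once; the input list is not modified.
    """
    rng = rng or random.Random()
    by_party = defaultdict(list)
    for proposal in proposals:
        by_party[proposal["party"]].append(proposal)
    for group in by_party.values():
        rng.shuffle(group)  # so that pop() below takes a random proposal

    picked = []
    while len(picked) < n and by_party:
        party = rng.choice(sorted(by_party))  # every remaining party is equally likely
        picked.append(by_party[party].pop())
        if not by_party[party]:
            del by_party[party]  # party exhausted, stop drawing from it
    return picked


# Usage with made-up data: party "A" has nine times more proposals than "B".
pool = [{"id": i, "party": "A"} for i in range(90)]
pool += [{"id": 100 + i, "party": "B"} for i in range(10)]
batch = balanced_sample(pool, 10, random.Random(42))
```

With this scheme a small party is as likely to appear in each draw as a large one, which trades proportionality for balance; whether that counts as "fair" is exactly the open design question the slide raises.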
  36. Challenges Data processing pipeline

  37. Data processing pipeline
    From plain Python scripts to task management
    • Tasks
      • Web scraping proposals from the parlamento.pt website
      • Download each proposal's PDF
      • Parse the PDF, extracting textual content
      • Summarize PDF content
      • Analyze text readability
      • Index into a MongoDB collection
    • Why Luigi*?
      • Manages task dependencies
      • Task execution monitor
    * https://github.com/spotify/luigi
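To make "manages task dependencies" concrete, the sketch below models the six pipeline steps as a dependency graph and derives a valid execution order. It deliberately uses Python's standard `graphlib` module instead of Luigi's own API, so it only illustrates the idea. The task names and the exact dependency edges (for example, that summarization and readability analysis both depend only on the parsed text) are assumptions.

```python
from graphlib import TopologicalSorter

# task -> set of tasks that must finish first (hypothetical names and edges)
DEPENDENCIES = {
    "scrape_proposals": set(),
    "download_pdf": {"scrape_proposals"},
    "parse_pdf": {"download_pdf"},
    "summarize": {"parse_pdf"},
    "analyze_readability": {"parse_pdf"},
    "index_mongodb": {"summarize", "analyze_readability"},
}


def execution_order(dependencies):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
for step, task in enumerate(order, start=1):
    print(step, task)
```

A task manager such as Luigi adds to this ordering the parts that plain scripts lack: skipping tasks whose output already exists, and monitoring which tasks ran or failed.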
  38. Data processing pipeline
    Luigi as task management engine

  39. Data processing pipeline
    Luigi as task management engine.

  40. Challenges Backend API

  41. Backend requirements
    meuParlamento API (HTTP API, JSON)
    • Provide these simple endpoints
      1. Random proposals
      2. Proposals' related news
      3. Proposals' authors
    • Easy to deploy
    • Monitoring
    • Cheap to scale
  42. Backend API - Version 1
    Plain JSON file
    • Our first version was a prototype with a simple JSON file on a remote server
    • Quick and dirty
    • Update using FTP
    • All logic in the client UI
  43. Backend API - Version 2
    Flask-based web API
    • Flask as Python web framework
    • Data manipulation using pandas
      • Load JSON file in memory
      • Use DataFrame data structure
      • In-memory SQL support
      • Support sampling select
    • Deployment on Heroku
      • Pros
        • Easy to deploy using GitHub integration
      • Cons
        • Free tier wasn't reliable for production
        • Too expensive to scale
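The "load JSON in memory, then sample" idea can be sketched with pandas in a few lines. This is an illustrative sketch under assumptions, not the project's code: the record fields, the function names and the in-line sample data are made up, and the Flask route around it is omitted.

```python
import json

import pandas as pd


def load_proposals(json_text):
    """Parse a JSON array of proposal records into an in-memory DataFrame."""
    return pd.DataFrame(json.loads(json_text))


def random_batch(df, n=10, seed=None):
    """Return up to n randomly chosen proposals as plain dictionaries."""
    n = min(n, len(df))  # DataFrame.sample raises if n exceeds the row count
    return df.sample(n=n, random_state=seed).to_dict(orient="records")


# Usage with made-up data standing in for the real JSON file.
raw = json.dumps([{"id": i, "title": f"Proposal {i}"} for i in range(50)])
proposals = load_proposals(raw)
batch = random_batch(proposals, n=10, seed=0)
```

Because the whole dataset sits in process memory, each request is served without a database round trip, which suits a dataset of a few thousand proposals.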

  44. Backend API - Version 2.1
    Flask-based API
    • Flask as Python web framework
    • Data manipulation using pandas
    • MongoDB as database
    • Code deployment on Heroku
    • Pros
      • It is a database :)
      • Better query support
    • Cons
      • Bottleneck: Heroku's free tier
      • Too expensive to scale for our pockets
  45. Backend API - Version 3
    Serverless architecture (AWS Lambda)
    • From Flask to the serverless framework AWS Chalice
    • Pros
      • Minimal infrastructure management;
      • Load balancing and fault tolerance by default;
      • Cheaper to scale if necessary;
      • Easier to deploy;
      • Support for different stages (e.g. dev, test, prod, …)
      • Support for API versioning
      • Better monitoring and metrics
      • Advanced log analytics
      • Alerts (e.g. define latency threshold)
    • Cons
      • Bottleneck is now MongoDB (up to 200 simultaneous connections)
  46. Backend API using serverless framework

  47. Easier DevOps with AWS Lambda
    Deploying using a single command
  48. Example requesting random proposals
    GET https://xxxxxx.eu-west-1.amazonaws.com/test/proposals/batch/10

  49. Open question
    Remove MongoDB and go back to pandas? (AWS Lambda)
    • The bottleneck is now MongoDB.
    • Scaling cloud MongoDB can be expensive.
    • Back to pandas?
    • Idea
      • Pipeline: save results to an Amazon S3 bucket
      • Backend: load the file from the S3 bucket using pandas
  50. Backend API - one last thing
    Related news
    • Search related news using arquivo.pt
    • Store results in an Apache Solr index
    • Free-text search
    • The problem
      • Recall versus precision
      • Hard to find precise results
    • Solution
      • Provide links to the main news outlets
  51. Next steps
    • Support session debates
    • Support new proposals in real time
    • Support different public institutions (e.g. European Parliament, city council, etc.)
    • Support other countries
  52. http://meuParlamento.pt
    Nuno Moniz, Arian Pasquali, Tomás Amaro
    Proposals voted: 10,000+
    Searches performed: 650+
    Visits: 1,000+