Open Source in Real Life


Ana Schwendler

September 24, 2019


  1. WHAT IS SERENATA?

    • The main goal: use artificial intelligence for social control of public administration
    • We learned how to work with data science using open data (CSVs that show reimbursements)
    • Multidisciplinary team: scientists, programmers, marketers, and journalists
    • Open source: more than 700 members in the Telegram group
  2. WHY?

    • Advantages: bringing citizens and government closer, suggesting technology solutions
    • For the developer: flexibility in choosing tools
  3. HOW DID WE GET HERE?

    • We ran a crowdfunding campaign to pay for 3 months of development
    • Data science projects usually take 6 months to a year; what can we do in 3 months?
    • Techniques: hypothesis-driven development and timeboxing
  4. HDD: HYPOTHESES

    • Hypothesis-Driven Development (HDD)
    • Survey the hypotheses that could lead to the solution of a problem
    • Multidisciplinary team as a way to expand knowledge
  5. TIMEBOXING

    • List the hypotheses to explore
    • Associate a time window with the development of each one; if it doesn't work, switch to another hypothesis
    • Go back to previous assumptions as time goes by
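
The rotation described on this slide can be sketched as a small simulation. This is only an illustration, not the team's actual tooling: the `Hypothesis` class, the `weeks_needed` field (in practice unknown up front), and the 12-week budget standing in for "3 months" are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    weeks_needed: int    # hypothetical: effort required to validate it
    weeks_spent: int = 0


def run_timeboxes(hypotheses, box_weeks, budget_weeks):
    """Give each hypothesis one time box, then switch to the next one.

    Unfinished hypotheses go to the back of the queue, so the team
    returns to them later if there is still budget left.
    """
    if box_weeks <= 0:
        raise ValueError("box_weeks must be positive")
    queue = deque(hypotheses)
    validated, log = [], []
    while queue and budget_weeks > 0:
        current = queue.popleft()
        remaining = current.weeks_needed - current.weeks_spent
        spent = min(box_weeks, budget_weeks, remaining)
        current.weeks_spent += spent
        budget_weeks -= spent
        log.append((current.name, spent))
        if current.weeks_spent >= current.weeks_needed:
            validated.append(current.name)
        else:
            queue.append(current)  # revisit once the others had their turn
    return validated, log


validated, log = run_timeboxes(
    [Hypothesis("meal prices", 3), Hypothesis("monthly maximums", 1)],
    box_weeks=2,
    budget_weeks=12,
)
print(validated)
print(log)
```

The queue makes the "switch, then come back" behaviour explicit: no hypothesis can consume more than one box before the others get a turn.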
  6. DEVELOPED HYPOTHESES

    • We studied the available dataset and, from that, defined some hypotheses to test:
      ◦ Non-standard prices on food
      ◦ Traveled distance and spending
      ◦ Invalid tax identification number
      ◦ Monthly maximums (taxi, fuel, ...)
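
As an illustration of how one of these hypotheses might be checked, here is a minimal sketch of the "monthly maximums" idea. The field names (`member`, `category`, `month`, `amount`) and the cap values are assumptions made for this example; the slides give neither the real CSV schema nor the real limits.

```python
from collections import defaultdict

# Hypothetical monthly caps per category; illustrative values only.
MONTHLY_CAPS = {"taxi": 1000.0, "fuel": 2000.0}


def find_monthly_overspending(reimbursements, caps):
    """Return {(member, category, month): total} for totals above the cap."""
    totals = defaultdict(float)
    for row in reimbursements:
        if row["category"] in caps:  # categories without a cap are ignored
            key = (row["member"], row["category"], row["month"])
            totals[key] += row["amount"]
    return {key: total for key, total in totals.items() if total > caps[key[1]]}


# Made-up rows, shaped like records read from a reimbursements CSV.
sample = [
    {"member": "A", "category": "taxi", "month": "2019-01", "amount": 600.0},
    {"member": "A", "category": "taxi", "month": "2019-01", "amount": 500.0},
    {"member": "B", "category": "fuel", "month": "2019-01", "amount": 1500.0},
    {"member": "A", "category": "meal", "month": "2019-01", "amount": 80.0},
]
print(find_monthly_overspending(sample, MONTHLY_CAPS))
```

A rule like this needs no trained model, which makes it a cheap hypothesis to try inside a single time box.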
  7. DEVELOPMENT CYCLE

    • Jupyter notebook with initial analysis
    • Script for parsing the entire database
    • Training an initial model
    • Retraining after a time period
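
The last two steps of this cycle (train an initial model, retrain after a period) could look roughly like the sketch below. It is a toy stand-in, not the project's actual model: the `PriceOutlierModel` class, the mean-plus-three-standard-deviations rule, and the 30-day retraining period are all assumptions chosen to keep the example short.

```python
from datetime import date, timedelta
from statistics import mean, stdev


class PriceOutlierModel:
    """Toy 'initial model': flags prices far above the historical mean."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # how many standard deviations count as unusual
        self.mean = None
        self.std = None
        self.trained_on = None

    def fit(self, prices, today):
        """Train on the prices parsed so far and remember the training date."""
        self.mean = mean(prices)
        self.std = stdev(prices)
        self.trained_on = today
        return self

    def is_suspicious(self, price):
        return price > self.mean + self.threshold * self.std

    def needs_retraining(self, today, period=timedelta(days=30)):
        """True once the model is at least `period` old."""
        return today - self.trained_on >= period


# Made-up meal prices standing in for the parsed database.
model = PriceOutlierModel().fit([20.0, 25.0, 30.0, 22.0, 28.0], date(2019, 1, 1))
print(model.is_suspicious(200.0))
print(model.needs_retraining(date(2019, 2, 15)))
```

Storing the training date on the model keeps the retraining decision next to the data it depends on, so a scheduled script only has to call `needs_retraining` and, if it returns true, `fit` again on the freshly parsed data.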