Predicting irregularities in public bidding: an application of neural networks

Thiago Marzagão

May 28, 2017

Transcript

  1. Predicting irregularities in public bidding: an application of neural networks

    Observatory of Public Spending
  2. Government contractor doesn’t pay employees

    Default epidemic in the federal government: 4 companies went bankrupt

    Construction company abandons 3 projects
  3. what if we could predict which contractors will become headaches?
  5. impossible to do manually: ~25k new contracts every year
  7. data + neural networks = predictions

  8. data:

    - n = 10186
    - 9442 (~93%) not problem
    - 744 (~7%) problem
    - 2011-2016
  9. data:

    - Y: has the company been punished before?
  10. data:

    - X: a total of 183 attributes, like:
      - # of employees
      - average salary of employees
      - # of auctions it participated in
      - donated $ to politicians?
      - …
  11. neural networks:

    - two approaches:
      - (“traditional”) neural network (TNN)
      - deep neural network (DNN)
  12. TNN:

    - 2 hidden layers
    - can’t handle 183 attributes
    - hence must use PCA first
  13. TNN:

    - PCA
      - selected 24 continuous variables based on covariance matrix
      - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
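
The PCA step above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the continuous variables are standardized first (so that the "eigenvalue > 1" rule is meaningful), uses synthetic data in place of the 24 real variables, and the function name `pca_kaiser` is hypothetical.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on standardized columns, keeping components with eigenvalue > 1."""
    # Standardize each column (assumption: the talk does not say whether
    # variables were rescaled before PCA).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance matrix of the standardized data (equals the correlation matrix).
    cov = np.cov(Z, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep only the components whose eigenvalue exceeds 1.
    keep = eigvals > 1.0
    scores = Z @ eigvecs[:, keep]
    explained = eigvals[keep].sum() / eigvals.sum()
    return scores, eigvals, explained

# Synthetic stand-in for the real data: 500 rows, 24 continuous columns
# driven by 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 24)) + 0.5 * rng.normal(size=(500, 24))

scores, eigvals, explained = pca_kaiser(X)
print("components kept:", scores.shape[1])
print("share of variance kept:", round(float(explained), 2))
```

On the real data the talk reports 9 retained components covering about 70% of the variance; the synthetic example will give different numbers.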
  14. TNN:

    - 9 components + 21 binary vars.
    - 80% training
      - w/ oversampling
    - 20% testing
    - boosting (10 models)
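
One way the 80/20 split with oversampling could look is sketched below. The talk does not say which oversampling method was used, so this assumes simple random duplication of the minority class, applied to the training part only so that the test set keeps the original class balance. The function name `split_and_oversample` is hypothetical, and the labels are rebuilt from the class counts reported earlier in the talk.

```python
import random

def split_and_oversample(labels, train_frac=0.8, seed=0):
    """Shuffle, split into train/test, then balance the training classes."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train, test = idx[:cut], idx[cut:]

    pos = [i for i in train if labels[i] == 1]
    neg = [i for i in train if labels[i] == 0]
    # Duplicate randomly chosen minority rows until both classes have equal size
    # (assumes the positive class is the minority, as in the talk's data).
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))] if pos else []
    train_balanced = train + extra
    rng.shuffle(train_balanced)
    return train_balanced, test

# Labels rebuilt from the reported counts: 744 "problem", 9442 "not problem".
labels = [1] * 744 + [0] * 9442
train_idx, test_idx = split_and_oversample(labels)

n_pos = sum(labels[i] for i in train_idx)
print("training rows after oversampling:", len(train_idx))
print("positives / negatives in training:", n_pos, len(train_idx) - n_pos)
print("test rows (untouched):", len(test_idx))
```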
  15. DNN:

    - 3 hidden layers
    - hundreds of neurons
    - can handle all 183 variables
    - can handle complex relationships between the variables
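
To make the shape of such a network concrete, here is a minimal forward-pass sketch with 183 inputs and 3 hidden layers. The talk only says "hundreds of neurons", so the layer widths (300, 200, 100), the ReLU and sigmoid activations, and the initialization are all assumptions chosen for illustration. The weights are random and untrained, so the outputs carry no meaning.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(sizes, seed=0):
    """One (weights, bias) pair per layer, with He-style random weights."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def forward(X, params):
    """ReLU on the hidden layers, sigmoid on the single output unit."""
    a = X
    for W, b in params[:-1]:
        a = relu(a @ W + b)
    W, b = params[-1]
    return sigmoid(a @ W + b).ravel()

# Hypothetical architecture: 183 inputs, 3 hidden layers, 1 output.
sizes = [183, 300, 200, 100, 1]
params = init_params(sizes)

# Five random rows standing in for five companies.
X = np.random.default_rng(1).normal(size=(5, 183))
probs = forward(X, params)
print(probs.shape)
```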
  16. DNN:

    - all 183 variables (no PCA)
    - no oversampling
    - 80% training
    - 20% testing
    - 5-fold cross-validation
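
The 5-fold cross-validation mentioned above can be sketched as an index splitter. This assumes the folds are cut from the 80% training portion (the talk does not say so explicitly), and the helper name `k_fold_indices` is hypothetical.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, val

# 80% of the 10186 rows, rounded down, as the assumed training set size.
n_train = int(0.8 * 10186)
splits = list(k_fold_indices(n_train, k=5))
for fold_no, (train, val) in enumerate(splits, start=1):
    print(fold_no, len(train), len(val))
```

Each row is used for validation exactly once, and the model is refit on the other four folds each time.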
  18. how can we evaluate performance?

    - accuracy (% of correct predictions overall)
    - recall (% of problems predicted to be problems)
    - precision (% of predicted problems that are problems)
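
The three metrics can be written directly from the confusion counts. The sketch below applies them to a deliberately useless classifier that labels every company "not problem", using the class counts reported earlier in the talk. It illustrates why accuracy alone is misleading on data this imbalanced. The function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), with 1 = problem and 0 = not problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    # Share of all predictions that are correct.
    return (tp + tn) / (tp + fp + fn + tn)

def recall(tp, fn):
    # Share of actual problems that were predicted to be problems.
    return tp / (tp + fn) if (tp + fn) else None

def precision(tp, fp):
    # Share of predicted problems that really are problems.
    return tp / (tp + fp) if (tp + fp) else None

# 744 problems and 9442 non-problems; the classifier always says "not problem".
y_true = [1] * 744 + [0] * 9442
y_pred = [0] * len(y_true)

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("accuracy:", round(accuracy(tp, fp, fn, tn), 3))  # 9442 / 10186
print("recall:", recall(tp, fn))
print("precision:", precision(tp, fp))  # None: nothing was predicted as a problem
```

This classifier is right about 93% of the time yet finds no problem companies at all, which is why recall and precision are reported alongside accuracy.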
  20. results:

    - TNN precision: 0.24
    - DNN precision: 0.79
    - huge difference! extra computational cost of DNN is worth it
  21. to do:

    - improve recall
      - 0.58 w/ TNN
      - 0.26 w/ DNN
    - change the law
      - must allow gov not to contract w/ high risk companies
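
The reported precision and recall pull in opposite directions for the two models. One common way to weigh them together is the F1 score, the harmonic mean of the two. The talk does not report F1, so the calculation below is only an illustration built from the four numbers on the slides.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# (precision, recall) as reported in the talk.
reported = {"TNN": (0.24, 0.58), "DNN": (0.79, 0.26)}

for model, (p, r) in reported.items():
    print(model, "precision", p, "recall", r, "F1", round(f1_score(p, r), 2))
```

F1 weights precision and recall equally. If missing a risky contractor costs more than a false alarm, a recall-weighted variant would be a more natural choice.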
  22. Ting Sun tsun9920@gmail.com

    Leonardo Sales leonado.sales@cgu.gov.br

  23. @tmarzagao thiagomarzagao.com