Predicting irregularities in public bidding: an application of neural networks

Thiago Marzagão

May 28, 2017

Transcript

  1. Observatory of Public Spending
     - Government contractor doesn’t pay employees
     - Default epidemic in the federal government: 4 companies went bankrupt
     - Construction company abandons 3 projects

  2. Data:
     - n = 10186
     - 9442 (~93%) not problem
     - 744 (~7%) problem
     - 2011-2016

  3. Data:
     - X: a total of 183 attributes, like:
       - # of employees
       - average salary of employees
       - # of auctions it participated in
       - donated $ to politicians?
       - …

  4. Neural networks: two approaches:
     - (“traditional”) neural network (TNN)
     - deep neural network (DNN)

  5. TNN:
     - 2 hidden layers
     - can’t handle all 183 attributes
     - hence must use PCA first

  6. TNN, PCA:
     - selected 24 continuous variables based on the covariance matrix
     - PCA reduced the 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1); see the sketch below

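The deck shows no code, but the eigenvalue-based PCA step can be sketched with scikit-learn. The random matrix is only a stand-in for the 24 continuous attributes, and standardizing before PCA is an assumption (the slide mentions only the covariance matrix):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in for the 24 continuous attributes (10186 companies)
rng = np.random.default_rng(0)
X_cont = rng.normal(size=(10186, 24))

# Standardize so every variable contributes comparably to the covariance matrix
X_std = StandardScaler().fit_transform(X_cont)

pca = PCA().fit(X_std)
scores = pca.transform(X_std)

# Kaiser criterion from the slide: keep components with eigenvalue > 1
keep = pca.explained_variance_ > 1
X_reduced = scores[:, keep]
print(keep.sum(), "components,",
      round(pca.explained_variance_ratio_[keep].sum(), 2), "of the variance")
```

On the real data this step kept 9 of the 24 components, covering ~70% of the variance.
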
  7. TNN:
     - 9 components + 21 binary vars.
     - 80% training (w/ oversampling), 20% testing
     - boosting (10 models)
     (oversampling and the 2-hidden-layer TNN are sketched after this list)

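A minimal sketch of the 80/20 split, minority-class oversampling, and a 2-hidden-layer network, again with scikit-learn on placeholder data. The hidden-layer sizes are assumptions, and the boosting of 10 models is not reproduced here (scikit-learn's AdaBoostClassifier needs a base estimator that accepts per-sample weights, which MLPClassifier does not support):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.utils import resample

# Stand-in for 9 PCA components + 21 binary vars, ~7% "problem" labels
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(10186, 9)),
               rng.integers(0, 2, size=(10186, 21))])
y = (rng.random(10186) < 0.07).astype(int)

# 80% training / 20% testing
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Oversample the minority ("problem") class in the training set only,
# so the test set keeps the real ~93/7 class balance
minority = y_tr == 1
X_up, y_up = resample(X_tr[minority], y_tr[minority], replace=True,
                      n_samples=int((~minority).sum()), random_state=0)
X_bal = np.vstack([X_tr[~minority], X_up])
y_bal = np.concatenate([y_tr[~minority], y_up])

# "Traditional" NN with 2 hidden layers (layer sizes are an assumption)
tnn = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=500, random_state=0)
tnn.fit(X_bal, y_bal)
```
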
  8. DNN:
     - 3 hidden layers
     - hundreds of neurons
     - can handle all 183 variables
     - can handle complex relationships between the variables

  9. DNN:
     - all 183 variables (no PCA)
     - no oversampling
     - 80% training, 20% testing
     - 5-fold cross-validation (sketched below)

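A sketch of the DNN setup, feeding all 183 variables directly and evaluating with 5-fold cross-validation. The layer sizes follow the slide's "3 hidden layers, hundreds of neurons" but the exact numbers are assumptions, and only the cross-validation half of the evaluation is shown:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

# Stand-in for all 183 variables; no PCA, no oversampling
rng = np.random.default_rng(0)
X = rng.normal(size=(10186, 183))
y = (rng.random(10186) < 0.07).astype(int)

# Deep NN: 3 hidden layers, hundreds of neurons each
dnn = MLPClassifier(hidden_layer_sizes=(300, 300, 300),
                    max_iter=500, random_state=0)

# 5-fold cross-validation, scored on precision (the deck's headline metric)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(dnn, X, y, cv=cv, scoring="precision")
print(scores.mean())
```
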
  10. How can we evaluate performance?
     - accuracy (% of correct predictions overall)
     - recall (% of problems predicted to be problems)
     - precision (% of predicted problems that are problems)
     (the sketch below shows how these are computed)

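For concreteness, here is how the three metrics fall out of predictions with scikit-learn, on toy labels (1 = problem, 0 = not problem):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Toy labels, illustrative only: 3 actual problems, 2 flagged
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]

print(accuracy_score(y_true, y_pred))   # correct predictions / all cases
print(recall_score(y_true, y_pred))     # flagged problems / actual problems
print(precision_score(y_true, y_pred))  # actual problems / flagged problems
```

With only ~7% of companies labeled as problems, a model that flags nothing still scores ~93% accuracy, which is why the deck focuses on precision and recall rather than accuracy.
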
  12. Results:
     - TNN precision: 0.24
     - DNN precision: 0.79
     - huge difference! the extra computational cost of the DNN is worth it

  13. To do:
     - improve recall: 0.58 w/ TNN, 0.26 w/ DNN
     - change the law: it must allow the government not to contract w/ high-risk companies