Predicting irregularities in public bidding: an application of neural networks

Observatory of Public Spending

Government contractor doesn’t pay employees
Default epidemic in the federal government: 4 companies went bankrupt
Construction company abandons 3 projects

what if we could predict which contractors will become headaches?

impossible to do manually: ~25k new contracts every year

data + neural networks = predictions

data:
- n = 10186
- 9442 (~93%) not problem
- 744 (~7%) problem
- 2011-2016

data:
- Y: has the company been punished before?

data:
- X: a total of 183 attributes, like:
  - # of employees
  - average salary of employees
  - # of auctions it participated in
  - donated $ to politicians?
  - …

neural networks:
- two approaches:
  - (“traditional”) neural network (TNN)
  - deep neural network (DNN)

TNN:
- 2 hidden layers
- can’t handle 183 attributes
- hence must use PCA first

TNN:
- PCA
  - selected 24 continuous variables based on covariance matrix
  - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
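The "eigenvalue > 1" rule on this slide can be sketched as below. This is a minimal illustration, not the authors' code: it assumes the variables are standardized first (so the eigenvalues come from the correlation matrix), it uses NumPy, and the data is synthetic. The function name `pca_kaiser` and the 3-factor toy data are made up for the example.

```python
import numpy as np


def pca_kaiser(X):
    """PCA that keeps only components whose eigenvalue exceeds 1.

    Assumption: columns are standardized, so the matrix being decomposed
    is the correlation matrix and the eigenvalues sum to the column count.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    scores = Z @ eigvecs[:, keep]                # component scores per row
    explained = eigvals[keep].sum() / eigvals.sum()
    return scores, eigvals, explained


# Synthetic stand-in for the 24 continuous variables (NOT the real data):
# 3 hidden factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
loadings = rng.normal(size=(3, 24))
X = latent @ loadings + rng.normal(scale=0.5, size=(500, 24))

scores, eigvals, explained = pca_kaiser(X)
n_components = scores.shape[1]
print(n_components, round(float(explained), 2))
```

On the real data the slide reports 9 retained components covering about 70% of the variance. The toy data above will give different numbers.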

TNN:
- 9 components + 21 binary vars.
- 80% training
  - w/ oversampling
- 20% testing
- boosting (10 models)
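A rough sketch of the 80/20 split with oversampling applied to the training portion only. The slide does not say which oversampling method was used. Random duplication of minority rows is assumed here, and the helper names and toy rows are hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and cut off the test portion."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def oversample_minority(train_rows, seed=0):
    """Duplicate random minority rows until both classes are the same size.

    Each row is a (features, label) pair with label 1 = problem.
    """
    rng = random.Random(seed)
    positives = [r for r in train_rows if r[1] == 1]
    negatives = [r for r in train_rows if r[1] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    if not minority:
        return list(train_rows)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced


# Toy data with roughly the same imbalance as the slides (7% problems).
rows = [([i], 1 if i < 70 else 0) for i in range(1000)]
train, test = train_test_split(rows)
balanced_train = oversample_minority(train)

n_pos = sum(label for _, label in balanced_train)
n_neg = len(balanced_train) - n_pos
print(len(train), len(test), n_pos, n_neg)
```

Only the training rows are resampled. The test rows keep the original class balance, so the reported metrics reflect the real problem rate.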

DNN:
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex relationships between the variables
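To give a feel for the size gap between the two networks, the sketch below counts weights and biases in a fully connected network. The layer widths are hypothetical: the slides only say "2 hidden layers" for the TNN and "3 hidden layers, hundreds of neurons" for the DNN, so 16 and 256 are placeholders. The 30 TNN inputs assume the 9 PCA components plus the 21 binary variables.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical widths, chosen only for illustration.
tnn_layers = [30, 16, 16, 1]          # 30 inputs, 2 hidden layers, 1 output
dnn_layers = [183, 256, 256, 256, 1]  # 183 inputs, 3 hidden layers, 1 output

tnn_params = count_parameters(tnn_layers)
dnn_params = count_parameters(dnn_layers)
print(tnn_params, dnn_params)
```

Under these assumed widths the deep network has a few hundred times more parameters to fit, which is where the extra computational cost comes from.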

DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
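A minimal sketch of how 5-fold cross-validation partitions the rows. The slide does not say whether the folds were drawn from the 80% training portion or from the full data set. The training portion is assumed here, and the row count of 8148 (about 80% of 10186) is illustrative.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k near-equal slices
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation


n_train_rows = 8148  # assumed size of the training portion
splits = list(k_fold_indices(n_train_rows, k=5))
fold_sizes = [len(validation) for _, validation in splits]
print(fold_sizes)
```

Each row is used for validation exactly once and for fitting in the other four rounds.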

how can we evaluate performance?
- accuracy (% of correct predictions overall)
- recall (% of problems predicted to be problems)
- precision (% of predicted problems that are problems)
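The three metrics can be written out as below. The toy labels are invented for illustration. The last part reuses the class counts from the data slide (9442 not problem, 744 problem) with a hypothetical classifier that never flags anyone, to show why accuracy alone says little on data this imbalanced.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def recall(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0


def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if tp + fp else 0.0


# Toy example: 3 real problems, 2 contractors flagged, 1 flagged correctly.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]
toy_accuracy = accuracy(y_true, y_pred)
toy_recall = recall(y_true, y_pred)
toy_precision = precision(y_true, y_pred)

# Hypothetical "never flag anyone" classifier on the slide's class counts.
all_true = [0] * 9442 + [1] * 744
never_flag = [0] * len(all_true)
baseline_accuracy = accuracy(all_true, never_flag)
baseline_recall = recall(all_true, never_flag)
print(round(baseline_accuracy, 3), baseline_recall)
```

The do-nothing classifier is right about 93% of the time yet catches no problem contractors, which is why the slides report precision and recall.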

results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra computational cost of DNN is worth it

to do:
- improve recall
  - 0.58 w/ TNN
  - 0.26 w/ DNN
- change the law
  - must allow gov not to contract w/ high risk companies

Ting Sun
Leonardo Sales

@tmarzagao
thiagomarzagao.com

Thiago Marzagão
