Predicting irregularities in public bidding:
an application of neural networks
Observatory of
Public Spending
Government contractor
doesn’t pay employees
Default epidemic in the federal
government: 4 companies went
bankrupt
Construction company abandons 3
projects
what if we could predict which
contractors will become
headaches?
impossible to do manually
~25k new contracts every year
data + neural networks = predictions
data:
- n = 10186
- 9442 (~93%) no problem
- 744 (~7%) problem
- 2011-2016
data:
- Y: has the company been
punished before?
data:
- X: a total of 183 attributes, like:
- # of employees
- average salary of employees
- # of auctions it participated in
- donated $ to politicians?
- …
neural networks:
- two approaches:
- “traditional” neural network (TNN)
- deep neural network (DNN)
TNN:
- 2 hidden layers
- can’t handle 183 attributes
- hence must use PCA first
TNN:
- PCA
- selected 24 continuous variables
based on covariance matrix
- PCA reduced 24 variables to 9
components (~70% of variance;
all components w/ eigenvalue > 1)
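The retention rule on this slide (keep every component with eigenvalue > 1, then report the share of variance those components explain) can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `kaiser_select` is hypothetical and the eigenvalues are made up.

```python
def kaiser_select(eigenvalues, threshold=1.0):
    """Return (number of components kept, share of total variance they explain)."""
    ordered = sorted(eigenvalues, reverse=True)
    # Keep only components whose eigenvalue exceeds the threshold.
    kept = [ev for ev in ordered if ev > threshold]
    share = sum(kept) / sum(ordered)
    return len(kept), share


# Hypothetical eigenvalues for 6 variables (NOT the values from the study).
example = [2.4, 1.5, 1.1, 0.5, 0.3, 0.2]
n_kept, share = kaiser_select(example)
print(n_kept, round(share, 2))
```

In the study the same rule kept 9 of 24 components, covering about 70% of the variance.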
TNN:
- 9 components + 21 binary vars.
- 80% training
- w/ oversampling
- 20% testing
- boosting (10 models)
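A minimal sketch of the 80/20 split with oversampling on the training portion only. The slides do not say which oversampling method or target ratio was used, so this assumes random duplication of minority-class rows until both classes are the same size. The function name `split_and_oversample` and the toy data are hypothetical, and the boosting step is left out.

```python
import random


def split_and_oversample(rows, labels, train_frac=0.8, seed=0):
    """Shuffle, split 80/20, then balance the training part by duplicating
    minority-class rows at random. The test part is left untouched."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train_idx, test_idx = idx[:cut], idx[cut:]

    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    # Assumption: oversample until both classes have the same size.
    if minority:
        n_extra = len(majority) - len(minority)
        train_idx = train_idx + [rng.choice(minority) for _ in range(n_extra)]
    rng.shuffle(train_idx)

    train = [(rows[i], labels[i]) for i in train_idx]
    test = [(rows[i], labels[i]) for i in test_idx]
    return train, test


# Toy data with roughly the 93/7 imbalance from the slides (values are made up).
labels = [1] * 7 + [0] * 93
rows = [[float(i)] for i in range(100)]
train, test = split_and_oversample(rows, labels)
```

Splitting before oversampling keeps duplicated rows out of the test set, so the test metrics still reflect the original class mix.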
DNN:
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex
relationships between the
variables
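To make the shape of such a network concrete, here is a minimal forward-pass sketch with 183 inputs, 3 hidden layers, and one output. Only the input count and the number of hidden layers come from the slides. The layer width of 200, the ReLU and sigmoid activations, the random weights, and the function names are all assumptions for illustration, and no training is shown.

```python
import math
import random


def init_layers(sizes, seed=0):
    """Create one (weights, biases) pair per layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((w, b))
    return layers


def forward(layers, x):
    """ReLU on hidden layers, sigmoid on the single output unit (assumed choices)."""
    for i, (w, b) in enumerate(layers):
        z = [sum(wi * xi for wi, xi in zip(row, x)) + bj for row, bj in zip(w, b)]
        if i == len(layers) - 1:
            x = [1 / (1 + math.exp(-v)) for v in z]
        else:
            x = [max(0.0, v) for v in z]
    return x[0]


# 183 inputs, 3 hidden layers, 1 output. The width of 200 is an assumption.
sizes = [183, 200, 200, 200, 1]
layers = init_layers(sizes)
risk = forward(layers, [0.5] * 183)  # untrained score between 0 and 1
```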
DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
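A minimal sketch of how 5-fold cross-validation partitions the samples: each sample is used for validation exactly once and for training in the other four rounds. The slides do not say whether the folds were stratified or which portion of the data they were drawn from, so this shows plain, unstratified folds. The function name `k_fold_indices` is hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs, one per fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx


splits = list(k_fold_indices(10, k=5))
for train_idx, val_idx in splits:
    print(sorted(val_idx), len(train_idx))
```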
how can we evaluate performance?
- accuracy (% of correct predictions
overall)
- recall (% of problems predicted to
be problems)
- precision (% of predicted
problems that are problems)
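The three metrics can be computed from the confusion counts as sketched below. The function name `evaluate` is hypothetical. The example uses the class counts from the data slide (744 problem, 9442 not problem) to show why accuracy alone is misleading here: a model that never flags anyone is right about 93% of the time, yet catches no problem at all.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, recall and precision for binary labels (1 = problem)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # problems flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-problems not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed problems
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Class counts from the data slide; the "model" always predicts "not problem".
y_true = [1] * 744 + [0] * 9442
always_negative = [0] * len(y_true)
scores = evaluate(y_true, always_negative)
print(scores)
```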
results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra
computational cost of DNN is worth
it
to do:
- improve recall
- 0.58 w/ TNN
- 0.26 w/ DNN
- change the law
- must allow the government not to
contract w/ high-risk companies