Predicting irregularities in public bidding: an application of neural networks

Thiago Marzagão

May 28, 2017

Transcript

  1. Predicting irregularities in public bidding: an application of neural networks
    Observatory of Public Spending

  2. Government contractor doesn’t pay employees
    Default epidemic in the federal government: 4 companies went bankrupt
    Construction company abandons 3 projects

  3. what if we could predict which contractors will become headaches?

  5. impossible to do manually
    ~25k new contracts every year

  7. data + neural networks = predictions

  8. data:
    - n = 10186
    - 9442 (~93%) not problem
    - 744 (~7%) problem
    - 2011-2016
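
The counts on this slide describe a heavily imbalanced problem. A minimal sketch of the arithmetic follows; the counts come from the slide, while the "always predict not problem" baseline is an illustration added here and is not part of the deck.

```python
# Class counts taken from the slide.
n_total = 10186
n_ok = 9442        # "not problem"
n_problem = 744    # "problem"

assert n_ok + n_problem == n_total

share_ok = n_ok / n_total
share_problem = n_problem / n_total

# Illustrative baseline (an assumption, not from the deck): a model that
# labels every case "not problem" is right exactly share_ok of the time
# while catching none of the problems.
baseline_accuracy = share_ok
baseline_recall = 0 / n_problem

print(f"not problem: {share_ok:.1%}, problem: {share_problem:.1%}")
print(f"baseline accuracy: {baseline_accuracy:.1%}, recall: {baseline_recall:.0%}")
```

With classes this uneven, a high overall hit rate says little by itself, which is why the choice of evaluation metric matters for this data.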

  9. data:
    - Y: has the company been punished before?

  10. data:
    - X: a total of 183 attributes, such as:
      - # of employees
      - average salary of employees
      - # of auctions it participated in
      - donated $ to politicians?
      - …

  11. neural networks:
    - two approaches:
      - (“traditional”) neural network (TNN)
      - deep neural network (DNN)

  12. TNN:
    - 2 hidden layers
    - can’t handle 183 attributes
    - hence must use PCA first

  13. TNN:
    - PCA
      - selected 24 continuous variables based on covariance matrix
      - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
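
The "eigenvalue > 1" rule on this slide can be sketched as follows. This is a minimal illustration, assuming the variables are standardized first (so the PCA runs on the correlation matrix, the scale on which the rule is usually applied). The function name `pca_kaiser` and the synthetic data are hypothetical; the deck does not show its own code or data.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on standardized columns, keeping components with eigenvalue > 1."""
    # Standardize so the covariance of Z is the correlation matrix of X.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    explained = eigvals[keep].sum() / eigvals.sum()
    scores = Z @ eigvecs[:, keep]                # component scores per row
    return scores, eigvals[keep], explained

# Synthetic stand-in for the 24 continuous variables (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 4))
X = latent @ rng.normal(size=(4, 24)) + rng.normal(size=(500, 24))

scores, kept_eigvals, explained = pca_kaiser(X)
print(scores.shape[1], "components kept,", f"{explained:.0%} of variance")
```

The number of components and the variance share printed here depend on the synthetic data and are not meant to reproduce the 9 components and ~70% reported on the slide.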

  14. TNN:
    - 9 components + 21 binary vars.
    - 80% training
      - w/ oversampling
    - 20% testing
    - boosting (10 models)
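
A minimal sketch of the split-then-oversample step, assuming simple random oversampling (duplicating minority rows with replacement) up to a 1:1 class ratio in the training part only. The deck does not say which oversampling method or ratio was used, so both are assumptions, and `split_and_oversample` is a hypothetical helper. Boosting is not covered here.

```python
import random

def split_and_oversample(rows, labels, train_frac=0.8, seed=0):
    """80/20 split, then duplicate minority rows in the training part only."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train_idx, test_idx = idx[:cut], idx[cut:]

    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    # Sample minority rows with replacement until both classes are the same size.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    train_idx = train_idx + extra
    rng.shuffle(train_idx)
    return train_idx, test_idx

# Toy labels with the deck's class counts (the rows themselves are placeholders).
labels = [1] * 744 + [0] * 9442
rows = list(range(len(labels)))
train_idx, test_idx = split_and_oversample(rows, labels)
n_pos = sum(labels[i] for i in train_idx)
print(len(test_idx), n_pos, len(train_idx) - n_pos)
```

Oversampling after the split keeps duplicated rows out of the test set, so the test metrics still reflect the original class balance.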

  15. DNN:
    - 3 hidden layers
    - hundreds of neurons
    - can handle all 183 variables
    - can handle complex relationships between the variables
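
To make the shape of such a network concrete, here is a minimal forward-pass sketch with 183 inputs and 3 hidden layers. The hidden widths (256, 128, 64), the ReLU and sigmoid activations, and the initialization are all assumptions chosen for illustration; the deck only says "3 hidden layers" and "hundreds of neurons". The network is untrained, so its outputs are arbitrary.

```python
import numpy as np

def init_layers(sizes, rng):
    """One (weights, bias) pair per layer, with He-style scaled random weights."""
    return [(rng.normal(scale=np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(X, layers):
    """ReLU on the hidden layers, sigmoid on the single output unit."""
    a = X
    for W, b in layers[:-1]:
        a = np.maximum(0.0, a @ W + b)
    W, b = layers[-1]
    return 1.0 / (1.0 + np.exp(-(a @ W + b)))

rng = np.random.default_rng(0)
# 183 inputs (from the slide); hidden widths 256/128/64 are hypothetical.
layers = init_layers([183, 256, 128, 64, 1], rng)
X = rng.normal(size=(5, 183))           # five made-up cases
p = forward(X, layers).ravel()          # untrained, so values are arbitrary
print(p.shape)
```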

  16. DNN:
    - all 183 variables (no PCA)
    - no oversampling
    - 80% training
    - 20% testing
    - 5-fold cross-validation
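
A minimal sketch of how 5-fold cross-validation partitions the data, assuming the folds are drawn from the 80% training portion (the deck does not say how the split and the folds were combined). `k_fold_indices` is a hypothetical helper, not the authors' code.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each item is validated exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]       # k near-equal slices
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val

# 8148 is 80% of the 10186 cases, rounded down (an assumption about the split).
splits = list(k_fold_indices(8148, k=5))
for train, val in splits:
    print(len(train), len(val))
```

Each model is fit on four folds and checked on the fifth, so every training case is used for validation exactly once.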

  18. how can we evaluate performance?
    - accuracy (% of correct predictions overall)
    - recall (% of actual problems that are predicted to be problems)
    - precision (% of predicted problems that are actual problems)
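
The three definitions above can be written out directly from the counts of true and false positives and negatives. This is a minimal sketch; the function name and the ten toy labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, recall and precision for binary labels (1 = problem)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against empty denominators (no actual / no predicted problems).
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }

# Toy example: 3 actual problems among 10 cases, 2 cases flagged.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
m = classification_metrics(y_true, y_pred)
print(m)
```

In this toy case 7 of 10 predictions are correct, yet only 1 of the 3 problems is caught and only 1 of the 2 flags is right, which is why accuracy alone is a weak guide when problems are rare.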

  20. results:
    - TNN precision: 0.24
    - DNN precision: 0.79
    - huge difference! extra computational cost of DNN is worth it

  21. to do:
    - improve recall
      - 0.58 w/ TNN
      - 0.26 w/ DNN
    - change the law
      - must allow gov not to contract w/ high-risk companies
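
To see what the reported precision and recall figures mean in practice, they can be turned into rough case counts. This is a back-of-the-envelope sketch, assuming the 20% test set contains the same share of problems as the full data (about 0.2 × 744 cases); the deck does not report the actual test-set counts, and `implied_counts` is a hypothetical helper.

```python
def implied_counts(n_problems, precision, recall):
    """Turn precision and recall into expected counts for n_problems true cases."""
    caught = recall * n_problems          # true positives
    flagged = caught / precision          # all cases predicted to be problems
    return {"flagged": flagged, "caught": caught,
            "false_alarms": flagged - caught, "missed": n_problems - caught}

# Assumption: the 20% test set has the same share of problems as the full data.
n_problems_test = 0.2 * 744

tnn = implied_counts(n_problems_test, precision=0.24, recall=0.58)
dnn = implied_counts(n_problems_test, precision=0.79, recall=0.26)
for name, c in (("TNN", tnn), ("DNN", dnn)):
    print(name, {k: round(v) for k, v in c.items()})
```

Under that assumption, the TNN would flag roughly 360 cases to catch about 86 problems, while the DNN would flag roughly 49 cases to catch about 39: far fewer false alarms, but many more missed problems.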

  22. Ting Sun
    [email protected]
    Leonardo Sales
    [email protected]

  23. @tmarzagao
    thiagomarzagao.com