Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.2k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
950
SICSS presentation
thiagomarzagao
0
920
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
1.8k
mineração de dados
thiagomarzagao
0
2.5k
mineração de dados no governo
thiagomarzagao
1
3.1k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
270
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.1k
Mineração de Dados no Governo Federal
thiagomarzagao
0
120
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
79
Other Decks in Research
See All in Research
eAI (Engineerable AI) プロジェクトの全体像 / Overview of eAI Project
ishikawafyu
0
370
Neural Fieldの紹介
nnchiba
2
670
メタヒューリスティクスに基づく汎用線形整数計画ソルバーの開発
snowberryfield
3
760
複数データセットを用いた動作認識
yuyay
0
110
Poster: Feasibility of Runtime-Neutral Wasm Instrumentation for Edge-Cloud Workload Handover
chikuwait
0
340
新規のC言語処理系を実装することによる 組込みシステム研究にもたらす価値 についての考察
zacky1972
1
320
Retrieval of Hurricane Rain Rate From SAR Images Based on Artificial Neural Network
satai
2
140
The many faces of AI and the role of mathematics
gpeyre
1
1.6k
Satellite Sunroof: High-res Digital Surface Models and Roof Segmentation for Global Solar Mapping
satai
2
130
【NLPコロキウム】Stepwise Alignment for Constrained Language Model Policy Optimization (NeurIPS 2024)
akifumi_wachi
3
520
Leveraging LLMs for Unsupervised Dense Retriever Ranking (SIGIR 2024)
kampersanda
2
300
[ECCV2024読み会] 衛星画像からの地上画像生成
elith
1
1.1k
Featured
See All Featured
How to train your dragon (web standard)
notwaldorf
91
5.8k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
49k
Large-scale JavaScript Application Architecture
addyosmani
511
110k
The Art of Programming - Codeland 2020
erikaheidi
53
13k
Making the Leap to Tech Lead
cromwellryan
133
9.1k
The Invisible Side of Design
smashingmag
299
50k
Embracing the Ebb and Flow
colly
84
4.6k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
46
2.3k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
59k
Fireside Chat
paigeccino
34
3.2k
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
44
7k
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun
[email protected]
Leonardo Sales
[email protected]
Observatory of Public Spending @tmarzagao thiagomarzagao.com