Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.3k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
1k
SICSS presentation
thiagomarzagao
0
990
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
1.9k
mineração de dados
thiagomarzagao
0
2.6k
mineração de dados no governo
thiagomarzagao
1
3.2k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
290
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.2k
Mineração de Dados no Governo Federal
thiagomarzagao
0
130
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
90
Other Decks in Research
See All in Research
Submeter-level land cover mapping of Japan
satai
3
180
Minimax and Bayes Optimal Best-arm Identification: Adaptive Experimental Design for Treatment Choice
masakat0
0
160
2021年度-基盤研究B-研究計画調書
trycycle
PRO
0
170
SSII2025 [TS3] 医工連携における画像情報学研究
ssii
PRO
2
1.2k
Vision And Languageモデルにおける異なるドメインでの継続事前学習が性能に与える影響の検証 / YANS2024
sansan_randd
1
130
RHO-1: Not All Tokens Are What You Need
sansan_randd
1
160
ノンパラメトリック分布表現を用いた位置尤度場周辺化によるRTK-GNSSの整数アンビギュイティ推定
aoki_nosse
0
350
A multimodal data fusion model for accurate and interpretable urban land use mapping with uncertainty analysis
satai
3
250
Trust No Bot? Forging Confidence in AI for Software Engineering
tomzimmermann
1
250
Google Agent Development Kit (ADK) 入門 🚀
mickey_kubo
2
1.4k
20250502_ABEJA_論文読み会_スライド
flatton
0
180
定性データ、どう活かす? 〜定性データのための分析基盤、はじめました〜 / How to utilize qualitative data? ~We have launched an analysis platform for qualitative data~
kaminashi
7
1.1k
Featured
See All Featured
Visualization
eitanlees
146
16k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
50k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
8
870
Why Our Code Smells
bkeepers
PRO
337
57k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
2.9k
Agile that works and the tools we love
rasmusluckow
329
21k
Fireside Chat
paigeccino
37
3.6k
Art, The Web, and Tiny UX
lynnandtonic
301
21k
How to Ace a Technical Interview
jacobian
278
23k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
4 Signs Your Business is Dying
shpigford
184
22k
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun
[email protected]
Leonardo Sales
[email protected]
Observatory of Public Spending @tmarzagao thiagomarzagao.com