Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees
Default epidemic in the federal government: 4 companies went bankrupt
Construction company abandons 3 projects
what if we could predict which contractors will become headaches?
impossible to do manually: ~25k new contracts every year
data + neural networks = predictions
data:
- n = 10186
  - 9442 (~93%) not problem
  - 744 (~7%) problem
- 2011-2016
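The class counts above imply a strong imbalance, which matters for the choice of evaluation metric. As a minimal sketch using only the counts from the slide, the snippet below computes the accuracy that a hypothetical baseline would reach by labelling every observation "not problem". The baseline is an illustration and not a model from the study.

```python
# Class counts taken from the slide.
n_total = 10186
n_not_problem = 9442
n_problem = 744

assert n_not_problem + n_problem == n_total

share_problem = n_problem / n_total
# Accuracy of a hypothetical baseline that labels everything "not problem".
baseline_accuracy = n_not_problem / n_total
# That baseline never flags a problem, so its recall is zero.
baseline_recall = 0 / n_problem

print(f"share of problem cases: {share_problem:.3f}")
print(f"majority-class baseline accuracy: {baseline_accuracy:.3f}")
print(f"majority-class baseline recall: {baseline_recall:.3f}")
```

A baseline that finds no problems at all still scores about 93% accuracy, which is why precision and recall are the more informative numbers here.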
data:
- Y: has the company been punished before?
data:
- X: a total of 183 attributes, like:
  - # of employees
  - average salary of employees
  - # of auctions it participated in
  - donated $ to politicians?
  - …
neural networks:
- two approaches:
  - (“traditional”) neural network
  - deep neural network
TNN (“traditional” neural network):
- 2 hidden layers
- can’t handle 183 attributes
- hence must use PCA first
TNN:
- PCA
  - selected 24 continuous variables based on covariance matrix
  - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
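The "eigenvalue > 1" rule can be sketched as follows. This is a rough illustration on synthetic data, not the study's pipeline: the data, the number of latent factors, and the choice to eigen-decompose the correlation matrix of standardized columns are all assumptions, so the number of retained components will not match the 9 reported on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 24 continuous variables (not the real data):
# a few shared latent factors plus noise, so the columns are correlated.
n_rows, n_vars, n_factors = 1000, 24, 4
latent = rng.normal(size=(n_rows, n_factors))
loadings = rng.normal(size=(n_factors, n_vars))
X = latent @ loadings + rng.normal(size=(n_rows, n_vars))

# Standardize each column so every variable contributes unit variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigen-decompose the correlation matrix; eigh returns ascending eigenvalues,
# so reorder them from largest to smallest.
corr = np.corrcoef(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Keep only the components whose eigenvalue exceeds 1.
keep = eigenvalues > 1.0
components = Z @ eigenvectors[:, keep]
explained = eigenvalues[keep].sum() / eigenvalues.sum()

print(f"kept {keep.sum()} of {n_vars} components, "
      f"{explained:.0%} of variance")
```

With standardized inputs the eigenvalues sum to the number of variables, so an eigenvalue above 1 means the component carries more variance than any single original variable.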
TNN:
- 9 components + 21 binary vars.
- 80% training
  - w/ oversampling
- 20% testing
- boosting (10 models)
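The split-then-oversample step might look roughly like the sketch below. The slides do not say which oversampling method was used; random oversampling with replacement up to a 50/50 balance is an assumption here, and the labels are synthetic placeholders that only reproduce the class counts. Boosting is left out.

```python
import random

random.seed(0)

# Synthetic labels with the class counts from the slides (1 = problem).
labels = [1] * 744 + [0] * 9442
indices = list(range(len(labels)))
random.shuffle(indices)

# 80% training, 20% testing.
cut = int(0.8 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

# Oversample the minority class in the training set only, drawing with
# replacement until both classes have the same number of rows.
minority = [i for i in train_idx if labels[i] == 1]
majority = [i for i in train_idx if labels[i] == 0]
extra = random.choices(minority, k=len(majority) - len(minority))
balanced_train_idx = majority + minority + extra
random.shuffle(balanced_train_idx)

n_pos = sum(labels[i] for i in balanced_train_idx)
print(len(train_idx), len(balanced_train_idx), n_pos, len(test_idx))
```

Oversampling after the split keeps duplicated rows out of the test set, so the test metrics still reflect the original class balance.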
DNN (deep neural network):
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex relationships between the variables
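To make the architecture concrete, here is a minimal, untrained forward pass for a network with 183 inputs and three hidden layers. The hidden-layer widths, the ReLU and sigmoid activations, and the weight initialisation are all assumptions; the slides only state "3 hidden layers" and "hundreds of neurons".

```python
import numpy as np

rng = np.random.default_rng(0)

# 183 inputs, three hidden layers, one output. The hidden widths are
# hypothetical; the slides only say "hundreds of neurons".
layer_sizes = [183, 300, 200, 100, 1]

# Randomly initialised (untrained) weights and zero biases.
weights = [rng.normal(scale=np.sqrt(2.0 / n_in), size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]


def forward(x):
    """Return one score between 0 and 1 per input row."""
    activation = x
    for w, b in zip(weights[:-1], biases[:-1]):
        activation = np.maximum(0.0, activation @ w + b)  # ReLU
    logits = activation @ weights[-1] + biases[-1]
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid


batch = rng.normal(size=(5, 183))  # five made-up rows of 183 attributes
scores = forward(batch)
print(scores.shape)
```

Because every input feeds every neuron in the first hidden layer, no dimensionality reduction is needed before the network sees the 183 attributes.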
DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
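The 5-fold cross-validation step can be sketched with plain index bookkeeping. The slides do not say how cross-validation relates to the 80/20 split; running it inside the 80% training portion is an assumption, and `k_fold_indices` is a hypothetical helper written for this illustration.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


# Assumption: cross-validation runs on the 80% training portion of n = 10186.
n_train = int(0.8 * 10186)
splits = list(k_fold_indices(n_train, k=5))
for train, validation in splits:
    print(len(train), len(validation))
```

Each row is used for validation exactly once and for training in the other four folds.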
how can we evaluate performance?
- accuracy (% of correct predictions overall)
- recall (% of problems predicted to be problems)
- precision (% of predicted problems that are problems)
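The three measures above follow directly from the counts of true positives (tp), false positives (fp), false negatives (fn), and true negatives (tn). The sketch below uses made-up counts for a hypothetical 2,000-row test set; they are not the study's results, and `classification_metrics` is a helper name chosen for this illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions overall
        "recall": tp / (tp + fn),        # problems predicted to be problems
        "precision": tp / (tp + fp),     # predicted problems that are problems
    }


# Hypothetical counts for a 2,000-row test set (not the study's results).
metrics = classification_metrics(tp=40, fp=10, fn=110, tn=1840)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up case accuracy is high even though most problems are missed, which shows why recall and precision are tracked separately.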
results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra computational cost of DNN is worth it
to do:
- improve recall
  - 0.58 w/ TNN
  - 0.26 w/ DNN
- change the law
  - must allow gov not to contract w/ high-risk companies
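One way to weigh the reported precision and recall figures together is the F1 score, the harmonic mean of the two. The slides do not report F1; the sketch below is only a supplementary calculation from the four numbers they do give.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported on the slides.
reported = {"TNN": (0.24, 0.58), "DNN": (0.79, 0.26)}
for model, (precision, recall) in reported.items():
    print(f"{model}: F1 = {f1_score(precision, recall):.2f}")
```

On this combined measure the two models are closer than the precision figures alone suggest (about 0.34 versus 0.39), which fits the slide's point that recall still needs work.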
Ting Sun
[email protected]
Leonardo Sales
[email protected]
@tmarzagao thiagomarzagao.com