Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn't pay employees
Default epidemic in the federal government: 4 companies went bankrupt
Construction company abandons 3 projects
what if we could predict which contractors will become headaches?
impossible to do manually: ~25k new contracts every year
data + neural networks = predictions
data:
- n = 10186
  - 9442 (~93%) not problem
  - 744 (~7%) problem
- 2011-2016
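With classes this unbalanced, overall accuracy on its own can be misleading. The sketch below uses only the counts from the slide. The "always predict not problem" baseline is a hypothetical added for illustration, not something from the talk.

```python
# Class counts as reported on the slide.
n_total = 10186
n_not_problem = 9442
n_problem = 744

assert n_not_problem + n_problem == n_total

share_problem = n_problem / n_total
share_not_problem = n_not_problem / n_total

# Hypothetical baseline (not from the talk): always predict "not problem".
baseline_accuracy = n_not_problem / n_total  # every non-problem case is right
baseline_recall = 0 / n_problem              # no problem case is ever flagged

print(f"problem share:     {share_problem:.3f}")
print(f"not-problem share: {share_not_problem:.3f}")
print(f"baseline accuracy: {baseline_accuracy:.3f}, recall: {baseline_recall:.3f}")
```

Such a baseline would score about 93% accuracy while catching no problem contractor at all, which is one reason to look at recall and precision as well.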
data:
- Y: has the company been punished before?
data:
- X: a total of 183 attributes, like:
  - # of employees
  - average salary of employees
  - # of auctions it participated in
  - donated $ to politicians?
  - …
neural networks:
- two approaches:
  - ("traditional") neural network
  - deep neural network
TNN:
- 2 hidden layers
- can't handle 183 attributes
- hence must use PCA first
TNN:
- PCA
  - selected 24 continuous variables based on covariance matrix
  - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
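The PCA step can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes the variables are standardized first, so that the covariance matrix equals the correlation matrix and the "eigenvalue > 1" rule is meaningful. The data is synthetic and the function name `pca_kaiser` is made up for this example.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on standardized columns, keeping components with eigenvalue > 1."""
    # Standardize so the covariance matrix equals the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                     # eigenvalue > 1 rule
    scores = Z @ eigvecs[:, keep]            # component scores per row
    explained = eigvals[keep].sum() / eigvals.sum()
    return scores, eigvals, explained

# Synthetic stand-in: 500 rows, 24 correlated columns (NOT the real dataset).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 5))
X = latent @ rng.normal(size=(5, 24)) + rng.normal(size=(500, 24))

scores, eigvals, explained = pca_kaiser(X)
print("components kept:", scores.shape[1])
print("share of variance kept:", round(float(explained), 2))
```

On the real data the slide reports 9 retained components covering about 70% of the variance. The synthetic data here will give different numbers.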
TNN:
- 9 components + 21 binary vars.
- 80% training
  - w/ oversampling
- 20% testing
- boosting (10 models)
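One way the 80/20 split with oversampling could look is sketched below. The slide does not say how the split or the oversampling was done, so the stratified split, the random resampling with replacement, and the 1:1 target ratio are all assumptions. The boosting step is not shown, and `split_and_oversample` is a hypothetical helper.

```python
import random

def split_and_oversample(labels, train_frac=0.8, seed=0):
    """Return (train_idx, test_idx). The minority class in the training set
    is resampled with replacement until both classes are the same size."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    # Stratified split (assumption): shuffle and cut each class separately.
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = int(round(train_frac * len(idx)))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    # Oversample the training set only; the test set keeps the real class mix.
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    return train_idx + extra, test_idx

# Labels built from the class counts on the data slide (1 = problem).
labels = [0] * 9442 + [1] * 744
train_idx, test_idx = split_and_oversample(labels)
print(len(train_idx), len(test_idx))
```

Oversampling only after the split keeps duplicated rows out of the test set, so test metrics still reflect the original ~7% problem rate.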
DNN:
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex relationships between the variables
DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
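The 5-fold cross-validation can be sketched as an index generator. This is an illustration only. The slide does not say whether the folds were drawn from the 80% training portion, so that choice, the row count `n_train`, and the helper name `k_fold_indices` are assumptions.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each row is validated exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val

n_train = 8149  # hypothetical: roughly 80% of the 10186 rows
for train, val in k_fold_indices(n_train):
    print(len(train), len(val))
```

Each of the five models is fit on four folds and scored on the fifth. Averaging the five scores gives a steadier estimate than a single validation split.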
how can we evaluate performance?
- accuracy (% of correct predictions overall)
- recall (% of problems predicted to be problems)
- precision (% of predicted problems that are problems)
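The three measures above can be computed from the four cells of a confusion matrix. In the sketch below, "positive" means "problem". The counts in the usage example are invented for illustration and are not results from the talk.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, recall, precision) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total                     # correct predictions overall
    recall = tp / (tp + fn) if tp + fn else 0.0      # problems that were flagged
    precision = tp / (tp + fp) if tp + fp else 0.0   # flags that were problems
    return accuracy, recall, precision

# Hypothetical counts, for illustration only.
acc, rec, prec = classification_metrics(tp=30, fp=10, fn=20, tn=140)
print(f"accuracy={acc:.2f} recall={rec:.2f} precision={prec:.2f}")
```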
results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra computational cost of DNN is worth it
to do:
- improve recall
  - 0.58 w/ TNN
  - 0.26 w/ DNN
- change the law
  - must allow the government not to contract w/ high-risk companies
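The precision and recall figures reported on the slides pull in opposite directions for the two models. One common way to combine them into a single number is the F1 score, the harmonic mean of the two. The talk does not report F1, so the values below are derived here from the slide figures purely as an illustration.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported on the slides.
tnn_f1 = f1(precision=0.24, recall=0.58)
dnn_f1 = f1(precision=0.79, recall=0.26)

print(f"TNN F1: {tnn_f1:.2f}")  # 0.34
print(f"DNN F1: {dnn_f1:.2f}")  # 0.39
```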
Ting Sun
[email protected]
Leonardo Sales
[email protected]
@tmarzagao
thiagomarzagao.com