Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees
Default epidemic in the federal government: 4 companies went bankrupt
Construction company abandons 3 projects
what if we could predict which contractors will become headaches?
impossible to do manually
- ~25k new contracts every year
data + neural networks = predictions
data:
- n = 10186
- 9442 (~93%) not problem
- 744 (~7%) problem
- 2011-2016
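The class counts on this slide imply a strong imbalance. As a minimal sketch of that arithmetic, using only the counts shown above (the variable names are illustrative and not from the talk), a trivial rule that always predicts "not problem" would already be right about 93% of the time:

```python
# Counts taken from the slide; variable names are illustrative only.
n_total = 10186
n_not_problem = 9442
n_problem = 744

# Sanity check: the two classes add up to the full sample.
assert n_not_problem + n_problem == n_total

share_problem = n_problem / n_total
share_not_problem = n_not_problem / n_total

# A classifier that always answers "not problem" is correct exactly
# when the true label is "not problem".
baseline_accuracy = share_not_problem

print(f"problem share: {share_problem:.3f}")
print(f"always-'not problem' accuracy: {baseline_accuracy:.3f}")
```

This is one reason a raw accuracy figure can look good on data like this while the model finds few of the actual problem cases.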
data:
- Y: has the company been punished before?
data:
- X: a total of 183 attributes, like:
  - # of employees
  - average salary of employees
  - # of auctions it participated in
  - donated $ to politicians?
  - …
neural networks:
- two approaches:
  - (“traditional”) neural network (TNN)
  - deep neural network (DNN)
TNN:
- 2 hidden layers
- can’t handle 183 attributes
- hence must use PCA first
TNN:
- PCA
  - selected 24 continuous variables based on covariance matrix
  - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
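The eigenvalue > 1 cutoff mentioned on this slide can be sketched with NumPy as follows. This is a sketch under stated assumptions, not the authors' code: the data are synthetic, the function name `kaiser_pca` is hypothetical, and the columns are standardized first (an assumption, since the cutoff is usually applied to a correlation matrix while the slide mentions a covariance matrix).

```python
import numpy as np

def kaiser_pca(X):
    """Project X onto the principal components whose eigenvalue exceeds 1."""
    # Assumption: standardize each column, so the covariance of Z is the
    # correlation matrix of X and the "eigenvalue > 1" rule is meaningful.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)

    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    keep = eigvals > 1.0
    explained = eigvals[keep].sum() / eigvals.sum()
    scores = Z @ eigvecs[:, keep]
    return scores, eigvals, explained

# Synthetic stand-in for the 24 continuous variables (not the real data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
loadings = rng.normal(size=(3, 24))
X = latent @ loadings + rng.normal(size=(500, 24))

scores, eigvals, explained = kaiser_pca(X)
print(scores.shape[1], "components kept;", f"{explained:.0%} of variance")
```

On the real data the slide reports 9 retained components covering roughly 70% of the variance; the synthetic example will give different numbers.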
TNN:
- 9 components + 21 binary vars.
- 80% training
  - w/ oversampling
- 20% testing
- boosting (10 models)
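The split-then-oversample step can be sketched as below. The slide does not say which oversampling method was used, so this assumes plain random oversampling with replacement, applied to the training portion only so the test set stays untouched. The function name `split_and_oversample` and the seed are hypothetical, and the boosting step is not shown.

```python
import random

def split_and_oversample(labels, train_frac=0.8, seed=0):
    """Return (balanced training indices, test indices) for 0/1 labels."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)

    cut = int(train_frac * len(idx))
    train_idx, test_idx = idx[:cut], idx[cut:]

    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)

    # Assumption: random oversampling, i.e. redraw minority rows with
    # replacement until both classes are equally frequent in training.
    extra = []
    if minority:
        extra = [rng.choice(minority)
                 for _ in range(len(majority) - len(minority))]

    balanced = train_idx + extra
    rng.shuffle(balanced)
    return balanced, test_idx

# Label vector with the class counts from the data slide (1 = problem).
labels = [1] * 744 + [0] * 9442
balanced, test_idx = split_and_oversample(labels)
n_pos = sum(labels[i] for i in balanced)
print(len(balanced), "training rows,", n_pos, "of them labeled problem")
```

Oversampling after the split, rather than before it, avoids copies of the same contract appearing in both the training and the test portion.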
DNN:
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex relationships between the variables
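To make the shape of such a network concrete, here is a minimal NumPy forward pass with 183 inputs and 3 hidden layers. Everything beyond those two numbers is an assumption for illustration: the layer width of 256, the ReLU and sigmoid activations, the random initialization, and the helper names `init_mlp` and `forward`. The weights are untrained, so the outputs carry no meaning.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Create (weights, bias) pairs for consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """ReLU on the hidden layers, sigmoid on the single output unit."""
    h = X
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)
    W, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ W + b)))

# 183 inputs -> 3 hidden layers (width 256 is an assumed value) -> 1 output.
sizes = [183, 256, 256, 256, 1]
params = init_mlp(sizes)

X = np.random.default_rng(1).normal(size=(4, 183))  # 4 synthetic companies
probs = forward(params, X)
n_weights = sum(W.size + b.size for W, b in params)
print(probs.shape, n_weights)
```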
DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
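A minimal sketch of how 5-fold cross-validation partitions a set of rows is given below. The slide does not say whether the folds were drawn from the 80% training portion or from the full data; this example assumes the training portion. The function name `k_fold_indices` is hypothetical.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)

    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]

    for i in range(k):
        val = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, val

# Assumption: folds are drawn from the 80% training portion of 10186 rows.
n_train = int(0.8 * 10186)
splits = list(k_fold_indices(n_train, k=5))
for train, val in splits:
    print(len(train), len(val))
```

Each row is used for validation exactly once and for fitting in the other four rounds.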
how can we evaluate performance?
- accuracy (% of correct predictions overall)
- recall (% of problems predicted to be problems)
- precision (% of predicted problems that are problems)
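The three measures can be computed from the counts of true and false positives and negatives. The sketch below uses a made-up toy example of ten companies; the function name `evaluate` and the labels are illustrative only and do not come from the talk.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, recall, precision) for 0/1 labels, 1 = problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Recall: share of actual problems that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of flagged companies that are actual problems.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision

# Toy labels: 3 real problems, 2 companies flagged, 1 of them correctly.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
accuracy, recall, precision = evaluate(y_true, y_pred)
print(accuracy, recall, precision)
```

In this toy case accuracy is 0.7 even though only one of the three problems was caught, which is why recall and precision are reported separately.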
results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra computational cost of DNN is worth it
to do:
- improve recall
  - 0.58 w/ TNN
  - 0.26 w/ DNN
- change the law
  - must allow gov not to contract w/ high-risk companies
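The reported precision and recall pull in opposite directions for the two models. One common way to summarize that trade-off is the F1 score, the harmonic mean of the two. The talk does not report F1; the sketch below only derives it from the four figures on the slides, and the function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported on the slides.
reported = {
    "TNN": {"precision": 0.24, "recall": 0.58},
    "DNN": {"precision": 0.79, "recall": 0.26},
}

f1 = {name: f1_score(m["precision"], m["recall"])
      for name, m in reported.items()}

for name, value in f1.items():
    print(f"{name}: F1 = {value:.2f}")
```

Whether a single summary number is appropriate depends on the relative cost of a missed problem contractor versus a false alarm, which the slides do not quantify.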
Ting Sun
[email protected]
Leonardo Sales
[email protected]
@tmarzagao thiagomarzagao.com