Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees
Default epidemic in the federal government: 4 companies went bankrupt
Construction company abandons 3 projects
what if we could predict which contractors will become headaches?
impossible to do manually: ~25k new contracts every year
data + neural networks = predictions
data:
- n = 10186
  - 9442 (~93%) not problem
  - 744 (~7%) problem
- 2011-2016
data:
- Y: has the company been punished before?
data:
- X: a total of 183 attributes, like:
  - # of employees
  - average salary of employees
  - # of auctions it participated in
  - donated $ to politicians?
  - …
neural networks:
- two approaches:
  - (“traditional”) neural network (TNN)
  - deep neural network (DNN)
TNN:
- 2 hidden layers
- can’t handle 183 attributes
- hence must use PCA first
TNN:
- PCA
  - selected 24 continuous variables based on covariance matrix
  - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
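The slides do not say which software or exact procedure was used for the PCA step. As a minimal sketch of the "keep components with eigenvalue > 1" rule, the following assumes the variables are standardized first (so the rule is applied to the correlation matrix) and uses synthetic data; the function name `pca_kaiser` and the generated matrix `X` are hypothetical, not from the original deck.

```python
import numpy as np

def pca_kaiser(X):
    """PCA that keeps only components whose eigenvalue exceeds 1."""
    # Standardize so the covariance of Z equals the correlation matrix of X
    # (an assumption; the slides only mention a covariance matrix).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    scores = Z @ eigvecs[:, keep]             # component scores per row
    explained = eigvals[keep].sum() / eigvals.sum()
    return scores, eigvals, explained

# Synthetic stand-in data (hypothetical): 500 rows, 24 correlated variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 4))
X = latent @ rng.normal(size=(4, 24)) + rng.normal(size=(500, 24))
scores, eigvals, explained = pca_kaiser(X)
print(scores.shape, round(float(explained), 2))
```

On the real data, the retained scores would play the role of the 9 components that are fed to the TNN alongside the binary variables.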
TNN:
- 9 components + 21 binary vars.
- 80% training
  - w/ oversampling
- 20% testing
- boosting (10 models)
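The deck does not specify the oversampling method. The sketch below assumes plain random oversampling (duplicating minority-class rows with replacement) applied to the training split only, so that the test split keeps the original class balance. The function name `split_and_oversample` and the row ordering are hypothetical; only the class counts come from the slides.

```python
import random

def split_and_oversample(labels, train_frac=0.8, seed=0):
    """Return (train_idx, test_idx); minority rows are duplicated in train only."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train, test = idx[:cut], idx[cut:]
    pos = [i for i in train if labels[i] == 1]
    neg = [i for i in train if labels[i] == 0]
    # Random oversampling: draw minority rows with replacement until balanced.
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    balanced = train + extra
    rng.shuffle(balanced)
    return balanced, test

# Class counts from the slides: 744 "problem" and 9442 "not problem" rows.
labels = [1] * 744 + [0] * 9442
train_idx, test_idx = split_and_oversample(labels)
n_pos = sum(labels[i] for i in train_idx)
print(len(train_idx), n_pos, len(test_idx))
```

Oversampling before the split would let copies of the same row land in both training and test sets, which is why this sketch splits first.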
DNN:
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex relationships between the variables
DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
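How the 5-fold cross-validation was combined with the 80/20 split is not stated in the slides. A minimal sketch, assuming the folds are drawn from the training portion only, could generate the index splits like this; `k_fold_indices` is a hypothetical helper, and 8148 is 80% of the 10186 rows rounded down.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k disjoint, near-equal folds
    for i in range(k):
        val = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val

# Assumption: cross-validate within the 80% training portion.
splits = list(k_fold_indices(8148, k=5))
for train, val in splits:
    print(len(train), len(val))
```

Each row serves as validation data exactly once, so the five validation scores can be averaged into a single estimate.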
how can we evaluate performance?
- accuracy (% of correct predictions overall)
- recall (% of problems predicted to be problems)
- precision (% of predicted problems that are problems)
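The three metrics above can be computed directly from true and predicted labels. The sketch below is illustrative only: the `metrics` helper is hypothetical, and the "always predict not problem" baseline is not a model from the deck. It uses the class counts from the data slide to show why accuracy alone is misleading on imbalanced data.

```python
def metrics(y_true, y_pred):
    """Accuracy, recall and precision for binary labels (1 = problem)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Undefined ratios (zero denominator) are reported as 0.0 here.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }

# Hypothetical baseline: label every contractor "not problem".
y_true = [1] * 744 + [0] * 9442
y_pred = [0] * len(y_true)
m = metrics(y_true, y_pred)
print(m)
```

This baseline reaches roughly 93% accuracy while catching no problem contractors at all, which is why recall and precision carry most of the information here.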
results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra computational cost of DNN is worth it
to do:
- improve recall
  - 0.58 w/ TNN
  - 0.26 w/ DNN
- change the law
  - must allow gov not to contract w/ high risk companies
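The deck reports precision and recall separately. One common way to weigh them together is the F1 score, the harmonic mean of the two. The deck itself does not report F1, so the sketch below is only an illustration computed from the four figures quoted in the slides.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs as reported in the slides.
reported = {"TNN": (0.24, 0.58), "DNN": (0.79, 0.26)}
scores = {name: round(f1(p, r), 2) for name, (p, r) in reported.items()}
print(scores)
```

By this measure the two networks come out much closer (about 0.34 for the TNN and 0.39 for the DNN) than the precision figures alone suggest, which is consistent with recall being listed as the main item to improve.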
Ting Sun
Leonardo Sales
@tmarzagao thiagomarzagao.com