Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
- Government contractor doesn’t pay employees
- Default epidemic in the federal government: 4 companies went bankrupt
- Construction company abandons 3 projects
what if we could predict which contractors will become headaches?
impossible to do manually: ~25k new contracts every year
data + neural networks = predictions
data:
- n = 10186
  - 9442 (~93%): not a problem
  - 744 (~7%): problem
- 2011-2016
- Y: has the company been punished before?
- X: a total of 183 attributes, such as:
  - # of employees
  - average salary of employees
  - # of auctions it participated in
  - donated $ to politicians?
  - …
neural networks:
- two approaches:
  - (“traditional”) neural network (TNN)
  - deep neural network (DNN)
TNN:
- 2 hidden layers
- can’t handle 183 attributes
- hence must use PCA first
TNN, PCA step:
- selected 24 continuous variables based on covariance matrix
- PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
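The eigenvalue > 1 cutoff mentioned above can be sketched roughly as follows. This is only an illustration, not the authors' code: it assumes NumPy, it standardizes the variables first (so it works on the correlation matrix, where an eigenvalue of 1 equals the variance of one original variable, whereas the slide mentions the covariance matrix), and it runs on synthetic data. The function name `kaiser_pca` and all demo variables are hypothetical.

```python
import numpy as np

def kaiser_pca(X):
    """Return component scores, sorted eigenvalues, and share of variance kept."""
    # Standardize each column (assumption: puts all variables on the same scale).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # The covariance of standardized data is the correlation matrix.
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                     # keep components with eigenvalue > 1
    scores = Z @ eigvecs[:, keep]            # project the data onto kept components
    explained = eigvals[keep].sum() / eigvals.sum()
    return scores, eigvals, explained

# Synthetic demo (not the real data): 6 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 2))
X = np.hstack([
    factors[:, [0]] + 0.3 * rng.normal(size=(500, 3)),
    factors[:, [1]] + 0.3 * rng.normal(size=(500, 3)),
])
scores, eigvals, explained = kaiser_pca(X)
print(scores.shape[1], "components kept;", round(float(explained), 2), "of variance")
```

In the deck's setting, `X` would hold the 24 continuous variables, and the retained scores would be combined with the binary variables before training.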
TNN:
- 9 components + 21 binary vars.
- 80% training
  - w/ oversampling
- 20% testing
- boosting (10 models)
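One way to read the split-plus-oversampling step is sketched below. The slides do not say which oversampling method was used; this sketch assumes plain random oversampling with replacement, applied to the training portion only so that the test set keeps the original class balance. The function name `split_then_oversample` and the demo labels are hypothetical.

```python
import random

def split_then_oversample(labels, train_frac=0.8, seed=0):
    """Split indices into train/test, then balance the training classes."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train_idx, test_idx = idx[:cut], idx[cut:]

    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)

    # Assumption: random oversampling, i.e. draw minority rows with
    # replacement until both classes have the same count.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = train_idx + extra
    rng.shuffle(balanced)
    return balanced, test_idx

# Demo with the class counts reported in the deck: 744 problem, 9442 not.
labels = [1] * 744 + [0] * 9442
train_idx, test_idx = split_then_oversample(labels)
n_pos = sum(labels[i] for i in train_idx)
print(len(train_idx), "training rows,", n_pos, "of them positive;",
      len(test_idx), "test rows")
```

Oversampling before the split would leak copies of the same company into both sets, which is why the sketch splits first.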
DNN:
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex relationships between the variables
DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
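The 5-fold cross-validation mentioned above can be illustrated with a small index generator. This is a generic sketch, not the authors' pipeline: the slides do not say whether the folds were drawn from the training portion only, and `k_fold_indices` is a hypothetical helper name.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation

# Demo on 100 hypothetical rows: each row is used for validation exactly once.
splits = list(k_fold_indices(100, k=5))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print("fold", fold_number, "train:", len(train), "validation:", len(validation))
```

Each model is fit on four folds and scored on the fifth, and the five scores are then averaged.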
how can we evaluate performance?
- accuracy (% of correct predictions overall)
- recall (% of problems predicted to be problems)
- precision (% of predicted problems that are problems)
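The three metrics can be computed from the confusion counts as in the sketch below, which also shows why accuracy alone is misleading with a ~93%/7% class split. The function name is hypothetical, and the "flag nobody" classifier is a made-up baseline, not one of the models in the deck.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, recall, precision) for binary labels, 1 = problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # problems caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # problems missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly cleared
    accuracy = (tp + tn) / len(pairs)
    # Return 0.0 when a ratio is undefined (no actual or no predicted problems).
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, recall, precision

# Baseline that never flags anyone, on the class counts reported in the deck.
y_true = [1] * 744 + [0] * 9442
never_flag = [0] * len(y_true)
acc, rec, prec = classification_metrics(y_true, never_flag)
print("accuracy:", round(acc, 3), "recall:", rec, "precision:", prec)

# A tiny mixed example.
small = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

The baseline reaches high accuracy while catching no problem contractors at all, which is why recall and precision are the more informative numbers here.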
results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra computational cost of DNN is worth it
to do:
- improve recall
  - 0.58 w/ TNN
  - 0.26 w/ DNN
- change the law
  - must allow gov not to contract w/ high-risk companies
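To weigh the precision gain against the recall loss, the reported figures can be combined as in the sketch below. The F1 score is not reported in the slides; it is derived here only from the precision and recall values given above. The base rate uses the overall share of problem contracts (744 of 10186), which is an assumption, since the share in the actual test sets may differ.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) as reported in the deck.
reported = {"TNN": (0.24, 0.58), "DNN": (0.79, 0.26)}

# Precision expected from flagging contracts at random (assumed base rate).
base_rate = 744 / 10186

for name, (precision, recall) in reported.items():
    print(name,
          "F1:", round(f1_score(precision, recall), 2),
          "precision / base rate:", round(precision / base_rate, 1))
```

On these numbers both models do much better than random flagging, but the DNN's higher precision comes with less than half the TNN's recall, which is consistent with recall being listed as the main item to improve.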
Ting Sun
[email protected]
Leonardo Sales
[email protected]
@tmarzagao
thiagomarzagao.com