Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.3k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
1k
SICSS presentation
thiagomarzagao
0
1k
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
1.9k
mineração de dados
thiagomarzagao
0
2.6k
mineração de dados no governo
thiagomarzagao
1
3.2k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
300
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.2k
Mineração de Dados no Governo Federal
thiagomarzagao
0
130
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
90
Other Decks in Research
See All in Research
診断前の病歴テキストを対象としたLLMによるエンティティリンキング精度検証
hagino3000
1
130
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
kurita
0
170
GPUを利用したStein Particle Filterによる点群6自由度モンテカルロSLAM
takuminakao
0
250
投資戦略202508
pw
0
560
とあるSREの博士「過程」 / A Certain SRE’s Ph.D. Journey
yuukit
10
4.2k
Combinatorial Search with Generators
kei18
0
750
論文読み会 SNLP2025 Learning Dynamics of LLM Finetuning. In: ICLR 2025
s_mizuki_nlp
0
200
Delta Airlines® Customer Care in the U.S.: How to Reach Them Now
bookingcomcustomersupportusa
0
110
AI エージェントを活用した研究再現性の自動定量評価 / scisci2025
upura
1
150
時系列データに対する解釈可能な 決定木クラスタリング
mickey_kubo
2
930
20250725-bet-ai-day
cipepser
2
420
不確実性下における目的と手段の統合的探索に向けた連続腕バンディットの応用 / iot70_gp_rff_mab
monochromegane
2
160
Featured
See All Featured
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
31
2.2k
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.4k
How to train your dragon (web standard)
notwaldorf
96
6.2k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
Six Lessons from altMBA
skipperchong
28
4k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
Unsuck your backbone
ammeep
671
58k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
61k
The Straight Up "How To Draw Better" Workshop
denniskardys
236
140k
Context Engineering - Making Every Token Count
addyosmani
1
30
It's Worth the Effort
3n
187
28k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
13k
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun
[email protected]
Leonardo Sales
[email protected]
Observatory of Public Spending @tmarzagao thiagomarzagao.com