Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.4k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
1.1k
SICSS presentation
thiagomarzagao
0
1k
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
1.9k
mineração de dados
thiagomarzagao
0
2.6k
mineração de dados no governo
thiagomarzagao
1
3.3k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
310
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.2k
Mineração de Dados no Governo Federal
thiagomarzagao
0
130
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
96
Other Decks in Research
See All in Research
Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification
satai
3
340
[論文紹介] Intuitive Fine-Tuning
ryou0634
0
160
HoliTracer:Holistic Vectorization of Geographic Objects from Large-Size Remote Sensing Imagery
satai
3
370
まずはここから:Overleaf共同執筆・CopilotでAIコーディング入門・Codespacesで独立環境
matsui_528
3
950
「リアル×スキマ時間」を活用したUXリサーチ 〜新規事業を前に進めるためのUXリサーチプロセスの設計〜
techtekt
PRO
0
210
J-RAGBench: 日本語RAGにおける Generator評価ベンチマークの構築
koki_itai
0
1.1k
Unsupervised Domain Adaptation Architecture Search with Self-Training for Land Cover Mapping
satai
3
430
SNLP2025:Can Language Models Reason about Individualistic Human Values and Preferences?
yukizenimoto
0
230
snlp2025_prevent_llm_spikes
takase
0
420
[IBIS 2025] 深層基盤モデルのための強化学習驚きから理論にもとづく納得へ
akifumi_wachi
19
9k
説明可能な機械学習と数理最適化
kelicht
2
770
生成AI による論文執筆サポート・ワークショップ ─ サーベイ/リサーチクエスチョン編 / Workshop on AI-Assisted Paper Writing Support: Survey/Research Question Edition
ks91
PRO
0
130
Featured
See All Featured
Rails Girls Zürich Keynote
gr2m
95
14k
[RailsConf 2023] Rails as a piece of cake
palkan
58
6.2k
ラッコキーワード サービス紹介資料
rakko
0
1.8M
Building Experiences: Design Systems, User Experience, and Full Site Editing
marktimemedia
0
330
Making the Leap to Tech Lead
cromwellryan
135
9.7k
Context Engineering - Making Every Token Count
addyosmani
9
550
JAMstack: Web Apps at Ludicrous Speed - All Things Open 2022
reverentgeek
1
290
KATA
mclloyd
PRO
33
15k
How to optimise 3,500 product descriptions for ecommerce in one day using ChatGPT
katarinadahlin
PRO
0
3.4k
Beyond borders and beyond the search box: How to win the global "messy middle" with AI-driven SEO
davidcarrasco
0
22
WCS-LA-2024
lcolladotor
0
390
Building a Scalable Design System with Sketch
lauravandoore
463
34k
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun
[email protected]
Leonardo Sales
[email protected]
Observatory of Public Spending @tmarzagao thiagomarzagao.com