Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.4k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
1.1k
SICSS presentation
thiagomarzagao
0
1.1k
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
2k
mineração de dados
thiagomarzagao
0
2.7k
mineração de dados no governo
thiagomarzagao
1
3.3k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
310
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.3k
Mineração de Dados no Governo Federal
thiagomarzagao
0
140
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
97
Other Decks in Research
See All in Research
2026年1月の生成AI領域の重要リリース&トピック解説
kajikent
0
810
CyberAgent AI Lab研修 / Social Implementation Anti-Patterns in AI Lab
chck
6
4k
20251023_くまもと21の会例会_「車1割削減、渋滞半減、公共交通2倍」をめざして.pdf
trafficbrain
0
190
2026年3月1日(日)福島「除染土」の公共利用をかんがえる
atsukomasano2026
0
440
学習型データ構造:機械学習を内包する新しいデータ構造の設計と解析
matsui_528
6
3.8k
FUSE-RSVLM: Feature Fusion Vision-Language Model for Remote Sensing
satai
3
210
[IBIS 2025] 深層基盤モデルのための強化学習驚きから理論にもとづく納得へ
akifumi_wachi
20
9.8k
大規模言語モデルにおけるData-Centric AIと合成データの活用 / Data-Centric AI and Synthetic Data in Large Language Models
tsurubee
1
520
離散凸解析に基づく予測付き離散最適化手法 (IBIS '25)
taihei_oki
1
720
衛星×エッジAI勉強会 衛星上におけるAI処理制約とそ取組について
satai
4
320
An Open and Reproducible Deep Research Agent for Long-Form Question Answering
ikuyamada
0
340
2026 東京科学大 情報通信系 研究室紹介 (大岡山)
icttitech
0
690
Featured
See All Featured
A Tale of Four Properties
chriscoyier
163
24k
How to Align SEO within the Product Triangle To Get Buy-In & Support - #RIMC
aleyda
1
1.4k
How to Ace a Technical Interview
jacobian
281
24k
How People are Using Generative and Agentic AI to Supercharge Their Products, Projects, Services and Value Streams Today
helenjbeal
1
140
Building Adaptive Systems
keathley
44
3k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
780
Context Engineering - Making Every Token Count
addyosmani
9
740
Building Applications with DynamoDB
mza
96
7k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
14k
The AI Revolution Will Not Be Monopolized: How open-source beats economies of scale, even for LLMs
inesmontani
PRO
3
3.1k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
31
3.1k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
17k
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun
[email protected]
Leonardo Sales
[email protected]
Observatory of Public Spending @tmarzagao thiagomarzagao.com