Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.3k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
1k
SICSS presentation
thiagomarzagao
0
970
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
1.9k
mineração de dados
thiagomarzagao
0
2.6k
mineração de dados no governo
thiagomarzagao
1
3.2k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
290
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.1k
Mineração de Dados no Governo Federal
thiagomarzagao
0
130
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
88
Other Decks in Research
See All in Research
Computational OT #4 - Gradient flow and diffusion models
gpeyre
0
310
SSII2025 [SS1] レンズレスカメラ
ssii
PRO
2
980
Mathematics in the Age of AI and the 4 Generation University
hachama
0
170
言語モデルによるAI創薬の進展 / Advancements in AI-Driven Drug Discovery Using Language Models
tsurubee
2
380
2025/7/5 応用音響研究会招待講演@北海道大学
takuma_okamoto
1
100
Creation and environmental applications of 15-year daily inundation and vegetation maps for Siberia by integrating satellite and meteorological datasets
satai
3
130
Transparency to sustain open science infrastructure - Printemps Couperin
mlarrieu
1
190
20250605_新交通システム推進議連_熊本都市圏「車1割削減、渋滞半減、公共交通2倍」から考える地方都市交通政策
trafficbrain
0
520
Minimax and Bayes Optimal Best-arm Identification: Adaptive Experimental Design for Treatment Choice
masakat0
0
120
Weekly AI Agents News!
masatoto
33
68k
チャッドローン:LLMによる画像認識を用いた自律型ドローンシステムの開発と実験 / ec75-morisaki
yumulab
1
480
Submeter-level land cover mapping of Japan
satai
3
130
Featured
See All Featured
Optimising Largest Contentful Paint
csswizardry
37
3.3k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.1k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
Gamification - CAS2011
davidbonilla
81
5.4k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
18
980
Building a Modern Day E-commerce SEO Strategy
aleyda
42
7.4k
GitHub's CSS Performance
jonrohan
1031
460k
YesSQL, Process and Tooling at Scale
rocio
173
14k
RailsConf 2023
tenderlove
30
1.1k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
29
9.6k
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun
[email protected]
Leonardo Sales
[email protected]
Observatory of Public Spending @tmarzagao thiagomarzagao.com