Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.3k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
970
SICSS presentation
thiagomarzagao
0
940
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
1.8k
mineração de dados
thiagomarzagao
0
2.5k
mineração de dados no governo
thiagomarzagao
1
3.1k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
280
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.1k
Mineração de Dados no Governo Federal
thiagomarzagao
0
120
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
79
Other Decks in Research
See All in Research
Dynamic World, Near real-time global 10 m land use land cover mapping
satai
3
110
IM2024
mamoruk
0
250
Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment
satai
3
180
打率7割を実現する、プロダクトディスカバリーの7つの極意(pmconf2024)
geshi0820
0
410
Weekly AI Agents News! 12月号 論文のアーカイブ
masatoto
0
250
SI-D案内資料_京都文教大学
ryojitakeuchi1116
0
230
AIトップカンファレンスからみるData-Centric AIの研究動向 / Research Trends in Data-Centric AI: Insights from Top AI Conferences
tsurubee
3
2.1k
Poster: Feasibility of Runtime-Neutral Wasm Instrumentation for Edge-Cloud Workload Handover
chikuwait
0
430
インドネシアのQA事情を紹介するの
yujijs
0
170
作業記憶の発達的特性が言語獲得の臨界期を形成する(NLP2025)
chemical_tree
1
370
NLP2025SharedTask翻訳部門
moriokataku
0
260
eAI (Engineerable AI) プロジェクトの全体像 / Overview of eAI Project
ishikawafyu
0
420
Featured
See All Featured
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
356
30k
How to Think Like a Performance Engineer
csswizardry
22
1.5k
Into the Great Unknown - MozCon
thekraken
36
1.7k
Faster Mobile Websites
deanohume
306
31k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
4
470
Build your cross-platform service in a week with App Engine
jlugia
229
18k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
25k
Speed Design
sergeychernyshev
28
870
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Automating Front-end Workflow
addyosmani
1369
200k
Stop Working from a Prison Cell
hatefulcrawdad
268
20k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
8
700
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun tsun9920@gmail.com Leonardo Sales leonado.sales@cgu.gov.br
Observatory of Public Spending @tmarzagao thiagomarzagao.com