Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Predicting irregularities in public bidding: an...
Search
Thiago Marzagão
May 28, 2017
Research
0
3.2k
Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Tweet
Share
More Decks by Thiago Marzagão
See All by Thiago Marzagão
Aula inagural na ENAP
thiagomarzagao
0
940
SICSS presentation
thiagomarzagao
0
920
antitrust uses and misuses (in the age of Big Data)
thiagomarzagao
1
1.8k
mineração de dados
thiagomarzagao
0
2.5k
mineração de dados no governo
thiagomarzagao
1
3.1k
Using AI to fight corruption in the Brazilian government
thiagomarzagao
0
270
Uso de Técnicas de Mineração de Dados no Monitoramento dos Gastos Públicos e no Combate à Corrupção
thiagomarzagao
0
3.1k
Mineração de Dados no Governo Federal
thiagomarzagao
0
120
Classificação Automatizada de Produtos e Serviços Licitados
thiagomarzagao
0
76
Other Decks in Research
See All in Research
情報処理学会関西支部2024年度定期講演会「自然言語処理と大規模言語モデルの基礎」
ksudoh
10
2.3k
ナレッジプロデューサーとしてのミドルマネージャー支援 - MIMIGURI「知識創造室」の事例の考察 -
chiemitaki
0
160
ソフトウェア研究における脅威モデリング
laysakura
0
1.2k
Tietovuoto Social Design Agency (SDA) -trollitehtaasta
hponka
0
3.3k
marukotenant01/tenant-20240916
marketing2024
0
650
精度を無視しない推薦多様化の評価指標
kuri8ive
1
340
LLM時代にLabは何をすべきか聞いて回った1年間
hargon24
1
590
医療支援AI開発における臨床と情報学の連携を円滑に進めるために
moda0
0
140
論文読み会 KDD2024 | Relevance meets Diversity: A User-Centric Framework for Knowledge Exploration through Recommendations
cocomoff
0
140
研究を支える拡張性の高い ワークフローツールの提案 / Proposal of highly expandable workflow tools to support research
linyows
0
250
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
sansan_randd
1
420
ベイズ的方法に基づく統計的因果推論の基礎
holyshun
0
710
Featured
See All Featured
Why Our Code Smells
bkeepers
PRO
335
57k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
120k
Documentation Writing (for coders)
carmenintech
67
4.5k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
226
22k
Embracing the Ebb and Flow
colly
84
4.5k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
232
17k
RailsConf 2023
tenderlove
29
970
Understanding Cognitive Biases in Performance Measurement
bluesmoon
27
1.5k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
45
2.3k
Gamification - CAS2011
davidbonilla
80
5.1k
Thoughts on Productivity
jonyablonski
68
4.4k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
28
2.2k
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
Government contractor doesn’t pay employees Default epidemy in the federal
government: 4 companies went bankrupt Construction company abandons 3 projects Observatory of Public Spending
Observatory of Public Spending what if we could predict which
contractors will become headaches?
Observatory of Public Spending
Observatory of Public Spending impossible to do manually ~25k new
contracts every year
Observatory of Public Spending
Observatory of Public Spending data + neural networks = predictions
Observatory of Public Spending data: - n = 10186 -
9442 (~93%) not problem - 744 (~ 7%) problem - 2011-2016
Observatory of Public Spending data: - Y: has the company
been punished before?
Observatory of Public Spending data: - X: a total of
183 attributes, like: - # of employees - average salary of employees - # of auctions it participated - donated $ to politicians? - …
Observatory of Public Spending neural networks: - two approaches: -
(“traditional”) neural network - deep neural network
Observatory of Public Spending TNN: - 2 hidden layers -
can’t handle 183 attributes - hence must use PCA first
Observatory of Public Spending TNN: - PCA - selected 24
continuous variables based on covariance matrix - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
Observatory of Public Spending TNN: - 9 components + 21
binary vars. - 80% training - w/ oversampling - 20% testing - boosting (10 models)
Observatory of Public Spending DNN: - 3 hidden layers -
hundreds of neurons - can handle all 183 variables - can handle complex relationships between the variables
Observatory of Public Spending DNN: - all 183 variables (no
PCA) - no oversampling - 80% training - 20% testing - 5-fold cross-validation
Observatory of Public Spending
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending how can we evaluate performance? -
accuracy (% of correct predictions overall) - recall (% of problems predicted to be problems) - precision (% of predicted problems that are problems)
Observatory of Public Spending results: - TNN precision: 0.24 -
DNN precision: 0.79 - huge difference! extra computational cost of DNN is worth it
Observatory of Public Spending to do: - improve recall -
0.58 w/ TNN - 0.26 w/ DNN - change the law - must allow gov not to contract w/ high risk companies
Observatory of Public Spending Ting Sun
[email protected]
Leonardo Sales
[email protected]
Observatory of Public Spending @tmarzagao thiagomarzagao.com