Predicting irregularities in public bidding: an application of neural networks
Thiago Marzagão
May 28, 2017
Transcript
Predicting irregularities in public bidding: an application of neural networks
Observatory of Public Spending
- Government contractor doesn’t pay employees
- Default epidemic in the federal government: 4 companies went bankrupt
- Construction company abandons 3 projects
what if we could predict which contractors will become headaches?
impossible to do manually: ~25k new contracts every year
data + neural networks = predictions
data:
- n = 10186
  - 9442 (~93%) not problem
  - 744 (~7%) problem
- 2011-2016
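The class counts above imply a strongly imbalanced problem. As a small worked check (not something shown on the slides), the sketch below recomputes the shares and shows what a trivial "always predict not problem" rule would score. The variable names are illustrative only.

```python
# Class counts as reported on the slide.
n_total = 10186
n_not_problem = 9442
n_problem = 744

# The two classes should add up to the full sample.
assert n_not_problem + n_problem == n_total

share_problem = n_problem / n_total
share_not_problem = n_not_problem / n_total

# A trivial rule that always predicts "not problem" is correct exactly
# when the true class is "not problem".
baseline_accuracy = share_not_problem
# That rule never flags a problem contractor, so it catches none of them.
baseline_recall = 0 / n_problem

print(f"problem share:     {share_problem:.1%}")
print(f"baseline accuracy: {baseline_accuracy:.1%}")
print(f"baseline recall:   {baseline_recall:.1%}")
```

With roughly 93% of contracts in the majority class, accuracy alone says little about whether problem contractors are being found.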
data:
- Y: has the company been punished before?
data:
- X: a total of 183 attributes, like:
  - # of employees
  - average salary of employees
  - # of auctions it participated in
  - donated $ to politicians?
  - …
neural networks:
- two approaches:
  - (“traditional”) neural network (TNN)
  - deep neural network (DNN)
TNN:
- 2 hidden layers
- can’t handle 183 attributes
- hence must use PCA first
TNN:
- PCA
  - selected 24 continuous variables based on covariance matrix
  - PCA reduced 24 variables to 9 components (~70% of variance; all components w/ eigenvalue > 1)
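The "eigenvalue > 1" rule mentioned above can be sketched as follows. This is a minimal illustration, assuming the variables are standardized so that each contributes one unit of variance; the slides do not say how the data were scaled. The function name `kaiser_select` and the toy eigenvalues are made up for the example and are not the deck's actual values.

```python
def kaiser_select(eigenvalues):
    """Return (components kept, share of variance kept) under the eigenvalue > 1 rule."""
    ordered = sorted(eigenvalues, reverse=True)
    # Keep only components that explain more than one original variable's worth.
    kept = [ev for ev in ordered if ev > 1.0]
    share = sum(kept) / sum(ordered)
    return len(kept), share


# Hypothetical eigenvalues for 6 standardized variables (they sum to 6).
toy_eigenvalues = [2.5, 1.4, 1.1, 0.5, 0.3, 0.2]
n_kept, variance_share = kaiser_select(toy_eigenvalues)
print(n_kept, round(variance_share, 2))  # 3 0.83
```

In the deck's case the same rule kept 9 of 24 components, covering about 70% of the variance.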
TNN:
- 9 components + 21 binary vars.
- 80% training
  - w/ oversampling
- 20% testing
- boosting (10 models)
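One way to combine an 80/20 split with oversampling is sketched below. It assumes random oversampling with replacement, applied to the training rows only and continued until both classes are the same size; the slides do not state which oversampling method or target ratio was used. The helper `split_and_oversample` and the toy labels are hypothetical.

```python
import random


def split_and_oversample(labels, train_share=0.8, seed=0):
    """Split row indices 80/20, then balance the training part by resampling label-1 rows."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    cut = int(len(indices) * train_share)
    train, test = indices[:cut], indices[cut:]

    minority = [i for i in train if labels[i] == 1]
    majority = [i for i in train if labels[i] == 0]
    # Draw minority rows with replacement until both classes have equal size.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced_train = train + extra
    rng.shuffle(balanced_train)
    return balanced_train, test


# Toy labels with about 7% positives, mirroring the imbalance in the deck.
labels = [1] * 70 + [0] * 930
train_idx, test_idx = split_and_oversample(labels)
n_pos = sum(labels[i] for i in train_idx)
print(len(train_idx), n_pos, len(test_idx))
```

Keeping the test rows untouched means the reported metrics reflect the original class balance rather than the artificially balanced one.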
DNN:
- 3 hidden layers
- hundreds of neurons
- can handle all 183 variables
- can handle complex relationships between the variables
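To make the shape of such a network concrete, here is a minimal forward pass with 183 inputs and 3 hidden layers. The layer width of 200, the ReLU and sigmoid activations, and the random initialization are all assumptions for illustration; the slides only say "hundreds of neurons". The weights here are untrained, so the output is not a meaningful prediction.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create one dense layer with small random weights and zero biases."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    # Numerically stable form for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dense(x, layer, activation):
    """Apply one layer: weighted sum plus bias, then the activation."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def build_network(sizes, rng):
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict_proba(network, x):
    """Run ReLU hidden layers, then a single sigmoid output unit."""
    for layer in network[:-1]:
        x = dense(x, layer, relu)
    return dense(x, network[-1], sigmoid)[0]


rng = random.Random(0)
# 183 inputs, three hidden layers of 200 units (hypothetical width), 1 output.
network = build_network([183, 200, 200, 200, 1], rng)
x = [rng.random() for _ in range(183)]
p = predict_proba(network, x)
print(f"untrained output: {p:.3f}")
```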
DNN:
- all 183 variables (no PCA)
- no oversampling
- 80% training
- 20% testing
- 5-fold cross-validation
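The 5-fold cross-validation step can be sketched as below. The slides do not say whether the folds were drawn from the 80% training portion or from the full data set, nor whether they were stratified; this sketch simply shuffles row indices and splits them into five roughly equal folds. The function name `k_fold_indices` is hypothetical.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (training indices, validation indices) for each of k folds."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    # Deal the shuffled indices into k folds, like dealing cards.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


for fold_number, (train_idx, val_idx) in enumerate(k_fold_indices(20), start=1):
    print(fold_number, len(train_idx), len(val_idx))
```

Each row is used for validation exactly once, so the averaged score is less dependent on any single split.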
how can we evaluate performance?
- accuracy (% of correct predictions overall)
- recall (% of problems predicted to be problems)
- precision (% of predicted problems that are problems)
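The three metrics defined above can be computed directly from true and predicted labels. The sketch below uses a made-up toy example with 1 meaning "problem" and 0 meaning "not problem"; the function name `classification_metrics` is illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, recall and precision for binary labels (1 = problem)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # problems caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # problems missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly cleared
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Toy data: 4 real problems, of which 2 are caught, plus 1 false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.7, recall is 0.5 and precision is 2/3, which shows how the three numbers can diverge on the same predictions.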
results:
- TNN precision: 0.24
- DNN precision: 0.79
- huge difference! extra computational cost of DNN is worth it
to do:
- improve recall
  - 0.58 w/ TNN
  - 0.26 w/ DNN
- change the law
  - must allow the government not to contract w/ high-risk companies
Ting Sun
Leonardo Sales
@tmarzagao
thiagomarzagao.com