Making Scores with HiScore
Hakka Labs
February 13, 2015
Transcript
Making Scores with HiScore
Abe Othman
HiScore is a Python library for creating and maintaining scores
It uses a novel quasi-Kriging solution to a new methodology, supervised scoring
What are scores?
Scores are a tool for domain experts to communicate their expertise to a broad audience
[Diagram: dimensions (88, 51, 27) go into a score function, which returns a score (58)]
There is no one correct scoring function
Scores are typically developed using the dual approach
1. Select a set of basis functions: $f(\vec{x}) = \sum_i \gamma_i \phi_i(\vec{x})$
2. Adjust coefficients until things look right: $f(\vec{x}) = \sum_i \gamma_i \phi_i(\vec{x})$
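A minimal sketch of the dual approach, to make the two steps concrete. The basis functions and coefficients here are hypothetical, chosen only for illustration; they do not come from the talk.

```python
import math

# Step 1: pick basis functions phi_i over a feature vector x.
# (Hypothetical choices, for illustration only.)
basis = [
    lambda x: math.log1p(x[0]),  # diminishing returns on the first feature
    lambda x: math.sqrt(x[1]),   # diminishing returns on the second feature
]

# Step 2: hand-tuned coefficients gamma_i, adjusted "until things look right".
gamma = [10.0, 5.0]

def dual_score(x):
    """Weighted sum of basis functions: f(x) = sum_i gamma_i * phi_i(x)."""
    return sum(g * phi(x) for g, phi in zip(gamma, basis))

print(dual_score((0, 1)))  # 10*log1p(0) + 5*sqrt(1) = 5.0
```

Every change in requirements means revisiting both the basis and the coefficients, which is one way such scores end up frozen in place.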
Dual scores ossify
Walkscore problems: a score of 100, but the highest crime in SF
Supervised scoring: a primal approach
Experts start by labeling a reference set and the objects’ dimensions
Algorithm makes a scoring function that interpolates and obeys the monotone relationship
Some nice features
Monotonicity is important for score acceptance and understanding
See a mis-scored point? Add it to the reference set and re-run!
OK, but what algorithm?
Easy in one dimension
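One simple option in one dimension is piecewise-linear interpolation through the labeled points. This is a sketch under that assumption, not necessarily the method shown on the slides; the reference values are illustrative.

```python
from bisect import bisect_right

def monotone_interp_1d(points, x):
    """Piecewise-linear interpolation through (x, score) pairs.

    If the labeled scores are non-decreasing in x, the interpolant is too,
    and it passes exactly through every labeled point.
    """
    xs, ys = zip(*sorted(points))
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # xs[i-1] <= x < xs[i]
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

reference = [(0, 0), (10, 20), (50, 65)]  # illustrative (value, score) labels
print(monotone_interp_1d(reference, 5))   # halfway between 0 and 20 -> 10.0
```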
Hard in many dimensions
Failed approach: simplicial interpolation
Failed approach: B-spline product bases
Supervised Scoring with Monotone Multidimensional Splines, AAAI 2014
Curse of dimensionality!
Failed approach: RBF with monotone row generation constraints
Failed approach: Neural Networks
Success: Beliakov
Reminder: Lipschitz continuity: $|f(a) - f(b)| \le C\,|a - b|$
Monotone Lipschitz continuity
1. Project monotone Lipschitz cones from each point to generate upper and lower bounds
2. Find the sup and inf constraints from the bounding cones
3. Function value is halfway in-between the sup and inf bounds
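The three steps above can be sketched as follows. This is a rough reading of the cone construction, assuming every dimension is increasing and using the Euclidean norm of the positive part of the difference; the function names and the constant C are illustrative, and this is not HiScore's own code.

```python
import math

def _pos_norm(diff):
    """Euclidean norm of the positive part of a vector."""
    return math.sqrt(sum(max(d, 0.0) ** 2 for d in diff))

def monotone_lipschitz_interp(reference, x, C):
    """reference: dict mapping points (tuples) to scores.
    C: Lipschitz constant, assumed large enough to be consistent with the labels.
    """
    # Steps 1-2: each labeled point p projects an upper cone
    # y + C*||(x - p)_+|| and a lower cone y - C*||(p - x)_+||.
    # The binding constraints are the lowest upper cone and highest lower cone.
    upper = min(y + C * _pos_norm([a - b for a, b in zip(x, p)])
                for p, y in reference.items())
    lower = max(y - C * _pos_norm([b - a for a, b in zip(x, p)])
                for p, y in reference.items())
    # Step 3: take the midpoint of the two bounds.
    return 0.5 * (upper + lower)

ref = {(0, 0): 0, (1, 1): 10}  # illustrative labels
print(monotone_lipschitz_interp(ref, (1, 0), C=10))
```

At a labeled point both bounds collapse to the label, so the function interpolates the reference set.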
Beliakov example
Beliakov plateaux
How can we smooth and improve this?
Abandon Lipschitz, just project minimal cones from each point
HiScore solution
Using HiScore: Simplified Water Well Score
Two factors: Distance from nearest latrine and platform size
Label a reference set by taking high, middle and low values in each dimension
Distance: 0m, 10m, 50m
Size: 1SF, 25SF, 100SF
Score  Distance  Size
0      0         1
5      0         25
10     0         100
20     10        1
50     10        25
60     10        100
65     50        1
90     50        25
100    50        100
Monotone Relationship: (+, +)
import hiscore
reference_set = {(0,1): 0, (0,25): 5, (0,100): 10,
                 (10,1): 20, (10,25): 50, … }
mono_rel = [1,1]
hiscore.create(reference_set, mono_rel, minval=0, maxval=100)
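Before building a score, it can help to confirm that the labels themselves respect the monotone relationship. The sketch below runs that check on the nine labeled water well points from the slide; `monotone_violations` is a hypothetical helper written for this example, not part of the hiscore library.

```python
from itertools import combinations

# (distance, size) -> score, as labeled on the slide.
reference_set = {
    (0, 1): 0,   (0, 25): 5,   (0, 100): 10,
    (10, 1): 20, (10, 25): 50, (10, 100): 60,
    (50, 1): 65, (50, 25): 90, (50, 100): 100,
}
mono_rel = [1, 1]  # score increases in both dimensions

def monotone_violations(reference_set, mono_rel):
    """Return pairs (lo, hi) where hi is at least as good as lo in every
    dimension (after applying the signs) but is labeled with a lower score."""
    violations = []
    for p, q in combinations(reference_set, 2):
        for lo, hi in ((p, q), (q, p)):
            dominated = all(s * a <= s * b for s, a, b in zip(mono_rel, lo, hi))
            if dominated and reference_set[lo] > reference_set[hi]:
                violations.append((lo, hi))
    return violations

print(monotone_violations(reference_set, mono_rel))  # [] for these labels
```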
Complicate the model with additional factors
Avoid curse of dimensionality by building a tree
Scores with dozens of input dimensions become easy to construct and understand
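A rough sketch of the tree idea: each node scores only a few inputs, and parent nodes take sub-scores as their inputs, so no single reference set has to cover every raw factor at once. All functions, factor names and weights below are hypothetical stand-ins; in practice each node would be its own supervised score.

```python
def location_score(distance_m, platform_sf):
    # Stand-in for a two-input score over the factors from the slides.
    return min(100.0, 1.5 * distance_m + 0.25 * platform_sf)

def quality_score(clarity, flow_rate):
    # Hypothetical second sub-score over two other factors, each in [0, 100].
    return 0.5 * clarity + 0.5 * flow_rate

def overall_score(location, quality):
    # Root node: its inputs are the two sub-scores, not the four raw factors.
    return 0.6 * location + 0.4 * quality

def well_score(distance_m, platform_sf, clarity, flow_rate):
    return overall_score(
        location_score(distance_m, platform_sf),
        quality_score(clarity, flow_rate),
    )

print(well_score(50, 100, 80, 60))
```

Each node here needs labels over two inputs only, instead of one reference set spanning all four dimensions.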
Making dimensions monotone: blood pressure
[Figure: blood pressure plane with region labels (S+ > 0, S- = 0), (D+ > 0, D- = 0), (D+ = 0, D- > 0), (S+ = 0, S- > 0)]
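A sketch of how a non-monotone input can be split into monotone ones. It assumes S and D stand for systolic and diastolic pressure, and that each reading is split into its deviation above (+) and below (-) a reference level. The reference levels 120 and 80 and the helper names are assumptions for illustration, not values from the talk.

```python
def split_around(value, reference):
    """Split a reading into (above, below) deviations from a reference level.
    At most one of the two is positive."""
    return max(value - reference, 0.0), max(reference - value, 0.0)

def blood_pressure_dims(systolic, diastolic, s_ref=120.0, d_ref=80.0):
    s_plus, s_minus = split_around(systolic, s_ref)
    d_plus, d_minus = split_around(diastolic, d_ref)
    return s_plus, s_minus, d_plus, d_minus

# Every deviation makes the score worse, so all four new dimensions are
# monotone decreasing (assuming -1 marks a decreasing dimension).
mono_rel = [-1, -1, -1, -1]

print(blood_pressure_dims(140, 70))  # (20.0, 0.0, 0.0, 10.0)
```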
What do you want to score?
github.com/aothman/hiscore
$ pip install hiscore
Thanks! aothman@cs.cmu.edu