Summary
Expand taxonomies with heterogeneous, unobserved edge semantics for human-in-the-loop verification
Taxonomic Roles with Linear Maps
Large-Margin Loss with Dynamic Margins
Guarantees to Ease Human Verification
ge Semantics
Jure Leskovec
Pinterest, Stanford University
@{pinterest.com,cs.stanford.edu}
[Figure: example taxonomy with nodes BALL, NBA and SHAQ and a query node q; edge labels IS TYPE OF, IS PLAYER OF and IS LEAGUE OF under the heading IMPLICIT EDGE SEMANTICS; relatedness score s(q, NBA).]
and $\gamma(u, v, v')$ is the desired margin, defined as a function of the
child, parent and non-parent nodes.
We now derive the loss function to be minimized in order to
satisfy the large-margin constraint (5). Denote by $E(u, v, v')$ the
degree to which a non-parent node $v'$ violates the large-margin
constraint of child-parent pair $(u, v)$:
$$E(u, v, v') = \max[0,\; s(u, v') - s(u, v) + \gamma(u, v, v')]. \tag{6}$$
When the large-margin constraint is satisfied, $E(u, v, v') = 0$ and
the non-parent incurs no violation. Otherwise, $E(u, v, v') > 0$.
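To make the violation term in (6) concrete, here is a minimal sketch in Python. The function name `violation` and the numeric scores are hypothetical and chosen only for illustration; the scoring function $s$ is treated as already evaluated.

```python
def violation(score_parent: float, score_non_parent: float, margin: float) -> float:
    """Hinge-style violation: max(0, s(u, v') - s(u, v) + margin)."""
    return max(0.0, score_non_parent - score_parent + margin)


# Constraint satisfied: the parent outscores the non-parent by more than the margin.
satisfied = violation(score_parent=5.0, score_non_parent=2.0, margin=2.0)

# Constraint violated: the score gap (1.0) is smaller than the margin (2.0).
violated = violation(score_parent=3.0, score_non_parent=2.0, margin=2.0)

print(satisfied, violated)
```

In the second case the shortfall relative to the margin is exactly the reported violation.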
The overall loss function $L(T)$ is the total violation of the large-margin
constraints by the non-parents corresponding to every
child-parent pair $(u, v)$:
$$L(T) = \sum_{(u, v) \in E} \; \sum_{v' \in V \setminus H(u)} E(u, v, v'). \tag{7}$$
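The double sum in (7) can be sketched as two nested loops. This is an illustrative sketch, not the authors' implementation: it assumes that H(u) is the set of true parents of u (the excerpt does not define it), and the names `total_loss`, `toy_scores`, and the constant margin are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Node = Hashable


def total_loss(
    edges: List[Tuple[Node, Node]],          # (child, parent) pairs
    nodes: List[Node],                       # node set V
    parents: Dict[Node, Set[Node]],          # assumed H(u): true parents of u
    score: Callable[[Node, Node], float],    # s(u, v)
    margin: Callable[[Node, Node, Node], float],  # margin as a function of (u, v, v')
) -> float:
    """Sum the hinge violations over every edge and every non-parent."""
    loss = 0.0
    for u, v in edges:
        for v_prime in nodes:
            if v_prime in parents[u]:
                continue  # only nodes outside H(u) contribute
            loss += max(0.0, score(u, v_prime) - score(u, v) + margin(u, v, v_prime))
    return loss


# Toy taxonomy: "a" and "b" are both children of "root".
toy_nodes = ["root", "a", "b"]
toy_edges = [("a", "root"), ("b", "root")]
toy_parents = {"root": set(), "a": {"root"}, "b": {"root"}}
toy_scores = {
    ("a", "root"): 2.0, ("a", "a"): 0.0, ("a", "b"): 1.5,
    ("b", "root"): 3.0, ("b", "a"): 0.0, ("b", "b"): 0.0,
}

toy_loss = total_loss(
    toy_edges,
    toy_nodes,
    toy_parents,
    score=lambda u, v: toy_scores[(u, v)],
    margin=lambda u, v, v_prime: 1.0,  # constant margin, for illustration only
)
print(toy_loss)
```

Only the pair ("a", "b") falls short of the margin here, so it is the only term that contributes to the total.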
The node embeddings $w$ and linear maps $P_1, \ldots, P_k$ are jointly
trained to minimize $L(T)$ via gradient descent. Given the trained
parameters and a query node $q \notin V$ having feature vector $e_q$,
predictions are made by ranking the taxonomy nodes $v$ in decreasing
order of their taxonomic relatedness $s(q, v)$.
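The prediction step amounts to sorting candidate nodes by score. The sketch below assumes the relatedness scores for a query have already been computed by the trained model; `rank_parents` and the score values are hypothetical.

```python
from typing import Callable, Hashable, List

Node = Hashable


def rank_parents(
    query: Node,
    nodes: List[Node],
    score: Callable[[Node, Node], float],
) -> List[Node]:
    """Return taxonomy nodes in decreasing order of relatedness to the query."""
    return sorted(nodes, key=lambda v: score(query, v), reverse=True)


# Hypothetical relatedness scores for a query node "q".
query_scores = {"root": 0.1, "a": 0.9, "b": 0.4}
ranking = rank_parents("q", ["root", "a", "b"], lambda q, v: query_scores[v])
print(ranking)
```

The first element of the ranking is the highest-ranked predicted parent.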
Using the fact that
pairs and their cor
L(T)
Thus, minimizin
on the sum of shor
predictions and tr
dicted parent node
truth taxonomy. In
ages non-parent n
be scored relatively
This guarantee
experts; if A
node, the taxonom
around the predic
find the correct pa
learned from the data [19].
We propose a principled dynamic margin func
no tuning, learning or heuristics. We relate th
shortest-path distances in the taxonomy between
true parent nodes. Denote by d(·, ·) the undirec
distance between two nodes in the taxonomy. W
theorem, we bound the undirected shortest-path
the highest-ranked predicted parent $\hat{v}(u)$ = arg
any true parent for every child node u:
Proposition 1. When $\gamma(u, v, v') = d(v, v')$, L
bound on the sum of the undirected shortest-path
the highest-ranked predicted parents and true par
$$\sum_{(u, v) \in E} d(v, \hat{v}(u)) \le L(T).$$
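The bound can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not a proof: it uses a toy tree in which every child has exactly one parent, takes H(u) to be the set of true parents of u, sets the margin to the undirected shortest-path distance, and draws scores at random. All names (`undirected_distances`, `check_bound`, the toy nodes) are hypothetical.

```python
import random
from collections import deque


def undirected_distances(nodes, edges):
    """All-pairs undirected shortest-path distances via BFS from each node."""
    adjacency = {n: set() for n in nodes}
    for child, parent in edges:
        adjacency[child].add(parent)
        adjacency[parent].add(child)
    dist = {}
    for source in nodes:
        dist[source] = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in dist[source]:
                    dist[source][neighbor] = dist[source][current] + 1
                    queue.append(neighbor)
    return dist


def check_bound(nodes, edges, scores):
    """Return (sum of d(v, predicted parent), loss with margin d(v, v'))."""
    dist = undirected_distances(nodes, edges)
    parents = {n: set() for n in nodes}
    for child, parent in edges:
        parents[child].add(parent)
    distance_sum, loss = 0, 0.0
    for u, v in edges:
        # Highest-ranked predicted parent: argmax of the score over all nodes.
        predicted = max(nodes, key=lambda c: scores[(u, c)])
        distance_sum += dist[v][predicted]
        for v_prime in nodes:
            if v_prime not in parents[u]:
                loss += max(0.0, scores[(u, v_prime)] - scores[(u, v)] + dist[v][v_prime])
    return distance_sum, loss


# Toy tree: a -> root, b -> root, c -> a (child -> parent).
toy_nodes = ["root", "a", "b", "c"]
toy_edges = [("a", "root"), ("b", "root"), ("c", "a")]

rng = random.Random(0)
all_hold = True
for _ in range(100):
    scores = {(u, v): rng.uniform(-1.0, 1.0) for u in toy_nodes for v in toy_nodes}
    distance_sum, loss = check_bound(toy_nodes, toy_edges, scores)
    all_hold = all_hold and (distance_sum <= loss + 1e-9)
print(all_hold)
```

The check passes for the following reason: whenever the top-ranked node is a non-parent $v'$, it scores at least as high as the true parent $v$, so its violation term alone is at least $d(v, v')$.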