Summary

Expand taxonomies with heterogeneous, unobserved edge semantics for human-in-the-loop verification

Taxonomic Roles with Linear Maps

Large-Margin Loss with Dynamic Margins

Guarantees to Ease Human Verification

ge Semantics

Jure Leskovec

interest, Stanford University

@{pinterest.com,cs.stanford.edu}

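[Figure: example taxonomy with nodes BALL, NBA and SHAQ and a query node q. Panel title: IMPLICIT EDGE SEMANTICS. Edge labels: IS-TYPE-OF, IS-PLAYER-OF, IS-LEAGUE-OF. Other labels: e, w, M, x, P (three copies), K, s(q, NBA).]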

and $\gamma(u, v, v')$ is the desired margin, defined as a function of the child, parent, and non-parent nodes.

We now derive the loss function to be minimized in order to satisfy the large-margin constraint (5). Denote by $E(u, v, v')$ the degree to which a non-parent node $v'$ violates the large-margin constraint of child-parent pair $(u, v)$:

$$E(u, v, v') = \max\left[0,\; s(u, v') - s(u, v) + \gamma(u, v, v')\right]. \tag{6}$$

When the large-margin constraint is satisfied, $E(u, v, v') = 0$ and the non-parent incurs no violation. Otherwise, $E(u, v, v') > 0$.

The overall loss function $L(T)$ is the total violation of the large-margin constraints by the non-parents corresponding to every child-parent pair $(u, v)$:

$$L(T) = \sum_{(u, v) \in E} \; \sum_{v' \in V \setminus H(u)} E(u, v, v'). \tag{7}$$
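As a rough illustration of equations (6) and (7), the sketch below sums the hinge violations over every child-parent pair and every non-parent. It assumes the scores $s(u, v)$ are already available as a dense matrix and that the margin is supplied as a callable. The names `large_margin_loss`, `scores`, `edges` and `margin` are hypothetical and not from the paper, and $H(u)$ is taken to be the set of parents of $u$, as the surrounding text suggests.

```python
import numpy as np

def large_margin_loss(scores, edges, margin):
    """Total large-margin violation, following equations (6) and (7).

    scores[u, v] holds s(u, v) for integer node ids (assumed precomputed).
    edges is a list of (child, parent) pairs.
    margin(u, v, v_neg) returns the desired margin for that triple.
    """
    n = scores.shape[0]
    # H(u): assumed here to be the set of parents of child u.
    parents = {}
    for u, v in edges:
        parents.setdefault(u, set()).add(v)

    total = 0.0
    for u, v in edges:
        for v_neg in range(n):
            # Read literally, V \ H(u) also contains u itself; we keep it.
            if v_neg in parents[u]:
                continue
            violation = scores[u, v_neg] - scores[u, v] + margin(u, v, v_neg)
            total += max(0.0, violation)
    return total

# Toy example: node 0 is the parent of nodes 1 and 2, constant margin of 1.
scores = np.array([[0.0, 0.0, 0.0],
                   [2.0, 0.5, 1.5],
                   [0.0, 0.0, 0.0]])
edges = [(1, 0), (2, 0)]
loss = large_margin_loss(scores, edges, lambda u, v, v_neg: 1.0)
print(loss)
```

The double loop is written for clarity rather than speed; a practical implementation would presumably sample non-parents or vectorize the inner sum.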

The node embeddings $w$ and linear maps $P_1, \dots, P_k$ are jointly trained to minimize $L(T)$ via gradient descent. Given the trained parameters and a query node $q \notin V$ having feature vector $e_q$, predictions are made by ranking the taxonomy nodes $v$ in decreasing order of their taxonomic relatedness $s(q, v)$.
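The prediction step can be sketched as a sort over relatedness scores. The snippet assumes the scores $s(q, v)$ of a query against every taxonomy node have already been computed by the trained model; `rank_candidate_parents` and its arguments are hypothetical names used only for illustration.

```python
import numpy as np

def rank_candidate_parents(query_scores, node_ids):
    """Return node ids sorted by decreasing relatedness to the query.

    query_scores[i] is assumed to hold s(q, v_i) for node_ids[i].
    """
    # Negate so that argsort's ascending order gives descending scores;
    # a stable sort keeps the original order among ties.
    order = np.argsort(-np.asarray(query_scores, dtype=float), kind="stable")
    return [node_ids[i] for i in order]

ranking = rank_candidate_parents([0.1, 0.9, 0.4], ["a", "b", "c"])
print(ranking)
```

The first entry of the returned list plays the role of the highest-ranked predicted parent.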

Using the fact that

pairs and their cor

L(T)

Thus, minimizin

on the sum of shor

predictions and tr

dicted parent node

truth taxonomy. In

ages non-parent n

be scored relatively

This guarantee

experts; if A

node, the taxonom

around the predic

find the correct pa

learned from the data [19].

We propose a principled dynamic margin func

no tuning, learning or heuristics. We relate th

shortest-path distances in the taxonomy between

true parent nodes. Denote by d(·, ·) the undirec

distance between two nodes in the taxonomy. W

theorem, we bound the undirected shortest-path

the highest-ranked predicted parent ˆ(u) = arg

any true parent for every child node u:

P 1. When $\gamma(u, v, v') = d(v, v')$, L

bound on the sum of the undirected shortest-path

the highest-ranked predicted parents and true par

$$\sum_{(u, v) \in E} d(v, \hat{v}(u)) \le L(T).$$
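One way to obtain the undirected shortest-path distance $d(\cdot, \cdot)$ used as the dynamic margin is a breadth-first search over the taxonomy with edge directions ignored. This is a minimal sketch assuming an unweighted taxonomy given as (child, parent) pairs; `undirected_distances` is a hypothetical helper, not the authors' implementation.

```python
from collections import deque

def undirected_distances(edges, source):
    """Hop distances from source to every reachable node, ignoring direction."""
    adjacency = {}
    for child, parent in edges:
        adjacency.setdefault(child, set()).add(parent)
        adjacency.setdefault(parent, set()).add(child)

    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Toy taxonomy: "a" is the root, "b" and "c" are its children, "d" is under "b".
toy_edges = [("b", "a"), ("c", "a"), ("d", "b")]
from_b = undirected_distances(toy_edges, "b")
print(from_b)
```

With the true parent as the source, `from_b[v_neg]` would serve as the margin for a non-parent `v_neg`, so non-parents far from the true parent are required to score lower than nearby ones.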