
Metric Recovery from Unweighted k-NN Graphs

joisino
June 01, 2023


An introduction to
- Towards Principled User-side Recommender Systems (CIKM 2022) https://arxiv.org/abs/2208.09864
- Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure (ICML 2023) https://arxiv.org/abs/2301.10956
- and the technique underlying both.


Transcript

  1. I introduce my favorite topic and its applications

      Metric recovery from unweighted k-NN graphs is my recent favorite technique. I like this technique because the scope of applications is broad and the results are simple but non-trivial.
      I first introduce this problem.
      I then introduce my recent projects that used this technique.
     - Towards Principled User-side Recommender Systems (CIKM 2022)
     - Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure (ICML 2023)
  2. Metric Recovery from Unweighted k-NN Graphs

     Morteza Alamgir, Ulrike von Luxburg. Shortest path distance in random k-nearest neighbor graphs. ICML 2012.
     Tatsunori Hashimoto, Yi Sun, Tommi Jaakkola. Metric recovery from directed unweighted graphs. AISTATS 2015.
  3. A k-NN graph is generated from a point cloud

      We generate a k-NN graph from a point cloud.
      Then, we discard the coordinates of the nodes.
     (Figure: generate edges, then discard coordinates; the plotted node positions are random and only for visualization.)
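
     As a concrete version of this setup, here is a minimal sketch: sample a hidden point cloud, build the k-NN graph, then throw the coordinates away. The parameters n, d, and k are illustrative, not values from the papers.

```python
# Minimal sketch of the setup: hidden points -> k-NN graph -> 0/1 adjacency.
import numpy as np
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
n, d, k = 1000, 2, 10
X = rng.uniform(size=(n, d))  # hidden latent coordinates

# 0/1 adjacency: A[i, j] = 1 iff j is one of i's k nearest neighbors
# (the graph is directed; the self-loop is excluded by default).
A = kneighbors_graph(X, n_neighbors=k, mode="connectivity").toarray()

del X  # from here on, only A is observable
```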
  4. Metric recovery asks us to estimate the coordinates

      The original coordinates are now hidden.
      Metric recovery from unweighted k-NN graphs is the problem of estimating the coordinates from the k-NN graph.
  5. Only the existence of edges is observable

      "Unweighted" means the edge lengths are not available either.
      This is equivalent to the setting where only the 0/1 adjacency matrix of the k-NN graph is available.
  6. Given the 0/1 adjacency matrix, estimate the coordinates

      Problem (Metric Recovery from Unweighted k-NN Graphs)
     In: the 0/1 adjacency matrix of a k-NN graph.
     Out: the latent coordinates of the nodes.
      Very simple.
  7. Standard node embedding methods fail

      This problem is a node embedding problem, i.e., in: a graph; out: node embeddings.
      However, the following example shows that standard embedding techniques fail.
  8. Distances are opposite in the graph and the latent space

      The shortest-path distance between nodes A and B is 21, while the shortest-path distance between nodes A and C is 18.
      Standard node embedding methods would therefore embed node C closer to A than node B, which is inconsistent with the ground-truth latent coordinates.
     (Figure: a 10-NN graph; the coordinates are supposed to be hidden but are shown for illustration.)
  9. A critical assumption does not hold

      Embedding nodes that are close in the input graph close to each other is the critical assumption behind various embedding methods.
      This assumption does NOT hold in our situation.
     (Figure: the same 10-NN graph as above.)
  10. Edge lengths are important

      Why does the previous example fail?
      If the edge lengths were taken into consideration, the shortest-path distance would be a consistent estimator of the latent distance.
      Step 1: Estimate the latent edge lengths.
     (Figure: the same 10-NN graph as above.)
  11. Densities are important

      Observation: edges are longer in sparse regions and shorter in dense regions.
      Step 2: Estimate the densities.
      But how? We do not know the coordinates of the points...
     (Figure: the same 10-NN graph as above.)
  12. The density can be estimated from PageRank

      Solution: a PageRank-like estimator solves it. The stationary distribution of random walks (plus a simple transformation) is a consistent estimator of the density.
      The higher the rank, the denser the region.
      This can be computed solely from the unweighted graph.
     (Figure: the stationary distribution of simple random walks ≈ PageRank on the 10-NN graph.)
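
     To make this step concrete, here is a minimal power-iteration sketch over the 0/1 adjacency matrix A from the setup above. The exact transformation from the stationary distribution to a density estimate is given in Hashimoto+ (AISTATS 2015) and is omitted here; the sketch only computes the stationary distribution itself.

```python
# Stationary distribution of the simple random walk, via power iteration.
import numpy as np

def stationary_distribution(A, n_iter=1000):
    """Stationary distribution of the (lazy) simple random walk on the
    graph with 0/1 adjacency matrix A. The laziness only guards against
    periodicity; it does not change the stationary distribution."""
    P = A / A.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    pi = np.full(A.shape[0], 1.0 / A.shape[0])  # uniform initialization
    for _ in range(n_iter):
        pi = 0.5 * pi + 0.5 * (pi @ P)  # one lazy random-walk step
    return pi  # per the deck, a simple transformation of pi estimates the density
```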
  13. Given the 0/1 adjacency matrix, estimate the coordinates

      Problem definition (again)
     In: the 0/1 adjacency matrix of a k-NN graph.
     Out: the latent coordinates of the nodes.
      Very simple.
  14. Procedure to estimate the coordinates

     1. Compute the stationary distribution of random walks.
     2. Estimate the density around each node.
     3. Estimate the edge lengths using the estimated densities.
     4. Compute the shortest-path distances using the estimated edge lengths and form the distance matrix.
     5. Estimate the coordinates from the distance matrix by, e.g., multidimensional scaling.
      This is a consistent estimator (up to a rigid transform) [Hashimoto+ AISTATS 2015]. A runnable sketch of the whole pipeline is given below.
     Tatsunori Hashimoto, Yi Sun, Tommi Jaakkola. Metric recovery from directed unweighted graphs. AISTATS 2015.
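
     Here is a hedged end-to-end sketch of the five steps. It uses the stationary distribution pi directly as a density proxy and the heuristic edge length ∝ density^(-1/d) (edges in sparse regions are longer), standing in for the exact estimators of Hashimoto+ (AISTATS 2015); the recovered coordinates are only defined up to a rigid transform, and the graph is assumed connected.

```python
# End-to-end sketch of the five-step recovery procedure.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.manifold import MDS

def recover_coordinates(A, d, n_iter=1000):
    """Recover latent coordinates in R^d from the 0/1 adjacency matrix A."""
    n = A.shape[0]
    # Steps 1-2: stationary distribution as a density proxy (see above).
    P = A / A.sum(axis=1, keepdims=True)
    pi = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        pi = 0.5 * pi + 0.5 * (pi @ P)
    # Step 3: heuristic edge lengths ~ density^(-1/d), via the geometric
    # mean of the endpoint density proxies.
    length = np.sqrt(pi[:, None] * pi[None, :]) ** (-1.0 / d)
    W = np.where(A > 0, length, 0.0)  # zeros are treated as non-edges below
    # Step 4: shortest-path distances on the reweighted graph.
    D = shortest_path(W, directed=False)
    # Step 5: multidimensional scaling on the distance matrix
    # (recovery is up to a rigid transform).
    return MDS(n_components=d, dissimilarity="precomputed").fit_transform(D)
```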
  15. We can recover the coordinates consistently

     Take Home Message: The latent coordinates can be consistently estimated solely from the unweighted k-NN graph.
  16. Towards Principled User-side Recommender Systems (CIKM 2022)

     Ryoma Sato. Towards Principled User-side Recommender Systems. CIKM 2022.
  17. Let's consider item-to-item recommendations

      We consider item-to-item recommendations.
      Ex: the "Products related to this item" panel on Amazon.com.
  18. User-side recsys realizes users' desiderata

      Problem: we are unsatisfied with the official recommender system.
      It provides monotonous recommendations; we need serendipity.
      It provides recommendations biased towards specific companies or countries.
      User-side recommender systems [Sato 2022] enable users to build their own recommender systems that satisfy their desiderata even when the official one does not support them.
     Ryoma Sato. Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? SDM 2022.
  19. We need powerful and principled user-side recsys

      [Sato 2022]'s user-side recommender system is realized in an ad-hoc manner, and its performance is not high.
      We need a systematic way to build more powerful user-side recommender systems, hopefully ones as strong as the official one.
     Ryoma Sato. Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? SDM 2022.
  20. Official (traditional) recommender systems

     (Diagram: ingredients (log data, catalog, auxiliary data) → Step 1: the recsys algorithm trains a recsys model → Step 2: given a source item, the model infers recommendations.)
  21. Users cannot see the data, algorithm, or model

     (Diagram: the same pipeline, with the log data, algorithm, and model highlighted.)
      These parts are not observable to users (they are industrial secrets).
  22. How can we build our recsys without them?

     (Diagram: the same pipeline.)
      But they are crucial pieces of information for building a new recsys...
  23. We assume the model is embedding-based

      (Slight) Assumption: the model embeds items and recommends nearby items. This is a common strategy in recsys.
      We do not assume how it embeds; it can be matrix factorization, neural networks, etc.
  24. We can observe the k-NN graph of the embeddings

      Observation: the outputs (the recommendation panels) carry sufficient information to construct the unweighted k-NN graph.
      I.e., users can build the k-NN graph by accessing each item page and observing what the neighboring items are.
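
     A hypothetical sketch of this observation step: visit each item page, read off the items shown in its "related items" panel, and record a directed edge to each. Here fetch_related_items is a placeholder for however one reads the panel (scraping, an API, manual collection); it is not part of the paper's code.

```python
# Assemble the 0/1 adjacency matrix of the official k-NN graph from the
# outside, one item page at a time.
import numpy as np

def build_knn_adjacency(item_ids, fetch_related_items):
    """A[i, j] = 1 iff item j appears in item i's 'related items' panel."""
    index = {item: i for i, item in enumerate(item_ids)}
    A = np.zeros((len(item_ids), len(item_ids)))
    for item in item_ids:
        for neighbor in fetch_related_items(item):  # the k panel items
            if neighbor in index:
                A[index[item], index[neighbor]] = 1.0
    return A
```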
  25. We can estimate the embeddings!

      Solution: estimate the item embeddings of the official recsys.
      They are considered secret, but we can estimate them from the unweighted k-NN graph! They contain much information!
  26. We realize our desiderata with the embeddings

      We can do many things with the estimated embeddings.
      We can compute recommendations by ourselves, with our own postprocessing.
      If you want more serendipity, recommend the 1st, 2nd, 4th, 8th, ..., and 32nd nearest items, or add noise to the embeddings.
      If you want to reduce the bias towards specific companies, add negative biases to the scores of their items so as to suppress them. (A sketch of both postprocessings follows.)
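
     A minimal sketch of these two postprocessings, applied to an estimated embedding matrix Z from the recovery pipeline. The exponential rank schedule matches the serendipity example above; the penalty array and the function name are illustrative.

```python
# User-side recommendation with serendipity and debiasing postprocessing.
import numpy as np

def user_side_recommend(Z, source, n_rec=6, penalty=None):
    """Recommend the 1st, 2nd, 4th, ..., 2^(n_rec-1)-th nearest items to
    `source` under the estimated embeddings Z, optionally subtracting a
    per-item score penalty (e.g., to suppress over-represented companies)."""
    score = -np.linalg.norm(Z - Z[source], axis=1)  # nearer = higher score
    score[source] = -np.inf                         # never recommend the source
    if penalty is not None:
        score = score - penalty                     # debiasing step
    ranking = np.argsort(-score)                    # items from best to worst
    ranks = [2 ** i - 1 for i in range(n_rec)]      # 1st, 2nd, 4th, 8th, ...
    return ranking[ranks]
```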
  27. Experiments validated the theory

      In the experiments, I conducted simulations and showed that the hidden item embeddings can be estimated accurately.
      I built a fair recsys for Twitter that runs in the real world, on the user's side. Even though the official recsys is not fair w.r.t. gender, mine is, and it is more efficient than the existing one.
  28. Users can recover the item embeddings

     Take Home Message: Users can "reverse engineer" the official item embeddings solely from the observable information.
  29. Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure (ICML 2023)

     Ryoma Sato. Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure. ICML 2023.
  30. We call for a theory of GNNs

      Graph Neural Networks (GNNs) take a graph with node features as input and output node embeddings.
      GNNs are a popular choice for various graph-related tasks.
      GNNs are so popular that understanding them theoretically is an important topic in its own right. E.g., what is the hypothesis space of GNNs? (GNNs do not have universal approximation power.) Why do GNNs work well in so many tasks?
  31. GNNs apply filters to node features

      GNNs apply filters to the input node features and extract useful features.
      The input node features have long been considered the key to success: if the features have no useful signal, GNNs will not work.
  32. Good node features are not always available

      However, informative node features are not always available.
      E.g., social network user information may be hidden for privacy reasons.
  33. Uninformative features degrade the performance

      If we have no features at hand, we usually input uninformative node features such as degree features.
      No matter how such features are filtered, only uninformative embeddings are obtained: "garbage in, garbage out." This is common sense.
  34. Can GNNs work with uninformative node features?

      The research question I want to answer in this project: do GNNs really not work when the input node features are uninformative?
      In practice, GNNs sometimes work with degree features alone. The reason is a mystery, which I want to elucidate.
  35. We assume latent node features behind the graph

      (Slight) Assumption: the graph structure is formed by connecting nodes whose latent node features z*_v are close to each other.
      The latent node features z*_v are not observable. E.g., a "true user preference vector": latent features that encode users' preferences, workplace, residence, etc. Those who have similar preferences and residences are connected. We can only observe the way the nodes are connected, not their coordinates.
  36. GNNs can recover the latent features

      Main result: GNNs can recover the latent node features z*_v even when the input node features are uninformative.
      z*_v contains the preferences of users, which is useful for downstream tasks.
  37. GNNs create useful node features themselves

      GNNs can create completely new and useful node features by absorbing information from the graph structure, even when the input node features are uninformative.
      This is a new perspective that overturns the existing view of GNNs as filters of input node features.
  38. GNNs can recover the coordinates with some tricks

      How to prove it? → Metric recovery from k-NN graphs, as you may expect.
      But be careful when applying it: what GNNs can do (the hypothesis space of GNNs) is limited.
      The metric recovery algorithm is compatible with GNNs.
     Stationary distribution → GNNs can simulate random walks.
     Shortest path → GNNs can simulate Bellman-Ford (see the sketch below).
     MDS → this is the tricky part; we send the distance matrix to some nodes and solve it locally.
      GNNs can recover the metric with slight additional errors.
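
     To make the Bellman-Ford point concrete, here is a plain-numpy sketch showing that one message-passing round with min-aggregation is exactly one Bellman-Ford relaxation. The edge weights would be the estimated lengths from the recovery pipeline; this is an illustration of why the algorithm fits the GNN hypothesis space, not the paper's construction.

```python
# Bellman-Ford as min-aggregation message passing.
import numpy as np

def bellman_ford_message_passing(W, source):
    """Shortest-path distances from `source`, where W[u, v] is the
    estimated edge length (np.inf if there is no edge u -> v)."""
    n = W.shape[0]
    dist = np.full(n, np.inf)
    dist[source] = 0.0
    for _ in range(n - 1):  # n - 1 relaxation rounds suffice
        # Each node v aggregates dist[u] + W[u, v] over its in-neighbors u
        # (one min-aggregation message-passing layer) and keeps the best.
        dist = np.minimum(dist, np.min(dist[:, None] + W, axis=0))
    return dist
```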
  39. Recovered features are empirically useful

      In the experiments, we empirically confirmed this phenomenon. The recovered features are useful for various downstream tasks, even when the input features x_syn are uninformative.
  40. GNNs can create useful features by themselves

     Take Home Message: GNNs can create useful node features by absorbing information from the underlying graph.
  41. I introduced my favorite topic and its applications

      Metric recovery from unweighted k-NN graphs is my recent favorite technique. I like this technique because the scope of applications is broad and the results are simple but non-trivial.
     Take Home Message: The latent coordinates can be consistently estimated solely from the unweighted k-NN graph.