Unsupervised Ground Metric Learning Using Wasserstein Singular Vectors

Figure 2. Illustration on the 1-D torus. (top, left) histograms whose translations form B1, B2, B3, i.e. Mi,j = h(i − j); (bottom, left) … associated to the singular vectors B1, B2, B3 for varying values of τ; (top, right) functions h1, h2, h3; (bottom, right) convergence rate of the power iterations for τ = 0.1, according to the d_H metric.

This proposition, proved in Appendix C, states that for large enough regularization, uniqueness and linear convergence are maintained.

Proposition 2.6. For τ large enough, the singular vectors are unique and the power iterations (4) converge linearly for ‖·‖. When τ → ∞, the singular vectors converge to …

… are singular vectors of (½A, ½B), with singular value ½. This proposition shows that for ε = +∞ a set of positive singular vectors is obtained as simply squared Euclidean distances over 1-D principal component embeddings of the data. Entropic regularization thus draws a link between our novel set of OT-based metric learning techniques and classical dimensionality reduction methods. This frames Sinkhorn singular vectors as a well-posed problem regardless of the value of ε.

5. Metric Learning for Single-Cell Genomics

…between precomputed Gene2Vec (Du et al., 2019) embeddings. (Huizing et al., 2021) use a Sinkhorn divergence with a cosine distance between genes (i.e. vectors of cells) as a ground cost. In the present paper we compute OT distances using the Python package POT (Flamary et al., 2021).

Dataset. A commonly analyzed scRNA-seq dataset is the "PBMC 3k" dataset produced by 10X Genomics, obtained through the function pbmc3k of Scanpy (Wolf et al., 2018).
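The alternating power iterations referenced in Proposition 2.6 can be sketched in NumPy. This is an illustrative re-implementation under stated assumptions, not the authors' code: a basic Sinkhorn loop stands in for the OT solver, max-normalization stands in for the norm used in the iterations, and the data are random toy histograms.

```python
import numpy as np

def sinkhorn_cost(a, b, C, reg=0.1, n_iter=200):
    # Entropic OT cost <P, C> between histograms a and b (basic Sinkhorn loop).
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]
    return float((P * C).sum())

def ot_distance_matrix(H, C, reg=0.1):
    # Pairwise entropic OT costs between the rows of H, with ground cost C.
    n = H.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = sinkhorn_cost(H[i], H[j], C, reg)
    return D

def power_iterations(A, B, n_iter=5, reg=0.1):
    # Alternate the two OT-distance maps, normalizing at each step
    # (max-normalization here; the paper normalizes by a norm).
    rng = np.random.default_rng(0)
    m, n = A.shape                         # m gene histograms over n cells
    C = np.abs(rng.standard_normal((n, n)))
    C = (C + C.T) / 2                      # symmetric random initial cost
    np.fill_diagonal(C, 0)
    C /= C.max()
    for _ in range(n_iter):
        D = ot_distance_matrix(A, C, reg)  # gene-gene distances (m x m)
        D /= D.max()
        C = ot_distance_matrix(B, D, reg)  # cell-cell distances (n x n)
        C /= C.max()
    return C, D

# Toy data: 6 "genes" x 5 "cells", rows of A and B normalized to histograms.
rng = np.random.default_rng(1)
X = rng.random((6, 5)) + 0.1
A = X / X.sum(axis=1, keepdims=True)
B = X.T / X.T.sum(axis=1, keepdims=True)
C, D = power_iterations(A, B)
```

In practice one monitors the change in C and D between iterations to detect convergence; at the fixed point, C and D play the role of the learned ground metrics on cells and genes.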
Details on preprocessing and cell type annotation are given in Appendix H. The processed dataset contains m = 1030 genes and n = 2043 cells, each belonging to one of 6 immune cell types: 'B cell', 'Natural Killer', 'CD4+ T cell', 'CD8+ T cell', 'Dendritic cell' and 'Monocyte'. The cell populations are heavily unbalanced. In addition, for each cell type we consider the set of canonical marker genes given by Azimuth (Hao et al., 2021), i.e. genes whose expression is characteristic of a certain cell type.

Evaluation. We use the annotation on cells (resp. on marker genes) to evaluate the quality of distances between cells (resp. between marker genes). We report in Table 1 and Table 2 the Average Silhouette Width (ASW), computed using the function silhouette_score of Scikit-learn (Pedregosa et al., 2011).

Figure: RNA-seq expression data W; singular vector on cells c; singular vector on genes d.

…and to a single-cell RNA sequencing dataset. In all cases, the ground metric learned iteratively is intuitively interpretable. In particular, the ground metric learned on biological data not only leads to improved clustering, but also encodes biologically relevant information. Theoretical perspectives include further results on the existence of positive eigenvectors, in particular for τ = 0 and for ε > 0. In addition, integrating unbalanced optimal transport [38, 9] into the method could avoid the need for the step of normalization to histograms. Applying our method to large single-cell datasets is also a promising avenue to extend the applicability of OT to new classes of problems in genomics.

Figure 9: Dataset, with genes arranged according to clustering of singular vector C.

[5] Aurélien Bellet, Amaury Habrard, and Marc Sebban. A survey on metric learning for feature vectors and structured data. arXiv preprint arXiv:1306.6709, 2013.

[6] Fethallah Benmansour, Guillaume Carlier, Gabriel Peyré, and Filippo Santambrogio.
Derivatives with respect to metrics and applications: subgradient marching algorithm. Numerische Mathematik, 116(3):357–381, 2010.

[7] Guillaume Carlier, Arnaud Dupuy, Alfred Galichon, and Yifei Sun. SISTA: learning optimal transport costs under sparsity constraints. arXiv preprint arXiv:2009.08564, 2020.

[8] Guillaume Carlier, Vincent Duval, Gabriel Peyré, and Bernhard Schmitzer. Convergence of entropic schemes for optimal transport and gradient flows. SIAM Journal on Mathematical Analysis, 49(2):1385–1418, 2017.

[9] Lénaïc Chizat, Gabriel Peyré, Bernhard Schmitzer, and François-Xavier Vialard. Unbalanced optimal transport: Dynamic and Kantorovich formulations. Journal of Functional Analysis, 274(11):3090–3123, 2018.

[10] Lénaïc Chizat, Pierre Roussillon, Flavien Léger, François-Xavier Vialard, and Gabriel Peyré. Faster Wasserstein distance estimation with the Sinkhorn divergence. In Proc. NeurIPS'20, 2020.

[11] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Adv. in Neural Information Processing Systems, pages 2292–2300, 2013.

[12] Marco Cuturi and David Avis. Ground metric learning. The Journal of Machine Learning Research, 15(1):533–564, 2014.

[13] Jason V Davis and Inderjit S Dhillon. Structured metric learning for high dimensional problems. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 195–203, 2008.

[14] Arnaud Dupuy, Alfred Galichon, and Yifei Sun. Estimating matching affinity matrices under low-rank constraints. Information and Inference: A Journal of the IMA, 8(4):677–689, 2019.

[15] Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, and Gabriel Peyré. Interpolating between optimal transport and MMD using Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2681–2690, 2019.

[16] Rémi Flamary and Nicolas Courty. POT: Python Optimal Transport library, 2017.

[17] Rémi Flamary, Marco Cuturi, Nicolas Courty, and Alain Rakotomamonjy. Wasserstein discriminant analysis.
Machine Learning, 107(12):1923–1945, 2018.

[18] Alfred Galichon and Bernard Salanié. Cupid's invisible hand: Social surplus and identification in matching models. Available at SSRN, 2020.

[19] A. Genevay, G. Peyré, and M. Cuturi. Learning generative models with Sinkhorn divergences. In Proc. AISTATS'18, pages 1608–1617, 2018.

[20] Aude Genevay, Lénaïc Chizat, Francis Bach, Marco Cuturi, and Gabriel Peyré. Sample complexity of Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 1574–1583. PMLR, 2019.

[21] Alison L Gibbs and Francis Edward Su. On choosing and bounding probability metrics. International Statistical Review, 70(3):419–435, 2002.

[22] Alexandre Gramfort, Gabriel Peyré, and Marco Cuturi. Fast optimal transport averaging of neuroimaging data. In International Conference on Information Processing in Medical Imaging, 2015.