Contrastive learning unifies t-SNE and UMAP


Bibliographic Details
Main Authors: Damrich, Sebastian (Author), Böhm, Jan Niklas (Author), Hamprecht, Fred (Author), Kobak, Dmitry (Author)
Format: Article (Journal)
Language: English
Published: 3 Jun 2022
In: arXiv
Year: 2022, Pages: 1-29
DOI: 10.48550/arXiv.2206.01816
Online Access: Publisher, license required, full text: https://doi.org/10.48550/arXiv.2206.01816
Publisher, license required, full text: http://arxiv.org/abs/2206.01816
Author Notes: Sebastian Damrich, Jan Niklas Böhm, Fred A. Hamprecht, Dmitry Kobak
Description
Summary: Neighbor embedding methods $t$-SNE and UMAP are the de facto standard for visualizing high-dimensional datasets. They appear to use very different loss functions with different motivations, and the exact relationship between them has been unclear. Here we show that UMAP is effectively negative sampling applied to the $t$-SNE loss function. We explain the difference between negative sampling and noise-contrastive estimation (NCE), which has been used to optimize $t$-SNE under the name NCVis. We prove that, unlike NCE, negative sampling learns a scaled data distribution. When applied in the neighbor embedding setting, it yields more compact embeddings with increased attraction, explaining differences in appearance between UMAP and $t$-SNE. Further, we generalize the notion of negative sampling and obtain a spectrum of embeddings, encompassing visualizations similar to $t$-SNE, NCVis, and UMAP. Finally, we explore the connection between representation learning in the SimCLR setting and neighbor embeddings, and show that (i) $t$-SNE can be optimized using the InfoNCE loss and in a parametric setting; (ii) various contrastive losses with only a few noise samples can yield competitive performance in the SimCLR setup.
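For orientation, the three contrastive losses named in the summary can be sketched in their standard per-edge forms (a sketch in our own notation, not taken from the paper: $q_{ij} = 1/(1+\lVert y_i - y_j\rVert^2)$ is the Cauchy similarity between embedded points $y_i$ and $y_j$, the pair $ij$ is a positive neighbor pair, $k$ indexes $m$ noise samples, and $\xi$ is the noise density, often uniform over pairs and hence constant):

\begin{align*}
\mathcal{L}_{\mathrm{NEG}} &= -\log q_{ij} \;-\; \sum_{k=1}^{m} \log\bigl(1 - q_{ik}\bigr) && \text{(negative sampling, as in UMAP)}\\
\mathcal{L}_{\mathrm{NCE}} &= -\log \frac{q_{ij}}{q_{ij} + m\,\xi} \;-\; \sum_{k=1}^{m} \log \frac{m\,\xi}{q_{ik} + m\,\xi} && \text{(NCE, as in NCVis)}\\
\mathcal{L}_{\mathrm{InfoNCE}} &= -\log \frac{q_{ij}}{q_{ij} + \sum_{k=1}^{m} q_{ik}} && \text{(InfoNCE, as in SimCLR)}
\end{align*}

Per the summary, the paper's central result relates the first two forms: UMAP amounts to the first loss, i.e. negative sampling applied to the $t$-SNE loss, while NCVis optimizes $t$-SNE via the second; the third underlies the SimCLR setting, in which $t$-SNE can also be optimized.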
Item Description: Viewed on 18 Oct 2022
Physical Description: Online Resource