Correcting Nuisance Variation using Wasserstein Distance

Tabak, Gil; Fan, Minjie; Yang, Samuel J.; Hoyer, Stephan; Davis, Geoff

Abstract: Profiling cellular phenotypes from microscopic imaging can provide meaningful biological information resulting from various factors affecting the cells. One motivating application is drug development: morphological cell features can be captured from images and used to quantify similarities between different drugs applied at different dosages. The general approach is to find a function mapping the images to an embedding space of manageable dimensionality whose geometry captures relevant features of the input images. An important known issue for such methods is separating relevant biological signal from nuisance variation. For example, the embedding vectors tend to be more correlated for cells that were cultured and imaged during the same week than for cells from a different week, despite identical drug compounds being applied in both cases. In this case, the particular batch in which a set of experiments was conducted constitutes the domain of the data; an ideal set of image embeddings should contain only the relevant biological information (e.g. drug effects). We develop a method for adjusting the image embeddings in order to "forget" domain-specific information while preserving relevant biological information. To do this, we minimize a loss function based on the Wasserstein distance. We find that, for our transformed embeddings, (1) the underlying geometric structure is preserved and (2) less domain-specific information is present.
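To make the idea of a Wasserstein-based batch discrepancy concrete, the following is a minimal sketch and not the authors' method: the abstract does not specify the loss or the family of transformations. It assumes equal-size batches, measures the empirical 1-D Wasserstein-1 distance independently in each embedding dimension, and uses a simple per-dimension mean shift as a stand-in for a learned adjustment. The names `wasserstein_1d`, `mean_per_dim_distance`, `shift_to_match_mean`, `week1`, and `week2` are hypothetical, and the data are synthetic.

```python
import random


def wasserstein_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.

    For equal-size samples this is the mean absolute difference between
    the sorted values (the optimal coupling pairs order statistics).
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def column(batch, d):
    """Extract embedding dimension d from a list of embedding vectors."""
    return [row[d] for row in batch]


def mean_per_dim_distance(batch_a, batch_b):
    """Average the 1-D distance over embedding dimensions (an assumption,
    not the multivariate Wasserstein distance)."""
    dims = len(batch_a[0])
    total = sum(
        wasserstein_1d(column(batch_a, d), column(batch_b, d))
        for d in range(dims)
    )
    return total / dims


def shift_to_match_mean(batch, reference):
    """Illustrative adjustment: translate each dimension of `batch` so its
    mean matches `reference`. A translation preserves pairwise distances
    within the batch."""
    dims = len(batch[0])
    offsets = [
        sum(column(reference, d)) / len(reference)
        - sum(column(batch, d)) / len(batch)
        for d in range(dims)
    ]
    return [[v + o for v, o in zip(row, offsets)] for row in batch]


# Synthetic embeddings: same "biology", but week2 carries a batch offset.
rng = random.Random(0)
week1 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
week2 = [[rng.gauss(0.0, 1.0) + 0.8 for _ in range(3)] for _ in range(200)]

before = mean_per_dim_distance(week1, week2)
after = mean_per_dim_distance(week1, shift_to_match_mean(week2, week1))
print(f"distance before adjustment: {before:.3f}")
print(f"distance after adjustment:  {after:.3f}")
```

In this toy setting the batch discrepancy drops after the shift while the geometry within each batch is unchanged, which mirrors the two properties the abstract reports for the transformed embeddings.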

Comments:	11 pages, 5 figures
Subjects:	Machine Learning (stat.ML)
Cite as:	arXiv:1711.00882 [stat.ML]
	(or arXiv:1711.00882v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1711.00882
