Cross-Modal Common Representation Learning with Triplet Loss Functions

Ott, Felix; Rügamer, David; Heublein, Lucas; Bischl, Bernd; Mutschler, Christopher

Computer Science > Machine Learning

arXiv:2202.07901v1 (cs)

[Submitted on 16 Feb 2022 (this version), latest version 3 Aug 2023 (v3)]

Abstract: Common representation learning (CRL) learns a shared embedding between two or more modalities so that a given task performs better than it would with only one of the modalities. CRL across different data types, such as images and time-series data (e.g., audio or text data), requires a deep metric learning loss that minimizes the distance between the modality embeddings. In this paper, we propose using the triplet loss, which uses positive and negative identities to create sample pairs with different labels, for CRL between image and time-series modalities. Adapting the triplet loss for CRL yields higher accuracy in the main (time-series classification) task by exploiting additional information from the auxiliary (image classification) task. Our experiments on synthetic data and on handwriting recognition data from sensor-enhanced pens show improved classification accuracy, faster convergence, and better generalizability.
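
To make the role of the triplet loss concrete, here is a minimal sketch of the generic triplet margin loss, max(0, d(a, p) - d(a, n) + margin), applied across two modalities. It is not the adapted loss proposed in the paper, whose details the abstract does not give. The function names, the Euclidean distance, the margin of 1.0, and the 2-D embedding values are all assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings (illustrative values, not from the paper).
ts_anchor = [0.0, 0.0]     # time-series embedding, label A
img_positive = [0.0, 1.0]  # image embedding, same label A
img_negative = [3.0, 4.0]  # image embedding, different label B

# Well separated: d_pos = 1, d_neg = 5, so 1 - 5 + 1 < 0 and the loss is 0.
loss_separated = triplet_loss(ts_anchor, img_positive, img_negative)

# Roles swapped: d_pos = 5, d_neg = 1, so the loss is 5 - 1 + 1 = 5.
loss_violating = triplet_loss(ts_anchor, img_negative, img_positive)

print(loss_separated, loss_violating)
```

In this sketch the anchor comes from the time-series modality and the positive and negative come from the image modality, so minimizing the loss pulls same-label embeddings from the two modalities together and pushes different-label embeddings apart. How the paper actually selects triplets and combines this term with the classification losses is not stated in the abstract.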

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
MSC classes:	68T30, 68T35
ACM classes:	I.2.4
Cite as:	arXiv:2202.07901 [cs.LG]
	(or arXiv:2202.07901v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2202.07901

Submission history

From: Felix Ott
[v1] Wed, 16 Feb 2022 07:09:04 UTC (20,457 KB)
[v2] Wed, 18 Jan 2023 08:00:01 UTC (20,418 KB)
[v3] Thu, 3 Aug 2023 11:36:06 UTC (21,919 KB)
