HiCL: Hierarchical Contrastive Learning of Unsupervised Sentence Embeddings

Wu, Zhuofeng; Xiao, Chaowei; Vydiswaran, VG Vinod

arXiv:2310.09720 (cs)

[Submitted on 15 Oct 2023]

Abstract: In this paper, we propose a hierarchical contrastive learning framework, HiCL, which considers local segment-level and global sequence-level relationships to improve training efficiency and effectiveness. Traditional methods typically encode a sequence in its entirety for contrast with others, often neglecting local representation learning, leading to challenges in generalizing to shorter texts. Conversely, HiCL improves its effectiveness by dividing the sequence into several segments and employing both local and global contrastive learning to model segment-level and sequence-level relationships. Further, considering the quadratic time complexity of transformers over input tokens, HiCL boosts training efficiency by first encoding short segments and then aggregating them to obtain the sequence representation. Extensive experiments show that HiCL enhances the prior top-performing SNCSE model across seven extensively evaluated STS tasks, with an average increase of +0.2% observed on BERT-large and +0.44% on RoBERTa-large.
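The efficiency argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The function names `split_into_segments` and `attention_cost`, the fixed segment length, and the use of the squared token count as a stand-in for self-attention cost are all assumptions made for illustration. Under that proxy, encoding k segments of length n/k costs about k * (n/k)^2 = n^2 / k, compared with n^2 for the full sequence.

```python
from typing import List, Sequence


def split_into_segments(tokens: Sequence[str], segment_len: int) -> List[List[str]]:
    """Split a token sequence into consecutive segments of at most segment_len tokens.

    Hypothetical helper: the abstract does not state how segments are chosen.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [list(tokens[i:i + segment_len]) for i in range(0, len(tokens), segment_len)]


def attention_cost(num_tokens: int) -> int:
    """Illustrative proxy for self-attention cost: quadratic in the token count."""
    return num_tokens * num_tokens


# Toy sequence of 32 placeholder tokens, split into segments of 8 tokens each.
tokens = [f"t{i}" for i in range(32)]
segments = split_into_segments(tokens, segment_len=8)

# Cost of encoding the whole sequence at once versus encoding each segment separately.
full_cost = attention_cost(len(tokens))
segmented_cost = sum(attention_cost(len(segment)) for segment in segments)

print(len(segments), full_cost, segmented_cost)
```

The sketch covers only the cost comparison. The aggregation of segment representations and the local and global contrastive objectives are described in the paper itself and are not reproduced here.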

Comments:	In Proceedings of Findings EMNLP 2023
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2310.09720 [cs.CL]
	(or arXiv:2310.09720v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.09720

Submission history

From: Zhuofeng Wu
[v1] Sun, 15 Oct 2023 03:14:33 UTC (1,207 KB)
