Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study

Xu, Xin; Chen, Xiang; Zhang, Ningyu; Xie, Xin; Chen, Xi; Chen, Huajun

arXiv:2210.10678 (cs)

[Submitted on 19 Oct 2022 (v1), last revised 18 Sep 2023 (this version, v3)]

Abstract: This paper presents an empirical study to build relation extraction systems in low-resource settings. Based upon recent pre-trained language models, we comprehensively investigate three schemes to evaluate the performance in low-resource settings: (i) different types of prompt-based methods with few-shot labeled data; (ii) diverse balancing methods to address the long-tailed distribution issue; (iii) data augmentation technologies and self-training to generate more labeled in-domain data. We create a benchmark with 8 relation extraction (RE) datasets covering different languages, domains and contexts and perform extensive comparisons over the proposed schemes with combinations. Our experiments illustrate: (i) Though prompt-based tuning is beneficial in low-resource RE, there is still much potential for improvement, especially in extracting relations from cross-sentence contexts with multiple relational triples; (ii) Balancing methods are not always helpful for RE with long-tailed distribution; (iii) Data augmentation complements existing baselines and can bring much performance gain, while self-training may not consistently achieve advancement to low-resource RE. Code and datasets are in this https URL.

Comments:	Accepted to EMNLP 2022 (Findings) and the project website is this https URL
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as:	arXiv:2210.10678 [cs.CL]
	(or arXiv:2210.10678v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.10678

Submission history

From: Ningyu Zhang
[v1] Wed, 19 Oct 2022 15:46:37 UTC (570 KB)
[v2] Thu, 3 Nov 2022 06:02:42 UTC (570 KB)
[v3] Mon, 18 Sep 2023 11:16:48 UTC (570 KB)
