ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition

Riaz, Haris; Dumitru, Razvan-Gabriel; Surdeanu, Mihai

arXiv:2403.17385 (cs)

[Submitted on 26 Mar 2024]

Abstract: In this work, we revisit the problem of semi-supervised named entity recognition (NER) focusing on extremely light supervision, consisting of a lexicon containing only 10 examples per class. We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules. These rules include insights such as "One Sense Per Discourse", using a Masked Language Model as an unsupervised NER, leveraging part-of-speech tags to identify and eliminate unlabeled entities as false negatives, and other intuitions about classifier confidence scores in local and global context. ELLEN achieves very strong performance on the CoNLL-2003 dataset when using the minimal supervision from the lexicon above. It also outperforms most existing (and considerably more complex) semi-supervised NER methods under the same supervision settings commonly used in the literature (i.e., 5% of the training data). Further, we evaluate our CoNLL-2003 model in a zero-shot scenario on WNUT-17 where we find that it outperforms GPT-3.5 and achieves comparable performance to GPT-4. In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data. Our code is available at: this https URL.
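
As an illustration of the "One Sense Per Discourse" idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that each document arrives as a list of (mention, label) pairs, where the label may be missing, and that every mention of the same string within one document should receive the label most often assigned to it. The function name `one_sense_per_discourse` and the input format are hypothetical.

```python
from collections import Counter, defaultdict


def one_sense_per_discourse(mentions):
    """Propagate the majority label of each entity string within one document.

    mentions: list of (surface_string, label_or_None) pairs for a single
    document (a hypothetical input format chosen for this sketch).
    Returns a new list in which all mentions of the same string
    (case-insensitive) share the most frequent non-missing label.
    """
    # Count how often each label was assigned to each surface string.
    votes = defaultdict(Counter)
    for text, label in mentions:
        if label is not None:
            votes[text.lower()][label] += 1

    resolved = []
    for text, label in mentions:
        counts = votes.get(text.lower())
        if counts:
            # Overwrite with the majority label for this string.
            label = counts.most_common(1)[0][0]
        resolved.append((text, label))
    return resolved


doc = [
    ("Arizona", "LOC"),
    ("Arizona", None),
    ("Arizona", "ORG"),
    ("Arizona", "LOC"),
    ("Tucson", None),
]
print(one_sense_per_discourse(doc))
```

In this sketch, mentions that never received a label anywhere in the document stay unlabeled, and ties between labels are not handled specially.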

Comments:	Accepted to LREC-COLING 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.17385 [cs.CL]
	(or arXiv:2403.17385v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.17385

Submission history

From: Haris Riaz
[v1] Tue, 26 Mar 2024 05:11:51 UTC (1,572 KB)
