Portuguese Named Entity Recognition using BERT-CRF

Souza, Fábio; Nogueira, Rodrigo; Lotufo, Roberto

arXiv:1909.10649 (cs)

[Submitted on 23 Sep 2019 (v1), last revised 27 Feb 2020 (this version, v2)]

Abstract: Recent advances in language representation using neural networks have made it viable to transfer the learned internal states of a trained model to downstream natural language processing tasks, such as named entity recognition (NER) and question answering. It has been shown that leveraging pre-trained language models improves overall performance on many tasks and is highly beneficial when labeled data is scarce. In this work, we train Portuguese BERT models and apply a BERT-CRF architecture to the NER task in Portuguese, combining the transfer capabilities of BERT with the structured predictions of CRF. We explore feature-based and fine-tuning training strategies for the BERT model. Our fine-tuning approach obtains new state-of-the-art results on the HAREM I dataset, improving the F1-score by 1 point on the selective scenario (5 NE classes) and by 4 points on the total scenario (10 NE classes).
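
To illustrate what "structured predictions of CRF" means on top of per-token scores, here is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not the authors' implementation: the function name `viterbi_decode`, the three-tag set, and all scores are hypothetical, the emission scores merely stand in for the outputs of a BERT token classifier, and start/end transition scores are omitted for brevity.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at token t (assumed to come from an encoder).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []           # one list of best previous tags per later token
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag i for landing on tag j at token t.
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores, chosen only for illustration.
TAGS = ["O", "B-PER", "I-PER"]
FORBIDDEN = -1e4  # large penalty: "I-PER" may not directly follow "O"
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B-PER
    [0.0, 0.0, 0.0],        # from I-PER
]
emissions = [
    [2.0, 0.5, 0.0],  # token 0: prefers O
    [0.2, 1.0, 1.5],  # token 1: locally prefers I-PER
    [0.1, 0.2, 2.0],  # token 2: prefers I-PER
]

best_path = viterbi_decode(emissions, transitions)
decoded_tags = [TAGS[i] for i in best_path]
```

Taking the best tag for each token independently would give O, I-PER, I-PER here, which is not a valid IOB sequence. Decoding over the whole sequence with the transition penalty instead yields O, B-PER, I-PER, which is the kind of consistency a CRF layer is meant to add.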

Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as: arXiv:1909.10649 [cs.CL] (or arXiv:1909.10649v2 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.1909.10649

Submission history

From: Fábio Souza
[v1] Mon, 23 Sep 2019 23:21:42 UTC (88 KB)
[v2] Thu, 27 Feb 2020 15:06:49 UTC (314 KB)
