Comparative Analysis of Encoder-Based NER and Large Language Models for Skill Extraction from Russian Job Vacancies

Matkin, Nikita; Smirnov, Aleksei; Usanin, Mikhail; Ivanov, Egor; Sobyanin, Kirill; Paklina, Sofiia; Parshakov, Petr

arXiv:2407.19816 (cs)

[Submitted on 29 Jul 2024 (v1), last revised 15 Sep 2024 (this version, v2)]

Abstract: The labor market is undergoing rapid changes, with increasing demands on job seekers and a surge in job openings. Identifying essential skills and competencies from job descriptions is challenging due to varying employer requirements and the omission of key skills. This study addresses these challenges by comparing traditional Named Entity Recognition (NER) methods based on encoders with Large Language Models (LLMs) for extracting skills from Russian job vacancies. Using a labeled dataset of 4,000 job vacancies for training and 1,472 for testing, the performance of both approaches is evaluated. Results indicate that traditional NER models, especially DeepPavlov RuBERT NER tuned, outperform LLMs across various metrics including accuracy, precision, recall, and inference time. The findings suggest that traditional NER models provide more effective and efficient solutions for skill extraction, enhancing job requirement clarity and aiding job seekers in aligning their qualifications with employer expectations. This research contributes to the field of natural language processing (NLP) and its application in the labor market, particularly in non-English contexts.
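As an illustration of the precision and recall metrics named in the abstract, the following is a minimal sketch of how extracted skills might be scored against gold labels. The abstract does not describe the paper's scoring protocol, so the exact-match, set-based, micro-averaged scheme here is an assumption, and the function name `skill_prf` and the toy data are hypothetical.

```python
from typing import Iterable, Set, Tuple


def skill_prf(
    gold: Iterable[Set[str]], pred: Iterable[Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-vacancy skill sets.

    Assumption: a predicted skill counts as correct only if the same
    string appears in the gold set for the same vacancy (exact match).
    """
    tp = fp = fn = 0
    for gold_skills, pred_skills in zip(gold, pred):
        tp += len(gold_skills & pred_skills)   # skills found and correct
        fp += len(pred_skills - gold_skills)   # skills predicted but not in gold
        fn += len(gold_skills - pred_skills)   # gold skills that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy example (hypothetical data, not from the paper): two vacancies.
gold = [{"python", "sql"}, {"excel"}]
pred = [{"python", "git"}, {"excel"}]
precision, recall, f1 = skill_prf(gold, pred)
```

The same function can score the output of either an encoder-based NER model or an LLM, provided both are first reduced to one set of skill strings per vacancy.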

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2407.19816 [cs.CL]
	(or arXiv:2407.19816v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.19816

Submission history

From: Petr Parshakov
[v1] Mon, 29 Jul 2024 09:08:40 UTC (278 KB)
[v2] Sun, 15 Sep 2024 05:02:37 UTC (8,118 KB)
