Enhancing Talent Employment Insights Through Feature Extraction with LLM Finetuning

Thakrar, Karishma; Young, Nick

arXiv:2501.07663 (cs)

[Submitted on 13 Jan 2025]

Abstract: This paper explores the application of large language models (LLMs) to extract nuanced and complex job features from unstructured job postings. Using a dataset of 1.2 million job postings provided by AdeptID, we developed a pipeline to identify and classify variables such as remote work availability, remuneration structures, educational requirements, and work experience preferences. Our methodology combines semantic chunking, retrieval-augmented generation (RAG), and fine-tuned DistilBERT models to overcome the limitations of traditional parsing tools. With these techniques, we achieved significant improvements in identifying variables that are often mislabeled or overlooked, such as non-salary-based compensation and inferred remote work categories. We present a comprehensive evaluation of our fine-tuned models and analyze their strengths, limitations, and potential for scaling. This work highlights the promise of LLMs in labor market analytics and provides a foundation for more accurate and actionable insights into job data.
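To make the chunk-then-retrieve idea in the abstract concrete, here is a minimal sketch and not the authors' pipeline. It assumes sentence-level chunking and a bag-of-words cosine score as a crude stand-in for semantic chunking and embedding-based retrieval; the function names, the sample posting, and the query are all hypothetical. The fine-tuned DistilBERT classification step is not shown.

```python
import math
import re
from collections import Counter


def chunk_posting(text, max_sentences=1):
    """Split a posting into chunks of at most `max_sentences` sentences.

    Assumption: fixed-size sentence chunks stand in for semantic chunking.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def bag_of_words(text):
    """Lowercased token counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(chunks, query, k=1):
    """Return the k chunks most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(chunks, key=lambda c: cosine(bag_of_words(c), q), reverse=True)
    return ranked[:k]


# Hypothetical posting text, invented for illustration only.
posting = (
    "We are hiring a data analyst. The role is fully remote within the US. "
    "Compensation includes commission and an annual bonus. "
    "A bachelor's degree is required."
)
chunks = chunk_posting(posting, max_sentences=1)
top = retrieve(chunks, "remote work availability", k=1)
print(top)
```

In a setup like the one the abstract describes, the retrieved chunk rather than the full posting would presumably be passed to a downstream classifier for a variable such as remote work availability.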

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2501.07663 [cs.CL]
	(or arXiv:2501.07663v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.07663

Submission history

From: Karishma Thakrar
[v1] Mon, 13 Jan 2025 19:49:49 UTC (97 KB)
