Code-Mixed Telugu-English Hate Speech Detection

Kakarla, Santhosh; Venkata, Gautama Shastry Bulusu

Abstract:Hate speech detection in low-resource languages like Telugu is a growing challenge in NLP. This study investigates transformer-based models, including TeluguHateBERT, HateBERT, DeBERTa, Muril, IndicBERT, Roberta, and Hindi-Abusive-MuRIL, for classifying hate speech in Telugu. We fine-tune these models using Low-Rank Adaptation (LoRA) to optimize efficiency and performance. Additionally, we explore a multilingual approach by translating Telugu text into English using Google Translate to assess its impact on classification accuracy.
Our experiments reveal that most models show improved performance after translation, with DeBERTa and Hindi-Abusive-MuRIL achieving higher accuracy and F1 scores compared to training directly on Telugu text. Notably, Hindi-Abusive-MuRIL outperforms all other models in both the original Telugu dataset and the translated dataset, demonstrating its robustness across different linguistic settings. This suggests that translation enables models to leverage richer linguistic features available in English, leading to improved classification performance. The results indicate that multilingual processing can be an effective approach for hate speech detection in low-resource languages. These findings demonstrate that transformer models, when fine-tuned appropriately, can significantly improve hate speech detection in Telugu, paving the way for more robust multilingual NLP applications.

Comments:	4 pages, 1 figure, 2 tables
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2502.10632 [cs.CL]
	(or arXiv:2502.10632v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.10632

Computer Science > Computation and Language

Title:Code-Mixed Telugu-English Hate Speech Detection

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators