Harvey Mudd College at SemEval-2019 Task 4: The Clint Buchanan Hyperpartisan News Detector

Drissi, Mehdi; Sandoval, Pedro; Ojha, Vivaswat; Medero, Julie

doi:10.18653/v1/S19-2165

Computer Science > Computation and Language

arXiv:1905.01962 (cs)

[Submitted on 10 Apr 2019]



Abstract: We investigate the recently developed Bidirectional Encoder Representations from Transformers (BERT) model for the hyperpartisan news detection task. Using a subset of hand-labeled articles from SemEval as a validation set, we test the performance of different parameters for BERT models. We find that accuracy from two different BERT models using different proportions of the articles is consistently high, with our best-performing model on the validation set achieving 85% accuracy and the best-performing model on the test set achieving 77%. We further determine that our model exhibits strong consistency, labeling independent slices of the same article identically. Finally, we find that randomizing the order of word pieces dramatically reduces validation accuracy (to approximately 60%), but that shuffling groups of four or more word pieces maintains an accuracy of about 80%, indicating the model mainly gains value from local context.
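
The shuffling experiment in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `shuffle_in_groups`, its parameters, and the sample word pieces are hypothetical, and it assumes that "shuffling groups" means cutting the word-piece sequence into consecutive chunks of a fixed size, permuting the chunks, and leaving the order inside each chunk untouched. Under that assumption, a group size of 1 corresponds to fully randomizing the word-piece order.

```python
import random
from typing import List, Optional, Sequence


def shuffle_in_groups(pieces: Sequence[str], group_size: int,
                      seed: Optional[int] = None) -> List[str]:
    """Permute consecutive chunks of `group_size` word pieces.

    Hypothetical helper: the order inside each chunk is preserved, so
    local context up to `group_size` pieces survives the shuffle.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    rng = random.Random(seed)
    # Cut the sequence into consecutive chunks; the last one may be shorter.
    groups = [list(pieces[i:i + group_size])
              for i in range(0, len(pieces), group_size)]
    # Shuffle the chunks themselves, not the pieces within them.
    rng.shuffle(groups)
    return [piece for group in groups for piece in group]


if __name__ == "__main__":
    # Illustrative word pieces only, not taken from the SemEval data.
    pieces = ["the", "sen", "##ator", "said", "the", "bill", "would", "fail"]
    print(shuffle_in_groups(pieces, group_size=1, seed=0))  # piece-level shuffle
    print(shuffle_in_groups(pieces, group_size=4, seed=0))  # chunks of four
```

Feeding the shuffled sequence to a classifier in place of the original is what would let one compare accuracy across group sizes, as the abstract reports.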

Comments:	Submitted to The 13th International Workshop on Semantic Evaluation (SemEval 2019). 5 pages including references
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:1905.01962 [cs.CL]
	(or arXiv:1905.01962v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1905.01962
Related DOI:	https://doi.org/10.18653/v1/S19-2165

Submission history

From: Pedro Sandoval Segura [view email]
[v1] Wed, 10 Apr 2019 17:43:51 UTC (152 KB)
