Hostile Counterspeech Drives Users From Hate Subreddits

Hickey, Daniel; Schmitz, Matheus; Fessler, Daniel M. T.; Smaldino, Paul E.; Lerman, Kristina; Murić, Goran; Burghardt, Keith

Abstract: Counterspeech -- speech that opposes hate speech -- has gained significant attention recently as a strategy to reduce hate on social media. While previous studies suggest that counterspeech can somewhat reduce hate speech, little is known about its effects on participation in online hate communities, or about which counterspeech tactics reduce harmful behavior. We begin to address these gaps by identifying 25 large hate communities ("subreddits") within Reddit and analyzing the effect of counterspeech on newcomers within these communities. We first construct a new public dataset of carefully annotated counterspeech and non-counterspeech comments within these subreddits. We use this dataset to train a state-of-the-art counterspeech detection model. Next, we use matching to evaluate the causal effects of hostile and non-hostile counterspeech on the engagement of newcomers in hate subreddits. We find that, while non-hostile counterspeech is ineffective at getting users to fully disengage from these hate subreddits, a single hostile counterspeech comment substantially reduces the future likelihood of engagement. While offering nuance to the understanding of counterspeech efficacy, these results a) leave unanswered the question of whether hostile counterspeech dissuades newcomers from participation in online hate writ large, or merely drives them into less-moderated and more extreme hate communities, and b) raise ethical considerations about hostile counterspeech, which is both comparatively common and might exacerbate rather than mitigate the net level of antagonism in society. These findings underscore the importance of future work to improve counterspeech tactics and minimize unintended harm.

Comments:	19 pages, 11 figures. arXiv admin note: text overlap with arXiv:2303.13641
Subjects:	Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2405.18374 [cs.CY]
	(or arXiv:2405.18374v1 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2405.18374
