Predicting respondent difficulty in web surveys: A machine-learning approach based on mouse movement features

Fernández-Fontelo, Amanda; Kieslich, Pascal J.; Henninger, Felix; Kreuter, Frauke; Greven, Sonja

Abstract: A central goal of survey research is to collect robust and reliable data from respondents. However, despite researchers' best efforts in designing questionnaires, respondents may experience difficulty understanding questions' intent and therefore may struggle to respond appropriately. If it were possible to detect such difficulty, this knowledge could be used to inform real-time interventions through responsive questionnaire design, or to indicate and correct measurement error after the fact. Previous research in the context of web surveys has used paradata, specifically response times, to detect difficulties and to help improve user experience and data quality. However, richer data sources are now available, in the form of the movements respondents make with the mouse, as an additional and far more detailed indicator for the respondent-survey interaction. This paper uses machine learning techniques to explore the predictive value of mouse-tracking data with regard to respondents' difficulty. We use data from a survey on respondents' employment history and demographic information, in which we experimentally manipulate the difficulty of several questions. Using features derived from the cursor movements, we predict whether respondents answered the easy or difficult version of a question, using and comparing several state-of-the-art supervised learning methods. In addition, we develop a personalization method that adjusts for respondents' baseline mouse behavior and evaluate its performance. For all three manipulated survey questions, we find that including the full set of mouse movement features improved prediction performance over response-time-only models in nested cross-validation. Accounting for individual differences in mouse movements led to further improvements.

Comments:	40 pages, 2 Figures, 3 Tables
Subjects:	Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Applications (stat.AP)
Cite as:	arXiv:2011.06916 [cs.HC]
	(or arXiv:2011.06916v1 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2011.06916
