An Approach for Weakly-Supervised Deep Information Retrieval

MacAvaney, Sean; Hui, Kai; Yates, Andrew

Computer Science > Information Retrieval

arXiv:1707.00189v1 (cs)

[Submitted on 1 Jul 2017 (this version), latest version 5 Jul 2019 (v3)]

Title: An Approach for Weakly-Supervised Deep Information Retrieval

Authors: Sean MacAvaney, Kai Hui, Andrew Yates


Abstract: Recent developments in neural information retrieval models have been promising, but a problem remains: human relevance judgments are expensive to produce, while neural models require a considerable amount of training data. In an attempt to fill this gap, we present an approach for generating weak supervision training data for use in a neural IR model. Specifically, we use a news corpus with article headlines acting as pseudo-queries and article content as pseudo-documents, and we propose a measure of interaction similarity to filter these pseudo-documents. Additionally, we employ techniques for addressing problems related to finding effective negative training examples and disregarding headlines that do not work well as queries. By using our approach to train state-of-the-art neural IR models and comparing to established baselines, we find that training data generated by our approach can lead to good results on a benchmark test collection.
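
The pairing idea in the abstract (headlines as pseudo-queries, article content as pseudo-documents) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `Article`, `Triple`, `term_overlap`, `build_triples`, and the `min_overlap` threshold are hypothetical, the term-overlap filter is a simple stand-in rather than the interaction similarity measure the paper proposes, and uniform random negative sampling stands in for the negative-example techniques the abstract mentions but does not specify.

```python
import random
from typing import List, NamedTuple


class Article(NamedTuple):
    headline: str  # used as the pseudo-query
    content: str   # used as the pseudo-document


class Triple(NamedTuple):
    query: str
    positive: str
    negative: str


def term_overlap(query: str, document: str) -> float:
    """Fraction of query terms that also occur in the document.

    Stand-in filter only; NOT the paper's interaction similarity measure.
    """
    q_terms = set(query.lower().split())
    d_terms = set(document.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & d_terms) / len(q_terms)


def build_triples(articles: List[Article],
                  min_overlap: float = 0.5,
                  seed: int = 0) -> List[Triple]:
    """Build (query, positive, negative) training triples from news articles."""
    rng = random.Random(seed)
    triples = []
    for i, article in enumerate(articles):
        # Skip headlines that share too few terms with their own article,
        # as a rough proxy for "headlines that do not work well as queries".
        if term_overlap(article.headline, article.content) < min_overlap:
            continue
        # Assumption: draw the negative uniformly from the other articles.
        others = [a for j, a in enumerate(articles) if j != i]
        if not others:
            continue
        negative = rng.choice(others)
        triples.append(Triple(article.headline, article.content, negative.content))
    return triples


articles = [
    Article("storm hits coast", "a storm hits the coast overnight"),
    Article("you won't believe this", "markets fell sharply on monday"),
    Article("markets fall", "markets fall as rates rise"),
]
triples = build_triples(articles)
```

In this toy corpus the second headline shares no terms with its article, so it is dropped and two triples remain. A pairwise ranking model could then be trained to score each `positive` above its `negative` for the same `query`.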

Subjects:	Information Retrieval (cs.IR); Computation and Language (cs.CL)
Cite as:	arXiv:1707.00189 [cs.IR]
	(or arXiv:1707.00189v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.1707.00189

Submission history

From: Andrew Yates
[v1] Sat, 1 Jul 2017 18:42:29 UTC (51 KB)
[v2] Mon, 24 Jul 2017 12:05:43 UTC (52 KB)
[v3] Fri, 5 Jul 2019 12:00:09 UTC (472 KB)
