Interpretable Low-Rank Document Representations with Label-Dependent Sparsity Patterns

Ivek, Ivan


arXiv:1407.6872 (cs)

[Submitted on 25 Jul 2014]


Abstract: In document classification, where the labels of the documents in a corpus are readily known, label information can be used to learn document representation spaces with better discriminative properties. To this end, this paper proposes applying Variational Bayesian Supervised Nonnegative Matrix Factorization (supervised vbNMF) with a label-driven sparsity structure of coefficients to learn discriminative, nonsubtractive latent semantic components occurring in TF-IDF document representations. The constraints make each component occur frequently in only a small set of labels, which yields document representations with distinctive, label-specific sparse activation patterns. A simple measure of the quality of this kind of sparsity structure, dubbed inter-label sparsity, is introduced and shown experimentally to be closely connected with classification performance. As a practical convenience, inter-label sparsity is shown to be easily controlled in supervised vbNMF by a single parameter.
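To make the idea of label-dependent sparsity patterns concrete, here is a minimal sketch in Python. It is not the paper's supervised vbNMF: it swaps in plain nonnegative matrix factorization with multiplicative updates (Frobenius loss) and a hard, hypothetical ownership rule in which each component may be active only in documents of one label. The function names `masked_nmf` and `label_activation_sparsity`, the parameter `comps_per_label`, and the sparsity proxy itself are all assumptions made for illustration; the abstract does not define inter-label sparsity, so the proxy below should not be read as the paper's measure.

```python
import numpy as np


def masked_nmf(V, labels, comps_per_label=2, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF (V ~ W @ H) with a label-dependent zero pattern on H.

    V      : (n_terms, n_docs) nonnegative matrix, e.g. TF-IDF weights.
    labels : (n_docs,) integer label per document.
    Assumption: each component is "owned" by exactly one label and is
    forced to zero in documents of every other label.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    owner = np.repeat(classes, comps_per_label)  # label owning each component
    mask = (owner[:, None] == labels[None, :]).astype(float)

    W = rng.random((V.shape[0], owner.size)) + eps
    H = rng.random((owner.size, V.shape[1])) * mask  # zeros outside the mask

    for _ in range(n_iter):
        # Multiplicative updates keep entries nonnegative, and an entry of H
        # that starts at zero stays at zero, so the mask is preserved.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H, owner


def label_activation_sparsity(H, labels, tol=1e-8):
    """Hypothetical proxy: fraction of (component, label) pairs whose mean
    activation is (near) zero. Higher means components are more label-specific."""
    classes = np.unique(labels)
    mean_act = np.stack(
        [H[:, labels == c].mean(axis=1) for c in classes], axis=1
    )
    return float((mean_act <= tol).mean())


# Toy data standing in for a TF-IDF matrix: 30 terms, 12 documents, 3 labels.
rng = np.random.default_rng(1)
V = rng.random((30, 12))
labels = np.array([0] * 4 + [1] * 4 + [2] * 4)

W, H, owner = masked_nmf(V, labels)
print("sparsity proxy:", label_activation_sparsity(H, labels))
```

The hard mask is the bluntest possible constraint. The abstract instead describes a sparsity structure whose strength is controlled by a single parameter, which this sketch does not attempt to reproduce.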

Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as: arXiv:1407.6872 [cs.CL]
(or arXiv:1407.6872v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.1407.6872

Submission history

From: Ivan Ivek
[v1] Fri, 25 Jul 2014 12:46:18 UTC (171 KB)
