Statistics > Machine Learning
[Submitted on 1 Oct 2013 (this version), latest version 4 Feb 2014 (v2)]
Title: Exact Block-Constant Rating Matrix Recovery from a Few Noisy Observations
Abstract: Recommender systems predict user preferences based on a small number of observed, possibly noisy ratings. To allow accurate predictions, one common assumption is that the rating matrix has low rank. This paper considers a more structured movie rating model in which (1) users and movies form clusters, (2) users from the same cluster give the same rating to movies in the same cluster, and (3) the ratings are either +1 or -1. The corresponding rating matrix is a block-constant matrix with binary entries, which is a special type of low-rank matrix.
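As a minimal sketch of the model just described, the snippet below builds a block-constant rating matrix in plain Python. The function name `block_constant_matrix`, the table `block_signs`, and the choice to order users and movies so that each cluster is contiguous are illustrative assumptions, not part of the paper.

```python
def block_constant_matrix(n, r, block_signs):
    """Build an n x n rating matrix from an r x r table of +1/-1 block ratings.

    Assumption for illustration: users and movies are ordered so that each
    cluster occupies a contiguous range of n // r indices.
    """
    if n % r != 0:
        raise ValueError("n must be divisible by r for equal-size clusters")
    size = n // r
    return [
        [block_signs[i // size][j // size] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical example: 6 users, 6 movies, 2 user clusters, 2 movie clusters.
block_signs = [[1, -1],
               [1, 1]]
ratings = block_constant_matrix(6, 2, block_signs)

# Users in the same cluster have identical rows, so there are at most r
# distinct rows and the rank of the matrix is at most r.
distinct_rows = {tuple(row) for row in ratings}
print(len(distinct_rows))
```

Because every row is a copy of one of at most $r$ cluster rows, the matrix has rank at most $r$, which is why the block-constant model is a special case of the low-rank assumption.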
Consider a system with $n$ users and $n$ movies and $r$ user clusters and $r$ movie clusters of equal sizes, and assume that we observe $m$ ratings. In the ideal case where the observations are noiseless, predicting the ratings reduces to clustering the users and movies, and we show that a simple algorithm based on finding the maximum clique succeeds as soon as $m=\Omega(n r^{1/2} \log^{1/2} n)$. This is fewer than the number of observations required if we only make a low-rank assumption. For the more general noisy setting, we propose a convex program to recover the rating matrix: among matrices with entries in the range $[-1,1]$, it maximizes a weighted sum of the correlation with observed ratings and the nuclear norm. This convex program is provably correct when $m=\Omega(nr^2)$, but we conjecture that $m=\Omega(nr \log n)$ is sufficient. Again, our block-constant and binary assumptions allow us to exactly recover the matrix with fewer observations and to tolerate a larger fraction of noisy entries. Additionally, our analysis is novel and considerably simpler than previous work on low-rank matrix completion.
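One plausible way to write the convex program described above is sketched below. The abstract does not give the formula, so the notation is an assumption: $\Omega$ denotes the set of observed (user, movie) pairs, $R_{ij}$ the observed rating, $\|X\|_*$ the nuclear norm, and $\lambda > 0$ a weight. The sketch further assumes that the nuclear norm enters with a negative weight, which keeps the maximization concave.

```latex
\[
\max_{X \in \mathbb{R}^{n \times n}} \;
  \sum_{(i,j) \in \Omega} R_{ij} X_{ij} \;-\; \lambda \, \|X\|_{*}
\qquad \text{subject to} \qquad
  -1 \le X_{ij} \le 1 \quad \text{for all } i, j .
\]
```

Under this reading, the first term rewards agreement with the observed ratings, the second term favors low-rank solutions, and the box constraint relaxes the binary requirement that entries be $+1$ or $-1$.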
Submission history
From: Rui Wu
[v1] Tue, 1 Oct 2013 22:46:06 UTC (17 KB)
[v2] Tue, 4 Feb 2014 21:59:39 UTC (45 KB)