Semi-supervised Feature Learning For Improving Writer Identification

Chen, Shiming; Wang, Yisong; Lin, Chin-Teng; Ding, Weiping; Cao, Zehong

doi:10.1016/j.ins.2019.01.024

arXiv:1807.05490 (cs)

[Submitted on 15 Jul 2018 (v1), last revised 6 Oct 2018 (this version, v3)]

Abstract: Supervised learning approaches to offline writer identification commonly rely on data augmentation, but such approaches require extra training data and can lead to overfitting. In this study, we propose a semi-supervised feature learning pipeline that improves writer identification performance by training on extra unlabeled data and the original labeled data simultaneously. Specifically, we propose a weighted label smoothing regularization (WLSR) method for data augmentation, which assigns a weighted uniform label distribution to the extra unlabeled data. The WLSR method regularizes the convolutional neural network (CNN) baseline so that it learns more discriminative features to represent the properties of different writing styles. Experimental results on well-known benchmark datasets (ICDAR2013 and CVL) show that the proposed semi-supervised feature learning approach significantly improves on the baseline and performs competitively with existing writer identification approaches. Our findings provide new insights into offline writer identification.
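
The idea of giving unlabeled samples a weighted uniform label distribution can be illustrated with a small loss computation. The following is a minimal sketch, not the authors' implementation: the abstract does not state the exact WLSR formula, so the scalar `weight`, the function names, and the way the two loss terms are averaged are all assumptions made for illustration. Labeled samples use ordinary cross-entropy against a one-hot target; unlabeled samples use cross-entropy against a uniform target over all K writers, scaled by `weight`.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def labeled_loss(logits, label):
    """Standard cross-entropy against a one-hot target."""
    return -log_softmax(logits)[label]


def unlabeled_loss(logits, weight=1.0):
    """Cross-entropy against a uniform target over all K classes.

    `weight` is a hypothetical scaling factor standing in for the
    weighting in WLSR; the paper's actual weighting may differ.
    """
    log_p = log_softmax(logits)
    k = len(logits)
    return -weight * sum(log_p) / k


def batch_loss(samples, weight=1.0):
    """Average loss over a mixed batch.

    `samples` is a list of (logits, label) pairs; label is None
    for an unlabeled sample.
    """
    total = 0.0
    for logits, label in samples:
        if label is None:
            total += unlabeled_loss(logits, weight)
        else:
            total += labeled_loss(logits, label)
    return total / len(samples)
```

Under these assumptions, an unlabeled sample contributes the least loss when the network predicts a flat distribution over writers, so the term penalizes overconfident predictions on unlabeled data, which is the regularizing effect the abstract describes.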

Comments:	This manuscript has been submitted to Information Sciences
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1807.05490 [cs.LG]
	(or arXiv:1807.05490v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1807.05490
Journal reference:	Information Sciences (Volume 482, May 2019, Pages 156-170)
Related DOI:	https://doi.org/10.1016/j.ins.2019.01.024

Submission history

From: Zehong Cao Dr.
[v1] Sun, 15 Jul 2018 05:18:20 UTC (526 KB)
[v2] Wed, 8 Aug 2018 02:08:15 UTC (526 KB)
[v3] Sat, 6 Oct 2018 15:06:38 UTC (6,473 KB)
