Computer Science > Computer Vision and Pattern Recognition
[Submitted on 10 Mar 2016 (v1), last revised 2 May 2017 (this version, v3)]
Title: Exploring Context with Deep Structured models for Semantic Segmentation
Abstract: State-of-the-art semantic image segmentation methods are mostly based on training deep convolutional neural networks (CNNs). In this work, we propose to improve semantic segmentation with the use of contextual information. In particular, we explore "patch-patch" context and "patch-background" context in deep CNNs. We formulate deep structured models by combining CNNs and Conditional Random Fields (CRFs) for learning the patch-patch context between image regions. Specifically, we formulate CNN-based pairwise potential functions to capture semantic correlations between neighboring patches. Efficient piecewise training of the proposed deep structured model is then applied in order to avoid repeated expensive CRF inference during the course of back propagation. For capturing the patch-background context, we show that a network design with traditional multi-scale image inputs and sliding pyramid pooling is very effective for improving performance. We perform a comprehensive evaluation of the proposed method. We achieve new state-of-the-art performance on a number of challenging semantic segmentation datasets, including NYUDv2, PASCAL-VOC2012, Cityscapes, PASCAL-Context, SUN-RGBD, SIFT-flow, and KITTI. In particular, we report an intersection-over-union score of 77.8 on the PASCAL-VOC2012 dataset.
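The abstract's point about piecewise training can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the exact objective, so the code assumes the generic piecewise formulation in which each unary and each pairwise potential gets its own independent softmax likelihood. That assumption is what removes the need for CRF inference (and the global partition function) during training. The function names `neg_log_softmax` and `piecewise_loss`, the energy convention (lower energy means higher probability), and the toy data layout are all hypothetical; in the paper's setting the energies would be produced by CNNs.

```python
import math


def neg_log_softmax(energies, target):
    """Return -log P(target), assuming P(k) is proportional to exp(-energies[k])."""
    # Log-sum-exp over negated energies, stabilised by subtracting the maximum.
    neg = [-e for e in energies]
    m = max(neg)
    log_z = m + math.log(sum(math.exp(v - m) for v in neg))
    return log_z - neg[target]


def piecewise_loss(unary, pairwise, labels, edges):
    """Piecewise negative log-likelihood (illustrative assumption, not the paper's code).

    unary[p][k]          : energy of label k at node (patch) p
    pairwise[(p, q)][k][l]: energy of the label pair (k, l) on edge (p, q)
    labels[p]            : ground-truth label of node p
    edges                : list of (p, q) node pairs
    """
    loss = 0.0
    # Each unary potential is normalised locally over the K labels.
    for p, energies in enumerate(unary):
        loss += neg_log_softmax(energies, labels[p])
    # Each pairwise potential is normalised locally over the K*K label pairs,
    # so no global partition function and no CRF inference is required.
    for (p, q) in edges:
        table = pairwise[(p, q)]
        num_cols = len(table[0])
        flat = [e for row in table for e in row]  # row-major flattening
        loss += neg_log_softmax(flat, labels[p] * num_cols + labels[q])
    return loss


# Toy example: two patches, three classes, one edge, all energies zero.
unary = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pairwise = {(0, 1): [[0.0] * 3 for _ in range(3)]}
loss = piecewise_loss(unary, pairwise, labels=[0, 2], edges=[(0, 1)])
```

With uniform energies, each unary term contributes log 3 and the pairwise term contributes log 9, so the loss here equals 4 log 3. Because every term is an ordinary softmax cross-entropy, gradients can be back-propagated through each potential independently.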
Submission history
From: Chunhua Shen
[v1] Thu, 10 Mar 2016 08:34:19 UTC (8,561 KB)
[v2] Sat, 26 Mar 2016 12:24:30 UTC (8,561 KB)
[v3] Tue, 2 May 2017 08:06:42 UTC (5,145 KB)