We present a weakly supervised model that jointly performs both semantic- and instance-segmentation – a particularly relevant problem given the substantial cost of obtaining pixel-perfect annotation for these tasks. In contrast to many popular instance segmentation approaches based on object detectors, our method does not predict any overlapping instances. Moreover, we are able to segment both “thing” and “stuff” classes, and thus explain all the pixels in the image. “Thing” classes are weakly-supervised with bounding boxes, and “stuff” with image-level tags. We obtain state-of-the-art results on Pascal VOC, for both full and weak supervision (which achieves about 95% of fully-supervised performance). Furthermore, we present the first weakly-supervised results on Cityscapes for both semantic- and instance-segmentation. Finally, we use our weakly supervised framework to analyse the relationship between annotation quality and predictive performance, which is of interest to dataset creators.
Below are our panoptic segmentation results on Cityscapes, compared with other non-overlapping instance segmentation methods. For more details, please refer to the paper.
th.: thing
st.: stuff
all: mean over all thing and stuff classes
Note that, due to the file size limit set by BaiduYun, some of the larger files had to be split into several chunks before they could be uploaded. These files are named filename.zip.part##, where filename is the original file name excluding the extension and ## is a two-digit part index. After you have downloaded all the parts, cd to the folder where they are saved and use the following command to join them back together:
cat filename.zip.part* > filename.zip
The joining operation may take several minutes, depending on file size.
The above does not apply to files downloaded from Dropbox.
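If several split archives need to be joined, the cat command above can be wrapped in a small shell function. The following is only a sketch: the function name join_parts is hypothetical and not part of the release, and it assumes the parts follow the filename.zip.part## naming described above, so that the shell's sorted glob expansion lists them in the correct order.

```shell
#!/bin/sh
# join_parts NAME
# Concatenates NAME.zip.part00, NAME.zip.part01, ... into NAME.zip.
# Hypothetical helper for illustration; it is not shipped with the downloads.
join_parts() {
    name="$1"

    # Expand the glob into the function's positional parameters.
    # Pathname expansion is sorted, which matches the numeric order
    # because the part index is zero-padded to two digits.
    set -- "$name".zip.part*

    # If nothing matched, the pattern is left unexpanded and does not exist.
    if [ ! -e "$1" ]; then
        echo "no parts found for $name.zip" >&2
        return 1
    fi

    # Join the parts in order, exactly as the manual command does.
    cat "$@" > "$name.zip"
}

# Example (run inside the download folder):
#   join_parts filename
```

Failing early when no parts are found avoids silently creating an empty filename.zip. If the unzip tool is installed, running unzip -t filename.zip afterwards tests the archive's integrity, which can reveal a missing or incompletely downloaded part.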
* Equal first authorship