We present an end-to-end network that bridges the gap between the training and inference pipelines for panoptic segmentation. In contrast to recent works, our network exploits a parametrised yet lightweight panoptic segmentation submodule, powered by an end-to-end learnt dense instance affinity, to capture the probability that any pair of pixels belongs to the same instance. Reaping the benefits of end-to-end training, our system sets new records on the Cityscapes and COCO datasets, achieving 61.4 PQ and 43.4 PQ on the respective validation sets with just a ResNet-50 backbone.
In CVPR, 2020