Acta Geodaetica et Cartographica Sinica ›› 2019, Vol. 48 ›› Issue (6): 718-726. doi: 10.11947/j.AGCS.2019.20170740

• Photogrammetry and Remote Sensing •

A remote sensing image semantic segmentation method combining deformable convolution with conditional random fields

ZUO Zongcheng1,2, ZHANG Wen1, ZHANG Dongying3

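  1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, Hubei, China;
    2. Autodesk (China) Software Research and Development Co., Ltd., Shanghai 200122, China;
    3. College of Water Conservancy & Environmental Engineering, Zhengzhou University, Zhengzhou 450002, Henan, China
  • Received: 2017-12-23 Revised: 2018-08-01 Online: 2019-06-20 Published: 2019-07-09
  • Corresponding author: ZHANG Wen E-mail: wen_zhang@whu.edu.cn
  • About the author: ZUO Zongcheng (1988-), male, master's degree, engineer. His research interests include high-resolution remote sensing image processing and information extraction, pattern recognition, and machine learning. E-mail: jason.zuo@autodesk.com
  • Supported by:
    The National Key Research and Development Program of China (No. 2017YFC0405806)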

A remote sensing image semantic segmentation method by combining deformable convolution with conditional random fields

ZUO Zongcheng1,2, ZHANG Wen1, ZHANG Dongying3   

  1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China;
    2. Autodesk(China) Software Research and Development Co. Ltd., Shanghai 200122, China;
    3. College of Water Conservancy & Environmental Engineering, Zhengzhou University, Zhengzhou 450002, China
  • Received: 2017-12-23 Revised: 2018-08-01 Online: 2019-06-20 Published: 2019-07-09
  • Supported by:
    The National Key Research and Development Program of China (No.2017YFC0405806)

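Abstract: Deep convolutional neural networks have recently made considerable progress in the semantic segmentation of remote sensing images. Because the geometry of the convolution kernel is fixed, a standard convolutional neural network has limited capacity to model geometric transformations. This paper introduces a deformable convolution to improve the network's adaptability to spatial transformations. In addition, the pooling layers in the network architecture prevent the output layer from segmenting local objects accurately. To overcome this, the coarse segmentation predicted by the output layer is processed by a fully connected conditional random field, which improves the segmentation of image details. The proposed method can readily be trained end to end with the standard backpropagation algorithm. Experiments on the ISPRS dataset show that the method effectively overcomes the influence of the complex structure of objects in remote sensing images on the segmentation results, and that it achieves the best semantic segmentation results reported on this dataset at the time of writing.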

Keywords: high-resolution remote sensing image, semantic segmentation, deformable convolutional network, conditional random field
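
As an illustration of the deformable convolution named in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single-channel image, a square kernel with odd side length, zero padding, and offsets that are already given; in a trained network the offsets would be predicted by an additional convolutional layer. The function names `bilinear_sample` and `deformable_conv2d` are hypothetical.

```python
import numpy as np


def bilinear_sample(image, y, x):
    """Sample `image` at a fractional position; outside the image counts as zero."""
    h, w = image.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    # Blend the four integer neighbours of (y, x) with bilinear weights.
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * image[yi, xi]
    return value


def deformable_conv2d(image, kernel, offsets):
    """Deformable 2D convolution (cross-correlation form, 'same' output size).

    image:   (H, W) array
    kernel:  (k, k) array, k odd
    offsets: (H, W, k*k, 2) array of (dy, dx) added to each kernel tap
             at each output location (assumed given in this sketch)
    """
    h, w = image.shape
    k = kernel.shape[0]
    r = k // 2
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for t, (ky, kx) in enumerate(np.ndindex(k, k)):
                dy, dx = offsets[i, j, t]
                # Regular grid position plus the per-tap offset.
                acc += kernel[ky, kx] * bilinear_sample(
                    image, i + ky - r + dy, j + kx - r + dx
                )
            out[i, j] = acc
    return out
```

With all offsets set to zero, the function reduces to a standard zero-padded convolution, which shows that the deformable variant only changes where the kernel samples the input, not how the samples are weighted.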

Abstract: Deep convolutional neural networks have made great progress in the field of semantic segmentation. Because the geometry of the convolution kernel is fixed, standard convolutional neural networks have a limited ability to model geometric transformations. A deformable convolution is therefore introduced to enhance the adaptability of convolutional networks to spatial transformations. In addition, because the network architecture uses pooling layers, deep convolutional neural networks cannot adequately segment local objects at the output layer. To overcome this shortcoming, the coarse segmentation predicted by the output layer is processed by fully connected conditional random fields to improve the segmentation of image details. The proposed method can easily be trained end to end using the standard backpropagation algorithm. Finally, the proposed method is tested on the ISPRS dataset. The results show that it effectively overcomes the influence of the complex structure of the segmented objects and obtains state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.

Key words: high-resolution remote sensing image, semantic segmentation, deformable convolutional network, conditional random fields
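
The refinement step described in the abstract, in which a fully connected conditional random field sharpens the coarse network output, can be sketched with mean-field inference. The version below is a toy NumPy sketch under stated assumptions, not the paper's method: it uses the commonly used pairwise form (an appearance kernel on position and intensity plus a smoothness kernel on position, with a Potts label compatibility), a grayscale image, and a dense N×N kernel matrix that is only practical for very small images. The function name `dense_crf_mean_field` and all parameter values are hypothetical; the abstract does not specify them.

```python
import numpy as np


def dense_crf_mean_field(probs, image, n_iters=5, w_app=3.0, w_smooth=1.0,
                         theta_pos=3.0, theta_int=0.1, theta_smooth=1.0):
    """Refine per-pixel class probabilities with a fully connected CRF.

    probs: (H, W, C) softmax output of a segmentation network
    image: (H, W) grayscale intensities in [0, 1]
    Returns refined (H, W, C) marginals. All weights and bandwidths are
    illustrative assumptions.
    """
    h, w, c = probs.shape
    n = h * w
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    inten = image.reshape(n, 1).astype(float)                       # (N, 1)

    # Squared distances between every pair of pixels.
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    d_int = ((inten[:, None, :] - inten[None, :, :]) ** 2).sum(-1)

    # Appearance kernel (nearby pixels with similar intensity) plus
    # smoothness kernel (nearby pixels regardless of intensity).
    kernel = (
        w_app * np.exp(-d_pos / (2 * theta_pos ** 2) - d_int / (2 * theta_int ** 2))
        + w_smooth * np.exp(-d_pos / (2 * theta_smooth ** 2))
    )
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    unary = -np.log(np.clip(probs.reshape(n, c), 1e-8, 1.0))
    q = probs.reshape(n, c).copy()
    for _ in range(n_iters):
        message = kernel @ q  # (N, C) support for each label from all other pixels
        # Potts model: the cost of label l is the support given to all other labels.
        pairwise = message.sum(axis=1, keepdims=True) - message
        logits = -unary - pairwise
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)
    return q.reshape(h, w, c)
```

On a small two-region test image, a single pixel that the coarse prediction labels incorrectly is pulled back to the label of its similar-looking neighbours, which is the kind of detail correction the abstract attributes to the CRF stage.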

CLC number: