[English Paper] RPFusionNet: An Efficient Semantic Segmentation Method for Large-Scale Remote Sensing Images via Parallel Region–Patch Fusion

Date: March 11, 2026

Authors: 庞世燕, 曾维民, 石业鹏, 左志奇, 肖克江*, 吴雨俊

Journal: Remote Sensing

Publication year: 2025

Abstract:

Mainstream deep learning segmentation models are designed for small-sized images, and when applied to high-resolution remote sensing images, the limited information contained in small-sized images greatly restricts a model's ability to capture complex contextual information at a global scale. To mitigate this challenge, we present RPFusionNet, a novel parallel semantic segmentation framework that is specifically designed to efficiently integrate both local and global features. RPFusionNet leverages two distinct feature representations: REGION (representing large areas) and PATCH (representing smaller regions). This framework comprises two parallel branches: the REGION branch initially downsamples the entire image, then extracts features via a convolutional neural network (CNN)-based encoder, and subsequently captures multi-level information using pooling kernels of varying sizes. This design enables the model to adapt effectively to objects of different scales. In contrast, the PATCH branch utilizes a pixel-level feature extractor to enrich the high-dimensional features of the local region, thereby enhancing the representation of fine-grained details. To model the semantic correlation between the two branches, we have developed the Region-Patch scale fusion module. This module ensures that the network can comprehend a wider range of image contexts while preserving local details, thus bridging the gap between regional and local information. Extensive experiments were conducted on three public datasets: WBDS, AIDS, and Vaihingen. Compared to other state-of-the-art methods, our network achieved the highest accuracy on all three datasets, with an IoU score of 92.08% on the WBDS dataset, 89.99% on the AIDS dataset, and 88.44% on the Vaihingen dataset.
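For readers who want a concrete picture of the two-branch design described above, the following is a minimal PyTorch sketch: a REGION branch that downsamples the whole scene, encodes it with a small CNN, and pools the features at several scales; a PATCH branch that extracts full-resolution local features; and a fusion module that blends the two. All module names, channel widths, and the specific fusion strategy here are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the region–patch idea from the abstract.
# Channel sizes, pooling scales, and the fusion rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RegionBranch(nn.Module):
    """Downsample the full image, encode it, then pool at several scales
    to capture multi-level (global) context."""
    def __init__(self, in_ch=3, ch=64, pool_sizes=(1, 2, 4, 8)):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.pool_sizes = pool_sizes
        self.reduce = nn.Conv2d(ch * len(pool_sizes), ch, 1)

    def forward(self, x_full):
        # Work on a downsampled copy of the whole scene.
        x = F.interpolate(x_full, scale_factor=0.25, mode="bilinear",
                          align_corners=False)
        feat = self.encoder(x)
        pooled = [F.interpolate(F.adaptive_avg_pool2d(feat, s),
                                size=feat.shape[-2:], mode="bilinear",
                                align_corners=False)
                  for s in self.pool_sizes]
        return self.reduce(torch.cat(pooled, dim=1))  # multi-scale region context


class PatchBranch(nn.Module):
    """Pixel-level feature extractor applied to a full-resolution patch."""
    def __init__(self, in_ch=3, ch=64):
        super().__init__()
        self.extract = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x_patch):
        return self.extract(x_patch)  # fine-grained local features


class RegionPatchFusion(nn.Module):
    """Fuse region-level context with patch-level detail: resize the region
    features to the patch grid and blend with a 1x1 convolution."""
    def __init__(self, ch=64, num_classes=2):
        super().__init__()
        self.fuse = nn.Conv2d(ch * 2, ch, 1)
        self.head = nn.Conv2d(ch, num_classes, 1)

    def forward(self, region_feat, patch_feat):
        region_up = F.interpolate(region_feat, size=patch_feat.shape[-2:],
                                  mode="bilinear", align_corners=False)
        fused = F.relu(self.fuse(torch.cat([region_up, patch_feat], dim=1)))
        return self.head(fused)


class RPFusionNetSketch(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.region = RegionBranch()
        self.patch = PatchBranch()
        self.fusion = RegionPatchFusion(num_classes=num_classes)

    def forward(self, x_full, x_patch):
        return self.fusion(self.region(x_full), self.patch(x_patch))


if __name__ == "__main__":
    net = RPFusionNetSketch(num_classes=2)
    scene = torch.randn(1, 3, 1024, 1024)   # large remote sensing tile
    patch = torch.randn(1, 3, 256, 256)     # local crop from the same tile
    print(net(scene, patch).shape)          # -> torch.Size([1, 2, 256, 256])
```

In this sketch the fusion simply upsamples the region features to the patch resolution and concatenates them; the paper's Region-Patch scale fusion module models the semantic correlation between the two branches more explicitly, so this should be read only as a structural illustration.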


