Uncertainty-Aware Self-Supervised Cross-Modal SAR-Optical Matching Using EfficientDet and Xception
Abstract
Cross-modal matching of Synthetic Aperture Radar (SAR) and optical satellite imagery is challenging due to their distinct imaging characteristics. We propose a deep learning framework that integrates a dual-encoder architecture, self-supervised contrastive learning, and uncertainty quantification for robust SAR-optical matching. The framework employs modality-specific encoders (EfficientDet for optical, Xception for SAR) with uncertainty modules capturing aleatoric and epistemic uncertainties, and is enhanced by self-supervised contrastive and rotation-prediction tasks. Evaluated on the SEN12MS dataset, our method achieves a Maximum Mean Accuracy (MMA) of 0.145 at a 1-pixel threshold and 1298.3 average matched pairs per image (aNM), a 20.8% MMA improvement over the state-of-the-art transformer-based method. Our uncertainty quantification yields an Expected Calibration Error (ECE) of 0.09, indicating well-calibrated confidence estimates. Ablation studies confirm the efficacy of each component, and self-supervised pre-training improves computational efficiency, yielding 40% faster convergence during supervised fine-tuning. The method performs well across diverse scenarios, including seasonal changes and varied land cover types, advancing SAR-optical matching for applications such as change detection and disaster response.
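To make the dual-encoder idea concrete, the sketch below shows a minimal cross-modal contrastive setup of the kind the abstract describes: two modality-specific encoders producing embeddings trained with a symmetric InfoNCE loss, plus a per-sample log-variance head (aleatoric uncertainty) and dropout that can be kept active at test time for MC-dropout epistemic estimates. This is not the authors' implementation: the tiny CNNs stand in for the EfficientDet (optical) and Xception (SAR) backbones, and all layer sizes, the temperature, and the variance-penalty weight are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of dual-encoder SAR-optical
# contrastive matching with a simple uncertainty head.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallEncoder(nn.Module):
    """Placeholder modality-specific encoder -> L2-normalised embedding.

    In the paper this role is played by EfficientDet (optical) or
    Xception (SAR); here a tiny CNN keeps the example self-contained.
    """

    def __init__(self, in_ch: int, dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.proj = nn.Linear(64, dim)    # embedding used for matching
        self.log_var = nn.Linear(64, 1)   # aleatoric-uncertainty head
        self.dropout = nn.Dropout(0.2)    # keep active at test time for MC-dropout

    def forward(self, x):
        h = self.dropout(self.backbone(x))
        z = F.normalize(self.proj(h), dim=-1)
        return z, self.log_var(h)


def info_nce(z_opt, z_sar, temperature: float = 0.07):
    """Symmetric cross-modal InfoNCE: co-registered SAR/optical pairs attract."""
    logits = z_opt @ z_sar.t() / temperature
    targets = torch.arange(z_opt.size(0), device=z_opt.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


# Toy forward/backward pass on random tensors (3-band optical, 2-band SAR).
opt_enc, sar_enc = SmallEncoder(3), SmallEncoder(2)
optical = torch.randn(8, 3, 64, 64)
sar = torch.randn(8, 2, 64, 64)

z_o, logvar_o = opt_enc(optical)
z_s, logvar_s = sar_enc(sar)

# The variance penalty below is only a placeholder regulariser keeping the
# predicted variances bounded; it is not the paper's uncertainty-weighted loss.
loss = info_nce(z_o, z_s) + 0.01 * (logvar_o.exp().mean() + logvar_s.exp().mean())
loss.backward()
print(float(loss))
```

In a full pipeline, the same encoders would also feed an auxiliary rotation-prediction head during self-supervised pre-training, and repeated stochastic forward passes (MC dropout) would supply the epistemic component of the uncertainty estimate.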
DOI: https://doi.org/10.31449/inf.v49i25.8421

This work is licensed under a Creative Commons Attribution 3.0 License.