UNet with self-adaptive Mamba-like attention and causal-resonance learning for medical image segmentation
2026 (English) In: Scientific Reports, E-ISSN 2045-2322, Vol. 16, no 1, article id 135
Article in journal (Refereed) Published
Abstract [en]
Medical image segmentation plays an important role in various clinical applications, but existing deep learning models face trade-offs between efficiency and accuracy. Convolutional Neural Networks (CNNs) capture local details well but miss the global context, whereas transformers handle the global context but at a high computational cost. Recently, State Space Sequence Models (SSMs) have shown potential for capturing long-range dependencies with linear complexity, but their direct use in medical image segmentation remains limited due to incompatibility with image structures and autoregressive assumptions. To overcome these challenges, we propose SAMA-UNet, a novel U-shaped architecture that introduces two key innovations. First, the Self-Adaptive Mamba-like Aggregated Attention (SAMA) block adaptively integrates local and global features through dynamic attention weighting, enabling an efficient representation of complex anatomical patterns. Second, the Causal-Resonance Multi-Scale Module (CR-MSM) improves encoder–decoder interactions by adjusting feature resolution and causal dependencies across scales, enhancing the semantic alignment between low- and high-level features. Extensive experiments on MRI, CT, and endoscopy datasets demonstrate that SAMA-UNet consistently outperforms CNN, Transformer, and Mamba-based methods. It achieves 85.38% DSC and 87.82% NSD on BTCV, 92.16% and 96.54% on ACDC, 67.14% and 68.70% on EndoVis17, and 84.06% and 88.47% on ATLAS23, establishing new benchmarks across modalities. These results confirm the effectiveness of SAMA-UNet in combining efficiency with accuracy, making it a promising solution for real-world clinical segmentation tasks. The source code is available at https://github.com/sqbqamar/SAMA-UNet.
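The DSC figures quoted in the abstract refer to the Dice similarity coefficient, which for a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard formula for flat binary masks only. The function name `dice_coefficient`, the toy masks, and the handling of two empty masks are assumptions made for illustration; the record does not describe the paper's evaluation protocol (for example, per-class or per-case averaging), so this should not be read as the authors' implementation.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient (DSC) between two flat binary masks.

    Hypothetical helper for illustration; not taken from the SAMA-UNet code.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Count positions that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of foreground counts in each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 2 overlapping foreground pixels, 3 foreground pixels per mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
print(f"DSC = {score:.4f}")
```

A multi-class score such as those reported per dataset would typically be obtained by computing this per label and averaging, but the averaging scheme is not stated in this record.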
Place, publisher, year, edition, pages
Springer Nature, 2026. Vol. 16, no 1, article id 135
Keywords [en]
Medical image segmentation, Multi-scale feature learning, Transformer, Vision state space models
National Category
Computer graphics and computer vision; Medical Imaging
Identifiers
URN: urn:nbn:se:kth:diva-375743
DOI: 10.1038/s41598-025-28885-8
ISI: 001653310900002
PubMedID: 41339647
Scopus ID: 2-s2.0-105026391874
OAI: oai:DiVA.org:kth-375743
DiVA, id: diva2:2031101
Note
QC 20260122
2026-01-22 Bibliographically approved