KTH Publications (kth.se)
ScaleFusionNet: transformer-guided multi-scale feature fusion for skin lesion segmentation
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. Faculty of Computing and Information Technology (FCIT), Sohar University, 311, Sohar, Oman.
BGI Research, 310030, Hangzhou, China.
Department of Computer Science, College of Computer and Information Technology, Taif University, P. O. Box 11099, 21944, Taif, Kingdom of Saudi Arabia.
School of Computing and Communications, Lancaster University, LA1 4YW, Lancaster, UK.
2025 (English). In: Scientific Reports, E-ISSN 2045-2322, Vol. 15, no. 1, article id 34393. Article in journal (Refereed). Published.
Abstract [en]

Melanoma is a malignant tumor that originates from skin cell lesions. Accurate and efficient segmentation of skin lesions is essential for quantitative analysis but remains a challenge owing to blurred lesion boundaries, gradual color changes, and irregular shapes. To address this, we propose ScaleFusionNet, a hybrid model that integrates a Cross-Attention Transformer Module (CATM) and an Adaptive Fusion Block (AFB) to enhance feature extraction and fusion by capturing both local and global features. The CATM employs Swin Transformer blocks and Cross Attention Fusion (CAF) to adaptively refine feature fusion and reduce the semantic gap between the encoder and decoder, improving segmentation accuracy. The AFB combines Swin Transformer-based attention with deformable-convolution-based adaptive feature extraction, allowing the model to gather local and global contextual information through parallel pathways. This refines lesion boundaries and preserves fine-grained details. ScaleFusionNet achieves Dice scores of 92.94%, 91.80%, and 95.37% on the ISIC-2016, ISIC-2018, and HAM10000 datasets, respectively, demonstrating its effectiveness in skin lesion analysis. In addition, independent validation experiments were conducted on the PH² dataset using the pretrained model weights, showing that ScaleFusionNet delivers significant performance improvements over other state-of-the-art methods. Our code implementation is publicly available at https://github.com/sqbqamar/ScaleFusionNet.
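The fusion idea sketched in the abstract, two parallel pathways (a global, attention-based branch and a local, convolution-based branch) merged into one feature map by a learned gate, can be illustrated with a deliberately simplified NumPy sketch. This is not the paper's implementation: the actual AFB uses Swin attention windows and deformable convolutions, whereas here plain softmax self-attention and a fixed 3x3 mean filter stand in for them, and the sigmoid gating rule is purely illustrative.

```python
import numpy as np

def global_branch(x):
    """Stand-in for the Swin-attention pathway: plain softmax
    self-attention over the flattened spatial grid (simplification)."""
    h, w, c = x.shape
    tokens = x.reshape(h * w, c)
    scores = tokens @ tokens.T / np.sqrt(c)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return (attn @ tokens).reshape(h, w, c)

def local_branch(x):
    """Stand-in for the deformable-convolution pathway:
    a fixed 3x3 mean filter with edge padding."""
    h, w, c = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += pad[dy:dy + h, dx:dx + w]
    return out / 9.0

def adaptive_fusion(x):
    """Merge the two pathways per spatial position: a sigmoid of their
    channel-mean difference gates how much of each branch survives
    (illustrative rule, not the paper's learned fusion)."""
    g, l = global_branch(x), local_branch(x)
    gate = 1.0 / (1.0 + np.exp(-(g - l).mean(axis=-1, keepdims=True)))
    return gate * g + (1.0 - gate) * l

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 16))   # toy 8x8 feature map, 16 channels
fused = adaptive_fusion(feat)
print(fused.shape)  # (8, 8, 16)
```

The key property the sketch preserves is that both branches see the same input and produce feature maps of identical shape, so fusion reduces to a convex per-position combination rather than concatenation.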

Place, publisher, year, edition, pages
Springer Nature, 2025. Vol. 15, no. 1, article id 34393.
Keywords [en]
Feature enhancement, Image segmentation, Information fusion, Skin lesion, Transformer
National Category
Computer Graphics and Computer Vision; Medical Imaging
Identifiers
URN: urn:nbn:se:kth:diva-372357
DOI: 10.1038/s41598-025-17300-x
ISI: 001587011800006
PubMedID: 41038982
Scopus ID: 2-s2.0-105017626854
OAI: oai:DiVA.org:kth-372357
DiVA, id: diva2:2011880
Note

QC 20251106

Available from: 2025-11-06. Created: 2025-11-06. Last updated: 2025-11-06. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text | PubMed | Scopus

Authority records

Qamar, Saqib
