Critique Open Research & Review


Hybrid Attention-Convolution Framework with Shape-Sensitive Optimization for Improved Three-Dimensional Partitioning in Medical and Cellular Imaging

4 Faculty of Interdisciplinary Studies, Kyoto International University of Science, Kyoto, Japan
4 Center for Applied Multidisciplinary Innovation, Osaka Institute of Integrated Technology, Osaka, Japan

Abstract

Accurate three-dimensional partitioning of medical and cellular imaging data remains a fundamental challenge in biomedical image analysis due to the complex morphology, multi-scale structures, and high variability present in volumetric datasets. Conventional convolutional neural networks have demonstrated strong performance in segmentation tasks; however, they often struggle to capture the long-range dependencies and global contextual relationships required for precise boundary delineation in three-dimensional environments. Transformer-based architectures address global context modeling but frequently incur high computational cost and preserve spatial detail poorly when applied to volumetric data. To overcome these limitations, this study proposes a hybrid attention-convolution framework combined with a shape-sensitive optimization strategy designed to enhance structural consistency and boundary accuracy in three-dimensional segmentation of medical and microscopic images.

The proposed framework integrates convolutional feature extraction with multi-head attention mechanisms to jointly capture local spatial patterns and global contextual dependencies. A multi-branch hybrid encoder is developed to fuse convolutional and transformer-based representations, enabling robust feature learning across multiple scales. In addition, a shape-sensitive loss formulation is introduced to improve segmentation accuracy by enforcing geometric consistency using curvature-aware and distance-based constraints. This optimization strategy allows the model to preserve fine anatomical details and maintain topological correctness, which are critical for applications such as organoid analysis, tumor boundary detection, and volumetric clinical imaging.
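To make the distance-based constraint concrete, the sketch below shows one common way such a shape-sensitive loss can be formed: a soft Dice (overlap) term combined with a penalty that weights predicted voxels by their Euclidean distance to the ground-truth region, so that false positives far from the true surface are penalized more heavily. This is a minimal, hypothetical NumPy illustration, not the paper's implementation; the curvature-aware term is omitted, and the function names and the fusion weight `alpha` are illustrative.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss: 1 - 2|P ∩ G| / (|P| + |G|).
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def distance_map(mask):
    # Brute-force Euclidean distance transform (fine for small volumes):
    # each background voxel gets its distance to the nearest foreground voxel.
    fg = np.argwhere(mask > 0)
    dist = np.zeros(mask.shape)
    for idx in np.ndindex(mask.shape):
        if mask[idx] == 0:
            dist[idx] = np.sqrt(((fg - np.array(idx)) ** 2).sum(axis=1)).min()
    return dist

def shape_sensitive_loss(pred, target, alpha=0.5):
    # Blend region overlap (Dice) with a distance-based boundary penalty:
    # predicted mass away from the ground-truth surface costs more.
    d = distance_map(target)
    boundary = np.mean(pred * d)
    return (1 - alpha) * dice_loss(pred, target) + alpha * boundary

# Toy 3D example: a perfect prediction incurs zero loss; a distant
# false positive is penalized by both terms.
target = np.zeros((4, 4, 4))
target[1:3, 1:3, 1:3] = 1.0
wrong = np.zeros((4, 4, 4))
wrong[0, 0, 0] = 1.0
perfect_loss = shape_sensitive_loss(target, target)
wrong_loss = shape_sensitive_loss(wrong, target)
```

In practice the distance map would be precomputed from the annotations (e.g., with a fast distance transform) rather than recomputed per step, and `alpha` would be tuned per dataset.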

The effectiveness of the proposed approach is evaluated through extensive experiments on three-dimensional medical and cellular datasets. Comparative analysis with state-of-the-art architectures, including U-Net variants, transformer-based segmentation models, and multi-aperture fusion networks, demonstrates consistent improvements in segmentation accuracy, boundary preservation, and structural stability. The results indicate that combining attention-driven global modeling with convolutional spatial learning and shape-sensitive optimization provides a balanced and computationally efficient solution for complex volumetric segmentation tasks.

This work contributes a unified segmentation framework that advances current research in medical image analysis by improving three-dimensional partitioning performance while maintaining scalability and robustness across heterogeneous imaging modalities.

References

R. Azad et al., “Medical image segmentation review: The success of U-Net,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 12, pp. 10076–10095, Dec. 2024.
N.-T. Bui et al., “SAM3D: Segment anything model in volumetric medical images,” in Proc. IEEE Int. Symp. Biomed. Imag., 2024, pp. 1–4.
J. Chen et al., “TransUNet: Transformers make strong encoders for medical image segmentation,” 2021, arXiv:2102.04306.
J. Chen et al., “TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers,” Med. Image Anal., vol. 97, 2024, Art. no. 103280.
Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, “3D U-Net: Learning dense volumetric segmentation from sparse annotation,” in Proc. Med. Image Comput. Comput.-Assist. Interv.: 19th Int. Conf., 2016, pp. 424–432.
J. R. Clough, N. Byrne, I. Oksuz, V. A. Zimmer, J. A. Schnabel, and A. P. King, “A topological loss function for deep-learning based image segmentation using persistent homology,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 12, pp. 8766–8778, Dec. 2022.
A. Dosovitskiy et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” 2020, arXiv:2010.11929.
M.-P. Dubuisson and A. K. Jain, “A modified Hausdorff distance for object matching,” in Proc. 12th Int. Conf. Pattern Recognit., 1994, vol. 1, pp. 566–568.
X. Huang, Z. Deng, D. Li, X. Yuan, and Y. Fu, “MISSFormer: An effective transformer for 2D medical image segmentation,” IEEE Trans. Med. Imag., vol. 42, no. 5, pp. 1484–1494, May 2023.
J. Han et al., “Molecular predictors of 3D morphogenesis by breast cancer cell lines in 3D culture,” PLoS Comput. Biol., vol. 6, no. 2, 2010, Art. no. e1000684.
F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier-Hein, “nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation,” Nature Methods, vol. 18, no. 2, pp. 203–211, 2021.
D. Karimi and S. E. Salcudean, “Reducing the Hausdorff distance in medical image segmentation with convolutional neural networks,” IEEE Trans. Med. Imag., vol. 39, no. 2, pp. 499–513, Feb. 2020.
A. Kirillov et al., “Segment anything,” in Proc. IEEE/CVF Int. Conf. Comput. Vis., 2023, pp. 4015–4026.
H. H. Lee, S. Bao, Y. Huo, and B. A. Landman, “3D UX-Net: A large kernel volumetric ConvNet modernizing hierarchical transformer for medical image segmentation,” 2022, arXiv:2209.15076.
J. Ma, Y. He, F. Li, L. Han, C. You, and B. Wang, “Segment anything in medical images,” Nature Commun., vol. 15, no. 1, 2024, Art. no. 654.
J. Ma, F. Li, and B. Wang, “U-Mamba: Enhancing long-range dependency for biomedical image segmentation,” 2024, arXiv:2401.04722.
J. Ma et al., “How distance transform maps boost segmentation CNNs: An empirical study,” in Medical Imaging with Deep Learning. Cambridge, MA, USA: PMLR, 2020, pp. 479–492.
O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Proc. Med. Image Comput. Comput.-Assist. Interv.: 18th Int. Conf., 2015, pp. 234–241.
S. Shabani, S. Mohammed, and B. Parvin, “A novel 3D decoder with weighted and learnable triple attention for 3D microscopy image segmentation,” in Proc. Comput. Vis. Pattern Recognit. Conf., 2025, pp. 4699–4708.
S. Shabani, M. Sohaib, S. A. Mohamed, and B. Parvin, “Coupled swin transformers and multi-apertures network (CSTA-NET) improves medical image segmentation,” in Proc. IEEE 22nd Int. Symp. Biomed. Imag., 2025, pp. 1–5.
S. Shabani, M. Sohaib, S. A. Mohammed, and B. Parvin, “Multi-aperture fusion of transformer-convolutional network (MFTC-Net) for 3D medical image segmentation and visualization,” 2024, arXiv:2406.17080.
S. Shit et al., “clDice - A novel topology-preserving loss function for tubular structure segmentation,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2021, pp. 16560–16569.
M. Sohaib, S. Shabani, S. A. Mohammed, G. Winkelmaier, and B. Parvin, “Multi-aperture transformers for 3D (MAT3D) segmentation of clinical and microscopic images,” in Proc. Winter Conf. Appl. Comput. Vis., 2025, pp. 4352–4361.
M. Sohaib, S. Shabani, S. A. Mohammed, and B. Parvin, “3D-organoid-SwinNet: High-content profiling of 3D organoids,” IEEE J. Biomed. Health Inform., vol. 29, no. 2, pp. 792–798, Feb. 2025.
V. Srivastava, T. R. Huycke, K. T. Phong, and Z. J. Gartner, “Organoid models for mammary gland dynamics and breast cancer,” Curr. Opin. Cell Biol., vol. 66, pp. 51–58, 2020.
Y. Tang et al., “Self-supervised pre-training of swin transformers for 3D medical image analysis,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2022, pp. 20698–20708.
S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, “CBAM: Convolutional block attention module,” in Proc. Eur. Conf. Comput. Vis., 2018, pp. 3–19.
G. Winkelmaier and B. Parvin, “An enhanced loss function simplifies the deep learning model for characterizing the 3D organoid models,” Bioinformatics, vol. 37, no. 18, pp. 3084–3085, 2021.
E. Xie et al., “SegFormer: Simple and efficient design for semantic segmentation with transformers,” in Proc. Adv. Neural Inf. Process. Syst., vol. 34, 2021, pp. 12077–12090.
G. Xu, X. Zhang, X. He, and X. Wu, “LeViT-UNet: Make faster encoders with transformer for medical image segmentation,” in Proc. Chin. Conf. Pattern Recognit. Comput. Vis., 2023, pp. 42–53.
X. Xu, S. Xu, L. Jin, and E. Song, “Characteristic analysis of Otsu threshold and its applications,” Pattern Recognit. Lett., vol. 32, no. 7, pp. 956–961, 2011.
F. Xing and L. Yang, “Robust nucleus/cell detection and segmentation in digital pathology and microscopy images: A comprehensive review,” IEEE Rev. Biomed. Eng., vol. 9, pp. 234–263, 2016.
G. Xing et al., “Multi-scale pathological fluid segmentation in OCT with a novel curvature loss in convolutional neural network,” IEEE Trans. Med. Imag., vol. 41, no. 6, pp. 1547–1559, Jun. 2022.
H.-Y. Zhou et al., “nnFormer: Volumetric medical image segmentation via a 3D transformer,” IEEE Trans. Image Process., vol. 32, pp. 4036–4045, 2023.
