Poster Session 1.I - Theoretical and Translational Medicine
Fazekas, Szuzina
Semmelweis University Medical Imaging Centre
Szuzina Fazekas (1), Bettina Katalin Budai (2), Zsolt Vizi (3)
1: Semmelweis University Medical Imaging Centre
2: Heidelberg University, Semmelweis University
3: University of Szeged; Semmelweis University
Introduction: Accurate tissue delineation is critical in radiotherapy planning, as segmentation errors can affect treatment quality and patient outcomes. However, commonly used evaluation metrics such as Dice, Jaccard, and Hausdorff distance mainly quantify geometric agreement and often fail to reflect clinically relevant segmentation errors. Consequently, evaluation approaches that better represent clinical priorities in medical image segmentation are needed.
Aims: This study aimed to develop and demonstrate a clinically informed, automated evaluation pipeline for medical image segmentation built around a novel metric, the Medical Similarity Index (MSI), which is designed to capture clinically meaningful contour deviations better than traditional metrics.
Method: An automated Python-based evaluation pipeline was implemented to calculate both traditional segmentation metrics and MSI. MSI is based on the bidirectional local distance between paired points of the reference and test contours and incorporates configurable penalties for inner and outer contour deviations. The method supports multislice images, multiple masks per slice, and automated contour pairing using center-of-mass alignment. The pipeline was evaluated on pelvic MRI datasets, including a fibroid dataset and an open-access prostate MRI dataset. Segmentations were generated using a 2D nnUNet model and compared against expert annotations.
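The idea of a bidirectional distance with separate inner and outer penalties can be illustrated with a minimal pure-Python sketch. It is not the pipeline's implementation and not the published MSI definition: the function names (`nearest`, `inside`, `bidirectional_local_distance`, `similarity_score`), the tolerance parameters `inner_tol` and `outer_tol`, and the Gaussian-style scoring function are all assumptions chosen for illustration. Contours are assumed to be closed polygons given as lists of (x, y) tuples.

```python
import math

def nearest(point, contour):
    """Return (distance, index) of the contour point closest to `point`."""
    best_d, best_i = math.inf, -1
    for i, q in enumerate(contour):
        d = math.dist(point, q)
        if d < best_d:
            best_d, best_i = d, i
    return best_d, best_i

def inside(point, polygon):
    """Ray-casting point-in-polygon test (points on the boundary are not treated specially)."""
    x, y = point
    hit = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        # Count edges that a horizontal ray starting at `point` crosses.
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            hit = not hit
    return hit

def bidirectional_local_distance(reference, test):
    """For each reference point, return (distance, test point) of its worst match.

    Forward pass: distance from each reference point to its nearest test point.
    Backward pass: each test point is assigned to its nearest reference point,
    and replaces the forward match there if it lies farther away.
    """
    result = []
    for p in reference:
        d, j = nearest(p, test)
        result.append((d, test[j]))
    for t in test:
        d, i = nearest(t, reference)
        if d > result[i][0]:
            result[i] = (d, t)
    return result

def similarity_score(reference, test, inner_tol=2.0, outer_tol=1.0):
    """Hypothetical score in (0, 1]: mean of exp(-(d / tol)^2) over reference points.

    A smaller tolerance penalizes a deviation of the same size more strongly;
    the defaults (an assumption) penalize outer deviations more than inner ones.
    """
    scores = []
    for d, t in bidirectional_local_distance(reference, test):
        tol = inner_tol if inside(t, reference) else outer_tol
        scores.append(math.exp(-(d / tol) ** 2))
    return sum(scores) / len(scores)

reference = [(0, 0), (4, 0), (4, 4), (0, 4)]
shrunken = [(1, 1), (3, 1), (3, 3), (1, 3)]      # test contour inside the reference
expanded = [(-1, -1), (5, -1), (5, 5), (-1, 5)]  # test contour outside the reference

print(similarity_score(reference, reference))
print(similarity_score(reference, shrunken))
print(similarity_score(reference, expanded))
```

With these illustrative tolerances, the shrunken and expanded contours deviate from the reference by the same distance, yet the expanded one receives the lower score. This is the kind of asymmetry that configurable inner and outer penalties make possible.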
Results: The pipeline successfully evaluated segmentation outputs and demonstrated the adaptability of MSI to different clinical scenarios. Conventional metrics often produced high similarity scores despite clinically unacceptable segmentation errors, whereas MSI provided lower and more realistic scores. For example, a prostate segmentation case showed high Dice (0.94) and Jaccard (0.89) values despite substantial outer contour errors, while MSI correctly indicated poor agreement with a score of approximately 0.40.
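The Dice and Jaccard values reported for the prostate case are consistent with each other: for a single pair of binary masks, Jaccard = Dice / (2 - Dice), so the two overlap metrics carry the same information and will tend to overlook the same errors. The short sketch below checks this relation. The function names and the representation of masks as sets of pixel coordinates are illustrative assumptions, not part of the described pipeline.

```python
def dice(a, b):
    """Dice coefficient of two masks given as sets of pixel coordinates."""
    return 2 * len(a & b) / (len(a) + len(b))

def jaccard(a, b):
    """Jaccard index (intersection over union) of two masks given as sets."""
    return len(a & b) / len(a | b)

def jaccard_from_dice(d):
    """Convert a Dice value to the equivalent Jaccard value."""
    return d / (2.0 - d)

# Toy masks: the test mask misses one of the four reference pixels.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
test = {(0, 0), (0, 1), (1, 0)}

print(dice(reference, test), jaccard(reference, test))
print(round(jaccard_from_dice(0.94), 2))  # 0.89, matching the reported Jaccard value
```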
Conclusion: The proposed framework provides a flexible and clinically informed approach for assessing medical image segmentation performance. By incorporating customizable penalties for clinically relevant deviations, MSI complements traditional metrics and improves interpretability for radiotherapy and medical imaging research.
Funding: RRF-2.3.1-21-2022-00006; Gedeon Richter Excellence PhD Scholarship.