Deep Learning-Based Object Detection System for Vocal Cords in Laryngoscopy Images

Ying Chang Wu*, Sheng Fu Liang, Cheng Ming Hsu, Ming Chi Cheng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Accurate localization of the vocal folds is critical for diagnostic and therapeutic applications in medical imaging. This paper presents a deep learning-based object detection system that addresses several tasks related to vocal fold detection. A two-step transfer learning approach is proposed for model training. First, YOLOv8 was pre-trained on a large set of high-speed video recordings from a public dataset. The model was then fine-tuned on a 1:1 combination of data from the public dataset and a low-frame-rate (30 frames/sec) dataset collected at the hospital CGMH in Taiwan, yielding the final optimized glottis detection model. The results show that the recall and precision of ROI multiple bounding box predictions are 97.6% and 98.3%, respectively. In comparison, the recall and precision of single bounding box predictions are 97.5% and 100%, respectively. These results demonstrate that deep learning can localize the vocal folds with high recall and precision. Our study provides valuable information for selecting appropriate object detection modalities in medical imaging applications for diagnosis and treatment planning in laryngology and otorhinolaryngology.
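The recall and precision figures quoted in the abstract are standard detection metrics. The following is a minimal sketch of how such values are commonly computed from predicted and ground-truth bounding boxes. It is not the authors' evaluation code: the greedy matching strategy, the IoU threshold of 0.5, and the names `iou` and `precision_recall` are assumptions made for illustration.

```python
from typing import List, Tuple

# Box format assumed here: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], iou_thr: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box.

    A prediction counts as a true positive when its best IoU reaches
    iou_thr (0.5 is an assumed threshold, not taken from the paper).
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
    print(precision_recall(preds, truths))  # (0.5, 0.5)
```

In this toy case one of two predictions matches a ground-truth box and one of two ground-truth boxes is found, so both metrics are 0.5. A precision of 100% with recall below 100%, as reported for single bounding box predictions, would correspond to no false positives and a small number of missed detections.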

Original language: English
Title of host publication: 2025 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331502669
State: Published - 2025
Externally published: Yes
Event: 2025 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2025 - Tainan, Taiwan
Duration: 20 Aug 2025 to 22 Aug 2025

Publication series

Name: 2025 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2025

Conference

Conference: 2025 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2025
Country/Territory: Taiwan
City: Tainan
Period: 20/08/25 to 22/08/25

Bibliographical note

Publisher Copyright:
© 2025 IEEE.

Keywords

  • Deep learning
  • Glottis detection
  • ROI detection
  • Videostroboscopy
  • Vocal cords
