Journal: PLoS One

Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network


Abstract

The objective investigation of the dynamic properties of vocal fold vibrations requires the recording and quantitative analysis of laryngeal high-speed video (HSV). As a first step, quantifying vocal fold vibration patterns requires segmenting the glottal area in each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation. In this work we propose, for the first time, a procedure that fully automatically segments not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best-performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. The Dice Coefficient (DC) and the precision of four anatomical landmark positions were used as performance measures. Over all test data, a mean DC of 0.85 was obtained for the glottis, and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the gold standard. The proposed method requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches. Thus, it also allows the analysis of long HSV sequences and holds the promise of facilitating the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be made freely available to all scientific groups to allow quantitative benchmarking of segmentation approaches in the future.
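
The two performance measures named in the abstract can be illustrated with a minimal sketch. The Dice Coefficient below follows its standard definition, 2|A ∩ B| / (|A| + |B|), for a predicted and a ground-truth binary mask. The landmark measure is written as a mean Euclidean pixel distance, which is an assumption: the abstract does not state how landmark precision was computed. The function names, the toy masks and the toy landmark coordinates are hypothetical and are not taken from the paper.

```python
from math import hypot


def dice_coefficient(pred, truth):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two binary masks (nested lists of 0/1)."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    pred_flat = [bool(v) for row in pred for v in row]
    truth_flat = [bool(v) for row in truth for v in row]
    # Count pixels that are foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def mean_landmark_error(pred_points, truth_points):
    """Mean Euclidean distance in pixels between matched (x, y) landmarks (assumed measure)."""
    if len(pred_points) != len(truth_points) or not pred_points:
        raise ValueError("need the same non-zero number of landmarks")
    distances = [
        hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred_points, truth_points)
    ]
    return sum(distances) / len(distances)


# Toy 3x4 masks (hypothetical data): 5 foreground pixels each, 4 of them shared.
example_pred = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
example_truth = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
print(dice_coefficient(example_pred, example_truth))  # 2 * 4 / (5 + 5) = 0.8

# Toy landmarks (hypothetical data): distances of 5 and 4 pixels.
print(mean_landmark_error([(10, 10), (20, 24)], [(13, 14), (20, 20)]))  # 4.5
```

A per-sequence score, such as the mean values reported in the abstract, would then be the average of the per-frame results over all frames of a test video.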
