Multimedia Tools and Applications

Learning laparoscopic video shot classification for gynecological surgery

Abstract

Videos of endoscopic surgery are used for the education of medical experts, for analysis in medical research, and as documentation in everyday clinical practice. Hand-crafted image descriptors lack the capability to semantically classify surgical actions and video shots of anatomical structures. In this work, we investigate how well single-frame convolutional neural networks (CNNs) perform at semantic shot classification in gynecologic surgery. Together with medical experts, we manually annotate hours of raw endoscopic gynecologic surgery videos showing endometriosis treatment and myoma resection from over 100 patients. The cleaned ground-truth dataset comprises 9 h of annotated video material (from 111 different recordings). We use the well-known CNN architectures AlexNet and GoogLeNet and train them from scratch for both surgical actions and anatomy. Furthermore, we extract high-level features from an AlexNet initialized with weights from a pre-trained model in the Caffe model zoo and feed them to an SVM classifier. Our evaluation shows that using off-the-shelf CNN features we reach an average recall of .697 for the classification of anatomical structures and .515 for surgical actions. Using GoogLeNet, we achieve a mean recall of .782 and .617 for anatomical structures and surgical actions, respectively. With AlexNet, the achieved recall is .615 for anatomical structures and .469 for surgical actions. The main conclusion of our work is that advances in general image classification methods transfer to the domain of endoscopic surgery videos in gynecology. This is relevant because this domain differs from natural images: it is characterized by smoke, reflections, and a limited range of colors.
