IEEE International Conference on Robotics and Automation

Attention-Guided Lightweight Network for Real-Time Segmentation of Robotic Surgical Instruments



Abstract

Real-time segmentation of surgical instruments plays a crucial role in robot-assisted surgery. However, deploying deep learning models for real-time instrument segmentation remains challenging because of their high computational cost and slow inference speed. In this paper, we propose an attention-guided lightweight network (LWANet) that can segment surgical instruments in real time. LWANet adopts an encoder-decoder architecture in which the encoder is the lightweight network MobileNetV2 and the decoder consists of depthwise separable convolution, attention fusion block, and transposed convolution. Depthwise separable convolution is the basic unit of the decoder, which reduces the model size and computational cost. The attention fusion block captures global context and encodes semantic dependencies between channels to emphasize target regions, which helps locate the surgical instrument. Transposed convolution upsamples the feature maps to recover refined edges. LWANet segments surgical instruments in real time at low computational cost. With 960×544 inputs, its inference speed reaches 39 fps with only 3.39 GFLOPs. It also has a small model size, with only 2.06 M parameters. The proposed network is evaluated on two datasets. It achieves state-of-the-art performance of 94.10% mean IoU on Cata7 and sets a new record on EndoVis 2017 with a 4.10% increase in mean IoU.
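To make the claim about depthwise separable convolution concrete, the following is a minimal sketch of the standard parameter-count arithmetic. The layer sizes (64 input and output channels, a 3x3 kernel, a 136x240 feature map) and all function names are hypothetical, chosen only for illustration; they are not taken from the paper and do not describe LWANet's actual layers.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def macs(params, height, width):
    """Multiply-accumulates at stride 1 with 'same' padding:
    every weight is applied once per output pixel."""
    return params * height * width


# Hypothetical layer sizes for illustration only (not from the paper).
C_IN, C_OUT, K = 64, 64, 3
H, W = 136, 240

std = standard_conv_params(C_IN, C_OUT, K)   # 36864 weights
sep = separable_conv_params(C_IN, C_OUT, K)  # 4672 weights
ratio = sep / std                            # equals 1/C_OUT + 1/K**2

print(f"standard:  {std} weights, {macs(std, H, W)} MACs")
print(f"separable: {sep} weights, {macs(sep, H, W)} MACs")
print(f"separable / standard = {ratio:.4f}")
```

Under these assumed sizes the separable layer needs roughly one eighth of the weights and multiply-accumulates of the standard layer, which is the kind of saving the abstract attributes to using depthwise separable convolution as the decoder's basic unit.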
