International Joint Conference on Neural Networks

DNN-based Acoustic-to-Articulatory Inversion using Ultrasound Tongue Imaging



Abstract

Speech sounds are produced by the coordinated movement of the speaking organs. Several methods are available to model the relation between articulatory movements and the resulting speech signal. The reverse problem is often called acoustic-to-articulatory inversion (AAI). In this paper we have implemented several different Deep Neural Networks (DNNs) to estimate articulatory information from the acoustic signal. Several previous works address this task, but most of them use ElectroMagnetic Articulography (EMA) to track the articulatory movement. Compared to EMA, Ultrasound Tongue Imaging (UTI) is a technique with a better cost-benefit ratio when equipment cost, portability, safety and the visualized structures are taken into account. Therefore, our goal is to train a DNN that produces UT images when speech is used as input. We also test two approaches to represent the articulatory information: 1) the EigenTongue space and 2) the raw ultrasound image. As objective quality measures for the reconstructed UT images, we use MSE, the Structural Similarity Index (SSIM) and Complex-Wavelet SSIM (CW-SSIM). Our experimental results show that CW-SSIM is the most useful error measure in the UTI context. We tested three different system configurations: a) a simple DNN with 2 hidden layers and the 64x64 pixels of a UTI file as target; b) the same simple DNN, but with the ultrasound images projected to the EigenTongue space as target; and c) a more complex DNN with 5 hidden layers and UTI files projected to the EigenTongue space. In a subjective experiment, the subjects found that the neural networks with two hidden layers were more suitable for this inversion task.
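The EigenTongue space mentioned in the abstract is essentially a principal-component decomposition of the ultrasound frames, analogous to eigenfaces. The following is a minimal sketch of such a projection using scikit-learn, not the authors' implementation: the 64x64 frame size comes from the abstract, while the number of retained components (128) and the function names are illustrative assumptions.

```python
# Sketch: EigenTongue-style projection of ultrasound tongue frames via PCA.
# Assumes `frames` is a NumPy array of shape (n_frames, 64, 64), values in [0, 1].
import numpy as np
from sklearn.decomposition import PCA

def fit_eigentongue(frames, n_components=128):
    """Fit a PCA basis ("EigenTongues") on flattened 64x64 frames."""
    X = frames.reshape(len(frames), -1)          # (n_frames, 4096)
    return PCA(n_components=n_components).fit(X)

def to_coefficients(pca, frames):
    """Project frames into the EigenTongue coefficient space."""
    return pca.transform(frames.reshape(len(frames), -1))

def to_images(pca, coefficients, shape=(64, 64)):
    """Reconstruct pixel-space images from EigenTongue coefficients."""
    return pca.inverse_transform(coefficients).reshape(-1, *shape)
```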
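Configuration a) from the abstract, a feed-forward DNN with two hidden layers predicting the 4096 raw pixels of a frame (or EigenTongue coefficients for configurations b and c) from an acoustic feature vector, could look roughly like the sketch below. This is a hedged Keras illustration, not the paper's exact setup: the acoustic feature dimensionality, hidden-layer width, activation, optimizer and training settings are all assumptions.

```python
# Sketch: feed-forward acoustic-to-articulatory inversion network.
# Input: one acoustic feature vector per ultrasound frame (e.g. MFCCs);
# output: either the 64x64 = 4096 raw pixels or EigenTongue coefficients.
import tensorflow as tf

def build_inversion_dnn(input_dim=39, output_dim=4096, hidden_units=1000):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(output_dim, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Raw-pixel target (configuration a):
#   model = build_inversion_dnn(output_dim=4096)
# EigenTongue target (configuration b; c would use 5 hidden layers instead):
#   model = build_inversion_dnn(output_dim=128)
#   model.fit(acoustic_features, targets, epochs=50, batch_size=128)
```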

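The objective evaluation compares each reconstructed frame with the recorded one using MSE, SSIM and CW-SSIM. A minimal sketch of the first two with scikit-image is shown below; CW-SSIM requires a complex steerable-wavelet decomposition and is omitted here. Image shape and value range are assumptions.

```python
# Sketch: per-frame objective comparison of a reconstructed ultrasound image
# against the ground-truth frame. Assumes 64x64 float images scaled to [0, 1].
import numpy as np
from skimage.metrics import structural_similarity

def evaluate_frame(reference, predicted):
    mse = float(np.mean((reference - predicted) ** 2))
    ssim = structural_similarity(reference, predicted, data_range=1.0)
    return mse, ssim
```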