Abstracts of Papers in Acoustical Science and Technology

Abstract

We propose non-parallel, many-to-many voice conversion (VC) using variational autoencoders (VAEs). The method constructs VC models that convert the characteristics of arbitrary speakers into those of other arbitrary speakers without requiring parallel speech corpora for training. Although VAEs conditioned on one-hot speaker codes can achieve non-parallel VC, the phonetic content of the converted speech tends to vanish, resulting in degraded speech quality. Another issue is that they cannot handle unseen speakers who are not included in the training corpora. To overcome these issues, we incorporate deep-neural-network-based automatic speech recognition (ASR) and automatic speaker verification (ASV) into VAE-based VC. Since the phonetic content is given as phonetic posteriorgrams predicted by the ASR models, the proposed VC can overcome the quality degradation. Our VC uses d-vectors extracted from the ASV models as continuous speaker representations that can handle unseen speakers. Experimental results demonstrate that our VC outperforms conventional VAE-based VC in terms of mel-cepstral distortion and converted speech quality. We also investigate the effects of hyperparameters in our VC and reveal that 1) a large d-vector dimensionality that gives better ASV performance does not necessarily improve converted speech quality, and 2) a large number of pre-stored speakers improves the quality.
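The abstract reports results in terms of mel-cepstral distortion (MCD) but does not say how it was computed. As a rough illustration only, the sketch below uses the commonly cited definition, MCD = (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2) in dB, averaged over frames. It assumes the reference and converted sequences are already time-aligned (for example by dynamic time warping, which is not shown) and that the 0th (energy) coefficient is excluded, which is a common convention rather than something the abstract confirms. The function name and the `skip_c0` parameter are hypothetical.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames, skip_c0=True):
    """Mean mel-cepstral distortion in dB over time-aligned frames.

    ref_frames, conv_frames: sequences of frames, each frame a sequence of
    mel-cepstral coefficients. Alignment is assumed to be done beforehand.
    skip_c0: drop the 0th coefficient (an assumed, common convention).
    """
    if len(ref_frames) != len(conv_frames) or len(ref_frames) == 0:
        raise ValueError("expected two non-empty, time-aligned sequences of equal length")

    start = 1 if skip_c0 else 0
    # Constant factor of the usual MCD definition: (10 / ln 10) * sqrt(2).
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)

    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        if len(ref) != len(conv):
            raise ValueError("frames must have the same number of coefficients")
        squared_error = sum((r - c) ** 2 for r, c in zip(ref[start:], conv[start:]))
        total += scale * math.sqrt(squared_error)

    # Average the per-frame distortion over all frames.
    return total / len(ref_frames)


if __name__ == "__main__":
    # Toy, made-up coefficients purely to show the call; not data from the paper.
    reference = [[1.0, 0.20, -0.10], [1.1, 0.25, -0.05]]
    converted = [[0.9, 0.18, -0.12], [1.0, 0.30, -0.02]]
    print(f"MCD: {mel_cepstral_distortion(reference, converted):.3f} dB")
```

Lower values indicate that the converted mel-cepstra are closer to the reference; identical sequences give 0 dB.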

Bibliographic details

Source: 日本音響学会誌/The Journal of the Acoustical Society of Japan, 2021, No. 1, pp. 3-4 (2 pages)
Original format: PDF
Language: English
Database entry time: 2022-08-18 22:58:41
