Conference: Signal Processing: Algorithms, Architectures, Arrangements, and Applications

Marking the Allophones Boundaries Based on the DTW Algorithm


Abstract

The paper presents an approach to marking allophone boundaries in a speech signal based on the Dynamic Time Warping (DTW) algorithm. Locating and marking allophone boundaries in continuous speech is difficult because adjacent phonemes influence each other. On the one hand, this neighbourhood creates the variants of phonemes known as allophones; on the other hand, it makes the border between allophones very difficult to determine in some cases. At present, the task is carried out manually in cooperation with specialists in phonetics. The presented approach makes it possible to build a system that automates this process. The aim of the author's current work is a method that facilitates the processing of training material for the development of multimodal speech recognition systems. To this end, the difficult problem of marking allophone boundaries is solved in this report using a Polish dictionary, in the context of creating allophone databases for speech synthesis. This setting was chosen because it simplifies organizing critical listening and subjective evaluation of the obtained allophones by a large group of native Polish speakers (73 people). Strengthening the method will allow it to be used to extract allophones for a system being developed for the automatic transcription of English speech and its notation according to the IPA standard. In the DTW algorithm, the analysed continuous speech is aligned with a synthesized speech signal. The two signals are compared not in the time domain, as in classical DTW, but in the frequency domain, so that it is the phonetic content of the two signals that is compared.
The paper describes the process of marking allophone boundaries for Polish; however, after appropriate modifications, the approach can be used to determine allophone boundaries in other languages, especially English.
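
The idea of aligning analysed speech with a synthesized signal in the frequency domain can be illustrated with a minimal sketch. It is not the author's implementation, and the abstract does not specify the features, distance measure, or step pattern. The sketch assumes log-magnitude short-time spectra as the frequency-domain representation, Euclidean distance between frames, the textbook three-way DTW step pattern, and allophone boundaries that are already known (as frame indices) in the synthesized signal. The function names `spectral_frames`, `dtw_path`, and `map_boundaries` are hypothetical.

```python
import numpy as np


def spectral_frames(signal, frame_len=256, hop=128):
    """Split a 1-D signal into Hann-windowed frames and return log-magnitude spectra.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))


def dtw_path(ref, test):
    """Textbook DTW over two feature sequences (frames x features).

    Returns the warping path as a list of (ref_frame, test_frame) pairs.
    """
    n, m = len(ref), len(test)
    # Pairwise Euclidean distance between every reference and test frame.
    dist = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # Backtrack from the end of both sequences to the start.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path


def map_boundaries(path, ref_boundaries):
    """Map each boundary frame of the reference to the first test frame aligned with it."""
    return [min(j for i, j in path if i == b) for b in ref_boundaries]
```

In this sketch, `spectral_frames` would be applied to both the synthesized and the recorded signal, `dtw_path` would align the two feature sequences, and `map_boundaries` would transfer the known boundary frames from the synthesized signal onto the recording. Multiplying a mapped frame index by the hop size gives an approximate sample position.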
