Structural analysis of Hindi phonetics and a method for extraction of phonetically rich sentences from a very large Hindi text corpus

Abstract

Automatic speech recognition (ASR) and text-to-speech (TTS) are two prominent areas of research in human-computer interaction (HCI). A set of phonetically rich sentences is important for developing these two interactive modules of HCI. Essentially, the set of phonetically rich sentences has to cover all possible phone units, distributed uniformly. Selecting such a set from a large corpus while maintaining similarity in phonetic characteristics is still a challenging problem. The major objective of this paper is to devise a criterion for selecting a set of sentences that encompasses all phonetic aspects of a corpus while being as small as possible. First, this paper presents a statistical analysis of Hindi phonetics by observing its structural characteristics. Next, a two-stage algorithm is proposed to extract phonetically rich sentences with a high variety of triphones from the EMILLE Hindi corpus. The algorithm uses a distance-measuring criterion to select sentences that improve the triphone distribution. Moreover, a special preprocessing method is proposed that scores each triphone in terms of inverse probability in order to speed up the algorithm. The results show that the approach efficiently builds a uniformly distributed, phonetically rich corpus with an optimum number of sentences.
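The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: a greedy, set-cover style selection that weights each triphone by its inverse probability, so that rare triphones count for more. It is not the authors' two-stage algorithm or their distance criterion. The function names, the gain formula, and the input format (each sentence already converted to a list of phone symbols) are all assumptions made for illustration.

```python
from collections import Counter


def triphones(phones):
    """Return the overlapping 3-phone windows of a phone sequence."""
    return [tuple(phones[i:i + 3]) for i in range(len(phones) - 2)]


def inverse_probability_scores(corpus):
    """Score each triphone as 1 / P(triphone), with P estimated from corpus counts."""
    counts = Counter(t for phones in corpus for t in triphones(phones))
    total = sum(counts.values())
    return {t: total / c for t, c in counts.items()}


def select_sentences(corpus, max_sentences):
    """Greedily pick sentences that add the most score from not-yet-covered triphones.

    Assumption: the gain of a sentence is the sum of inverse-probability
    scores of its new triphones. Returns (chosen indices, covered triphones).
    """
    scores = inverse_probability_scores(corpus)
    covered, chosen = set(), []
    remaining = set(range(len(corpus)))

    def gain(i):
        new = set(triphones(corpus[i])) - covered
        return sum(scores[t] for t in new)

    while remaining and len(chosen) < max_sentences:
        # sorted() makes tie-breaking deterministic (lowest index wins)
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # no remaining sentence adds a new triphone
        chosen.append(best)
        covered |= set(triphones(corpus[best]))
        remaining.remove(best)
    return chosen, covered


# Toy input: each sentence is a list of (hypothetical) phone symbols.
toy_corpus = [
    ["k", "a", "m", "a", "l"],
    ["k", "a", "m"],
    ["n", "a", "m", "a", "k"],
    ["k", "a", "m", "a", "l"],
]
chosen, covered = select_sentences(toy_corpus, max_sentences=10)
```

On the toy input, the sentence containing the rarest triphones is picked first, and selection stops once no remaining sentence contributes a new triphone. A real system would first need a grapheme-to-phoneme step for Hindi text, which is outside this sketch.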

Bibliographic record

Source: 2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Technique | 2016 | pp. 188-193 | 6 pages
Conference location: Bali (ID)
Authors: Shrikant Malviya; Rohit Mishra; Uma Shanker Tiwary
Affiliation (listed identically for all three authors): Department of Information Technology, Indian Institute of Information Technology, Allahabad, India 211012
Original format: PDF
Language: English
Keywords: Decision support systems
Date indexed: 2022-08-26 14:30:25

Similar literature

1. Incorporating finer acoustic phonetic features in lexicon for Hindi language speech recognition [J]. Journal of Information and Optimization Sciences, 2019, No. 8.
2. Development of Hindi speech stimuli to elicit auditory brainstem responses: Necessity and acoustic-phonetic considerations [J]. Mohammad Shamim Ansari, R. Rangasayee. Hearing, Balance and Communication, 2016, No. 3a4.
3. Acoustic-phonetic feature based dialect identification in Hindi speech [J]. Shweta Sinha, Aruna Jain, S. S. Agrawal. International Journal on Smart Sensing and Intelligent Systems, 2015, No. 1.
4. Structural analysis of Hindi phonetics and a method for extraction of phonetically rich sentences from a very large Hindi text corpus [C]. Shrikant Malviya, Rohit Mishra, Uma Shanker Tiwary. Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Technique, 2016.
5. Working memory in sentence comprehension: Processing Hindi center embeddings [D]. Vasishth, Shravan. 2002.
6. Perceptual Doping: An Audiovisual Facilitation Effect on Auditory Speech Processing From Phonetic Feature Extraction to Sentence Identification in Noise [O]. Shahram Moradi, Björn Lidestam, Elaine Hoi Ning Ng.
7. Structural Analysis of Hindi Phonetics and A Method for Extraction of Phonetically Rich Sentences from a Very Large Hindi Text Corpus [O]. Malviya, Shrikant; Mishra, Rohit; Tiwary, Uma Shanker. 2017.
8. Phonetic and Structural Encoding of Chinese Characters in Chinese Texts [R]. Boitet, C.; Tcheou, F. X. 1990.
