Artificial Auditory Recognition in Telephony

Abstract

Machines which automatically recognize patterns from a stream of acoustic events, for example a spoken command, would have great utility in both communications and data processing. This paper reviews two applications of an elementary recognizer to the problem of actuating certain logical functions, and indicates how more ambitious recognizers might be utilized. In this regard, the automatic measurement of a talker's voice pitch and voicing dynamics appears fundamental to speech analysis, and hence to many recognition schemes. Visual inspection of spectral data taken from different speakers supports this contention. Segmentation of speech into discrete units suitable for recognition, including the possibility of overlapping elements, is discussed. There is reason to expect that such segments will span several elementary speech sounds (phonemes). To illustrate this approach, a set of rules is presented for associating visual spectral displays (sound spectrograms) with the perception evoked by the corresponding utterances. These rules are specifically tailored for a limited vocabulary consisting of ten spoken numbers, and were validated by naive subjects who used them to identify the utterances of 33 people. In a further experiment, spectrograms of the same material from 14 talkers were simplified by reducing them to binary elements. It was found that master patterns for each number, compiled from the ensemble of talkers, could identify the utterances with over 99% success. These results emphasize a “diversity” approach to speech recognition which operates on relations between gross spectral features and does not depend exclusively on any one property.
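
The binary-pattern experiment mentioned in the abstract can be illustrated with a small sketch. The abstract says only that spectrograms were reduced to binary elements and that master patterns were compiled from the ensemble of talkers; it does not say how. Everything below is therefore an assumption made for illustration, not the paper's procedure: the fixed threshold, the per-cell majority vote, the cell-agreement score, the toy 2x3 grids, and all function names are hypothetical.

```python
from typing import Dict, List

Grid = List[List[int]]


def binarize(spectrogram: List[List[float]], threshold: float) -> Grid:
    """Reduce a time-frequency energy grid to 0/1 cells (assumed fixed threshold)."""
    return [[1 if value >= threshold else 0 for value in row] for row in spectrogram]


def build_master(patterns: List[Grid]) -> Grid:
    """Compile one master pattern from several talkers by per-cell majority vote
    (an assumed compilation rule; the abstract does not specify one)."""
    n = len(patterns)
    rows, cols = len(patterns[0]), len(patterns[0][0])
    return [
        [1 if 2 * sum(p[r][c] for p in patterns) > n else 0 for c in range(cols)]
        for r in range(rows)
    ]


def agreement(a: Grid, b: Grid) -> float:
    """Fraction of cells on which two equally sized binary grids agree."""
    total = sum(len(row) for row in a)
    same = sum(x == y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    return same / total


def classify(pattern: Grid, masters: Dict[str, Grid]) -> str:
    """Return the label whose master pattern agrees with the input on the most cells."""
    return max(masters, key=lambda label: agreement(pattern, masters[label]))


# Toy data: three hypothetical talkers per word, each a 2x3 binary grid.
talkers = {
    "one": [[[1, 1, 0], [0, 0, 1]], [[1, 1, 0], [0, 1, 1]], [[1, 0, 0], [0, 0, 1]]],
    "two": [[[0, 0, 1], [1, 1, 0]], [[0, 1, 1], [1, 1, 0]], [[0, 0, 1], [1, 0, 0]]],
}
masters = {label: build_master(grids) for label, grids in talkers.items()}
print(classify([[1, 1, 0], [0, 1, 1]], masters))
```

Scoring against every master and taking the best match, rather than testing a single feature, is in the spirit of the "diversity" approach the abstract describes, though the scoring rule here is only a stand-in.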

Bibliographic details

Source: IBM Journal of Research and Development, 1958, No. 4, pp. 294-309 (16 pages)
