Isolated digit Filipino speech recognition through spectrogram image classification: Towards application in a disaster preparedness participatory toolkit


Abstract

In this paper, we present our work on isolated digit speech recognition by classifying spectrogram images, for use in a disaster preparedness participatory toolkit. To achieve higher inclusivity, we included a voice component to reach a wider range of respondents, especially those with low literacy and vision-impaired individuals. Our methodology treats speech recognition as spectrogram image classification, a deviation from the usual approaches, which normally work on acoustic coefficients and features. As our initial test bed, we focused on Filipino, a member of the Malayo-Polynesian language family and the national language of the Philippines. Our data covers 4,297 utterances of the Filipino digits 0 to 9 collected from 262 speakers; we divided the data into 3 parts: 70% for training, 20% for testing, and 10% for validation. We applied the short-time Fourier transform to our training data and used convolutional neural networks in MATLAB to classify the spectrogram images. The lowest accuracy rate during our tests was 93.02%. Analysis of the results shows that background noise is the cause of the misclassified utterances, which is discussed further in this paper. While the results are promising, the work can be extended to include closely related languages.
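The two data-preparation steps named in the abstract (a 70/20/10 split and a short-time Fourier transform that turns audio into a spectrogram image) can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' MATLAB implementation: the function names `stft_spectrogram` and `split_counts`, the Hann window, the frame length of 256, the hop of 128, the 8 kHz sample rate, the synthetic 440 Hz tone, and the rule that the rounding remainder goes to the validation set are all assumptions introduced here.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a short-time Fourier transform.

    frame_len, hop and the Hann window are assumed values, not taken
    from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One real FFT per frame gives frame_len // 2 + 1 frequency bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; rows are frequency bins, columns are frames.
    return 20.0 * np.log10(magnitude + 1e-10).T

def split_counts(n_total, fractions=(0.7, 0.2, 0.1)):
    """Integer train/test/validation sizes.

    The rounding remainder is given to the last part (an assumption).
    """
    train = int(n_total * fractions[0])
    test = int(n_total * fractions[1])
    return train, test, n_total - train - test

# Synthetic stand-in for one utterance: a 1-second 440 Hz tone at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
spec = stft_spectrogram(tone)

print(spec.shape)          # frequency bins x time frames
print(split_counts(4297))  # sizes for the 4,297 utterances
```

The resulting 2-D array is what would be rendered as an image and passed to an image classifier such as a convolutional neural network; how the authors sized, colored, or normalized their spectrogram images is not stated in the abstract.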

Bibliographic record

Source
International Conference on Asian Language Processing | 2017 | pp. 31-35 | 5 pages
Conference location: Singapore (SG)
Authors
Julie Ann A. Salido; Nathaniel Oco; Rachel Roxas; Emmanuel Malaay; Michael Simora; Ronald John Cabatic;
Author affiliations

Aklan State University; National University

Original format: PDF
Keywords
Spectrogram; Speech recognition; Training; Speech; Mathematical model; Sensitivity; Machine learning;


