Improving Human-computer Interaction in Low-resource Settings with Text-to-phonetic Data Augmentation

IEEE International Conference on Acoustics, Speech and Signal Processing



Abstract

Off-the-shelf speech recognition systems can yield useful results and accelerate application development, but general-purpose systems applied to specialized domains can introduce acoustically small but semantically catastrophic errors. Furthermore, sufficient audio data may not be available to develop custom acoustic models for niche tasks. To address these problems, we propose a concept to improve performance in text classification tasks that use speech transcripts as input, without any in-domain audio data. Our method augments available typewritten text training data with inferred phonetic information so that the classifier will learn semantically important acoustic regularities, making it more robust to transcription errors from the general-purpose ASR. We successfully pilot our method in a speech-based virtual patient used for medical training, recovering up to 62% of errors incurred by feeding a small test set of speech transcripts to a classification model trained on typescript.
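The augmentation idea described in the abstract is straightforward to prototype: run each typed training utterance through a grapheme-to-phoneme (G2P) converter and train the classifier on both the original text and its inferred phonetic rendering. The sketch below is a minimal illustration only; the paper does not specify its G2P component or feature combination, so the use of the off-the-shelf g2p_en package, the `phonetic_augment` helper, and the virtual-patient intent label are all assumptions made for this example.

```python
# Minimal sketch of text-to-phonetic data augmentation, assuming the
# off-the-shelf g2p_en grapheme-to-phoneme converter (the paper does not
# name its G2P model; identifiers below are illustrative).
from g2p_en import G2p

g2p = G2p()

def phonetic_augment(utterance):
    """Return the typed utterance plus an inferred phonetic rendering."""
    # g2p_en emits ARPAbet phoneme symbols, with " " tokens between words.
    phones = [p for p in g2p(utterance) if p != " "]
    return [utterance, " ".join(phones)]

# Hypothetical typed training pair for a virtual-patient intent classifier.
train_pairs = [("do you have any chest pain", "ask_chest_pain")]

# Each typed example yields two training instances: the original text and
# its phoneme sequence, so the classifier can pick up acoustic regularities
# without any in-domain audio data.
augmented = [(variant, label)
             for text, label in train_pairs
             for variant in phonetic_augment(text)]

for variant, label in augmented:
    print(label, "->", variant)
```

Training on both surface forms is what makes the downstream classifier more forgiving of ASR output: a transcript that is acoustically close to a training utterance maps to a similar phoneme sequence even when its spelling diverges.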
