Fast vocabulary acquisition in an NMF-based self-learning vocal user interface

Bart Ons; Jort F. Gemmeke; Hugo Van hamme

Abstract

In command-and-control applications, a vocal user interface (VUI) is useful for hands-free control of various devices, especially for people with a physical disability. The spoken utterances are usually restricted to a predefined list of phrases or to a restricted grammar, and the acoustic models work well for normal speech. While some state-of-the-art methods allow for user adaptation of the predefined acoustic models and lexicons, we pursue a fully adaptive VUI by learning both vocabulary and acoustics directly from interaction examples. A learning curve usually has a steep rise in the beginning and an asymptotic ceiling at the end. To limit tutoring time and to guarantee good performance in the long run, the word learning rate of the VUI should be fast and the learning curve should level off at a high accuracy. To address these performance indicators, we propose a multi-level VUI architecture and we investigate the effectiveness of alternative processing schemes. In the low-level layer, we compare MIDA features (Mutual Information Discrimination Analysis) against conventional MFCC features. In the mid-level layer, we enhance the acoustic representation by means of phone posteriorgrams and clustering procedures. In the high-level layer, we use the NMF (Non-negative Matrix Factorization) procedure, which has been demonstrated to be an effective approach for word learning. We evaluate and discuss the performance and the feasibility of our approach in a realistic experimental setting of the VUI-user learning context.
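
The abstract names NMF as the high-level word-learning layer but gives no algorithmic detail. As a rough illustration only, the sketch below factorizes a non-negative matrix V (one column per utterance) into W @ H using the standard Lee-Seung multiplicative updates for the Frobenius-norm objective. The function name `nmf`, the choice of objective, the rank, the iteration count, and the random toy data are all assumptions made for this example; none of them is taken from the paper.

```python
import numpy as np


def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (features x utterances) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    This objective is an assumption of the sketch, not the paper's setup.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H
        # stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data (hypothetical): 3 latent "word" patterns mixed into 40 utterances.
rng = np.random.default_rng(1)
true_W = rng.random((20, 3))
true_H = rng.random((3, 40))
V = true_W @ true_H

W, H = nmf(V, rank=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this toy setting the columns of W play the role of recurring patterns and H holds their per-utterance activations; how such patterns are tied to command words in the actual system is not described in the abstract.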

Bibliographic details

Source: Computer Speech and Language | 2014, issue 4 | pp. 997-1017 | 21 pages
Authors: Bart Ons; Jort F. Gemmeke; Hugo Van hamme
Author affiliations: Department ESAT-PSI, KU Leuven, Leuven, Belgium (all three authors)
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: MIDA; Phone posteriorgram; NMF; Fast learning; Vocabulary acquisition

