Assessment of general applicability of robot audition system by recognizing three simultaneous speeches

Abstract

Robot audition is a critical technology for building intelligent robots that operate in everyday environments. We have developed a robot audition system that uses a new interface between sound source separation and automatic speech recognition (ASR). A mixture of speech signals captured by a pair of microphones installed at the ear positions of a humanoid is separated into individual speech streams by an active direction-pass filter (ADPF). The ADPF extracts a sound source originating from a specific direction in real time by using interaural phase and intensity differences. Each separated speech stream is recognized by a speech recognizer based on missing feature theory (MFT). By using a missing feature mask, the MFT-based ASR ignores features that were distorted or lost during speech separation. A missing feature mask for each separated stream is generated during separation and sent to the ASR together with the separated speech. This integration improves ASR performance. However, the generality of this robot audition system has not been assessed so far. In this paper, we assess its general applicability by implementing it on three humanoids: ASIMO of Honda, SIG2, and Replie of Kyoto University. With three simultaneous speeches as the benchmark, the robot audition system improved ASR performance by more than 50% on every humanoid, confirming its general applicability.
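
The two ideas in the abstract, selecting time-frequency bins by direction and scoring only reliable features, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a free-field, far-field model with a hypothetical microphone spacing of 0.18 m, uses only the interaural phase difference (the ADPF described above also uses intensity differences), and uses marginalization over a diagonal Gaussian as one common way to apply a missing feature mask. All function names and parameters are illustrative.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def expected_ipd(freqs_hz, azimuth_rad, mic_distance_m):
    """Interaural phase difference (radians) under an assumed free-field model.

    Convention: positive azimuth points toward the left microphone, so the
    right channel lags the left by mic_distance * sin(azimuth) / c seconds.
    """
    delay = mic_distance_m * np.sin(azimuth_rad) / SPEED_OF_SOUND
    return 2.0 * np.pi * freqs_hz * delay


def direction_pass_mask(left_spec, right_spec, freqs_hz, azimuth_rad,
                        mic_distance_m=0.18, tolerance_rad=0.3):
    """Return a boolean (freq, frame) mask of bins matching the target direction.

    left_spec and right_spec are complex spectrograms of shape (freq, frame).
    """
    # Observed phase of left relative to right, in (-pi, pi].
    observed = np.angle(left_spec * np.conj(right_spec))
    target = expected_ipd(freqs_hz, azimuth_rad, mic_distance_m)[:, None]
    # Wrap the phase error so that aliased phases compare correctly.
    error = np.angle(np.exp(1j * (observed - target)))
    return np.abs(error) <= tolerance_rad


def masked_log_likelihood(features, mask, mean, var):
    """Diagonal-Gaussian log-likelihood summed over reliable features only.

    Features whose mask entry is False are skipped, which is how this sketch
    makes the recognizer ignore distorted or missing features.
    """
    per_dim = -0.5 * (np.log(2.0 * np.pi * var) + (features - mean) ** 2 / var)
    return float(np.sum(per_dim[mask]))
```

In this sketch, multiplying a spectrogram by the output of `direction_pass_mask` keeps only the bins attributed to the target direction, and the same mask (mapped to the recognizer's feature domain, a step not shown here) would be passed to `masked_log_likelihood` alongside the separated speech.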

Bibliographic record

Source
Intelligent Robots and Systems, 2004. (IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on | 2004 | pp. 2111-2116 | 6 pages
Authors
Yamamoto, S.; Nakadai, K.; Tsujino, H.; Okuno, H.G.
Original format
PDF
Chinese Library Classification
Radio electronics, telecommunications technology
Keywords
intelligent robots; speech recognition; humanoid robots; active filters; robot audition system; intelligent robot; sound source separation; automatic speech recognition; humanoid robot; active direction-pass filter; missing feature theory

Similar literature

1. A real-time super-resolution robot audition system that improves the robustness of simultaneous speech recognition [J]. Keisuke Nakamura, Kazuhiro Nakadai, Hiroshi G. Okuno. Advanced Robotics: The International Journal of the Robotics Society of Japan. 2013, no. 11-12
2. Design and Implementation of Robot Audition System 'HARK' - Open Source Software for Listening to Three Simultaneous Speakers [J]. Kazuhiro Nakadai, Toru Takahashi, Hiroshi G. Okuno. Advanced Robotics: The International Journal of the Robotics Society of Japan. 2010, no. 5-6
3. Assessment of the Applicability of Independent Brain-Computer Interfaces in Robotic Systems [J]. R. A. Faizrakhmanov, R. R. Bakunov. Russian Electrical Engineering. 2014, no. 11
4. Assessment of general applicability of robot audition system by recognizing three simultaneous speeches [C]. Yamamoto S., Nakadai K., Tsujino H., Okuno H.G. IEEE/RSJ International Conference on Intelligent Robots and Systems. 2004
5. TV Simultaneous Interpreting of Emotive Overtones in Arabic Presidential Political Speeches into English during the Arab Spring [D]. Al-Jabri, Hanan. 2017
6. A Human Interactive Hybrid FES-Robotic System Applicable to Improvement of Foot Drop after Stroke: Case Report of a Patient with Chronic Stroke [O]. Hamid Reza Kobravi, Yadollah Farzaneh, Milad Faryar Majd. 2020
7. Real-Time Robot Audition System That Recognizes Simultaneous Speech in The Real World [O]. Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino. 2008
