Multimodal Speech Recognition: Lip-reading Automatons

Abstract

Just imagine that you are standing in the concourse of Rotterdam Central Station, and you can speak into a machine to ask it the time of the next train to Amsterdam, and an electronic voice will instantly tell you the answer, including the platform number. The TU Delft Mediamatics department has been collaborating for some years with OVR (Openbaar Vervoer Reisinformatie), a company that provides public transport information, to create systems for automatic speech recognition. So far, results have been nothing to write home about, certainly not when the information was requested from noisy places like station platforms. If the voice of the passenger on the platform is drowned out by ambient noise, with its mixture of announcements, including those about delayed trains, the computer gets confused. It is an established fact that other people are much easier to understand if you can see as well as hear them talk. It is not just the deaf who use lip-reading, for people with normal hearing will also resort to watching the speaker's mouth as the level of ambient noise increases. This has led to the idea of supporting automated speech recognition systems with software for automatic lip-reading. The system could also come in useful for hands-free phone calls in cars. A small camera could be pointed at the mouth of the speaker and a processor could analyse the video images in real time. Polish IT engineer Jacek Wojdel has developed a working prototype. Automatic speech recognition has been the focus of worldwide interest for over two decades. International companies have large research departments working on it. At Philips in Aachen, Germany, alone, some 150 researchers are active in the field. IBM has developed the Via Voice Speech System, and the Belgian company Lernout & Hauspie, which recently went bankrupt, was also a major player.