A Novel Method of Language Modeling for Automatic Captioning in TC Video Teleconferencing

Zhang X.; Zhao Y.; Schopp L.

Abstract

We are developing an automatic captioning system for teleconsultation video teleconferencing (TC-VTC) in telemedicine, based on large vocabulary conversational speech recognition. In TC-VTC, doctors' speech is spontaneous in style and contains a large number of infrequently used medical terms. Because in-domain data are insufficient, we adopted mixture language modeling, with models trained from several datasets of medical and nonmedical domains. This paper proposes novel modeling and estimation methods for the mixture language model (LM). Component LMs are trained from individual datasets, with class n-gram LMs trained from in-domain datasets and word n-gram LMs trained from out-of-domain datasets, and they are interpolated into a mixture LM. For class LMs, semantic categories are used to define classes of medical terms, names, and digits. The interpolation weights of a mixture LM are estimated by a greedy algorithm of forward weight adjustment (FWA). The proposed mixing of in-domain class LMs and out-of-domain word LMs, the semantic definitions of word classes, and the FWA weight-estimation algorithm are effective on the TC-VTC task. Compared with mixtures of word LMs whose weights are estimated by the conventional expectation-maximization algorithm, the proposed methods led to a 21% reduction of perplexity on test sets of five doctors, which translated into improvements of captioning accuracy.
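The interpolation and the conventional baseline mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes a mixture LM of the standard linear form, P(w|h) = sum over k of lambda_k * P_k(w|h), and shows the expectation-maximization (EM) weight estimate that the abstract names as the baseline. It does not implement FWA, whose details the abstract does not give. The function names, the two-component setup, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math


def mixture_prob(weights, component_probs):
    """Linear interpolation: P(w|h) = sum_k lambda_k * P_k(w|h)."""
    return sum(lam * p for lam, p in zip(weights, component_probs))


def perplexity(weights, prob_table):
    """Perplexity of the mixture on held-out tokens.

    prob_table[t][k] is P_k(w_t | h_t): the probability that component
    LM k assigns to held-out token t given its history.
    """
    log_sum = sum(math.log(mixture_prob(weights, row)) for row in prob_table)
    return math.exp(-log_sum / len(prob_table))


def em_weights(prob_table, iterations=50):
    """Conventional EM re-estimation of interpolation weights (the baseline)."""
    k = len(prob_table[0])
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(iterations):
        totals = [0.0] * k
        for row in prob_table:
            denom = mixture_prob(weights, row)
            for j in range(k):
                # E-step: posterior that component j generated this token
                totals[j] += weights[j] * row[j] / denom
        # M-step: new weight is the average posterior over all tokens
        weights = [t / len(prob_table) for t in totals]
    return weights


# Hypothetical held-out probabilities: column 0 stands in for an in-domain
# class LM, column 1 for an out-of-domain word LM (illustrative values only).
prob_table = [
    [0.20, 0.01],
    [0.05, 0.10],
    [0.30, 0.02],
    [0.01, 0.15],
]

uniform = [0.5, 0.5]
tuned = em_weights(prob_table)
print("uniform perplexity:", perplexity(uniform, prob_table))
print("EM-tuned perplexity:", perplexity(tuned, prob_table))
```

Each EM iteration cannot decrease held-out likelihood, so the tuned weights give a perplexity no higher than the uniform starting point. A greedy scheme such as FWA would replace `em_weights` while reusing the same `perplexity` objective.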

Bibliographic details

Source: IEEE Transactions on Information Technology in Biomedicine, 2007, no. 3, pp. 332-337 (6 pages)
Authors: Zhang X.; Zhao Y.; Schopp L.
Original format: PDF
Language: English
CLC classification: Radio electronics; telecommunications technology
Keywords: expectation-maximisation algorithm; greedy algorithms; interpolation; linguistics; natural language processing; speech recognition; teleconferencing; telemedicine; automatic captioning system; captioning accuracy; expectation-maximization algorithm; forward weight
Date indexed: 2022-08-18 01:02:06
