Conference: International Conference on Natural Language Processing and Knowledge Engineering; October 26-29, 2003; Beijing, China

STYLE-SPECIFIC LANGUAGE MODEL ADAPTATION FOR KOREAN CONVERSATIONAL SPEECH RECOGNITION


Abstract

This paper presents a style-specific language model adaptation method for Korean conversational speech recognition. Compared with written text corpora, conversational speech differs in both content and style: it contains filled pauses, word omissions, and contractions, which in Korean spontaneous speech are related to function words and depend on the preceding or following words. Since it is often difficult to obtain sufficient data for training a language model in a conversational domain, language model adaptation with large out-of-domain data is useful. For style-specific language model adaptation, we first estimate an in-domain dependent n-gram model by relevance weighting of out-of-domain text data according to style and content similarity. Here, style is represented by n-gram based tf*idf similarity. Second, we train an in-domain language model that includes a disfluency model. Recognition results show that n-gram based tf*idf similarity weighting effectively reflects style differences and that disfluencies can serve as good predictors of neighboring words.
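
The abstract does not give the weighting formula, so the following is only a minimal sketch of how relevance weighting by n-gram based tf*idf similarity might look, not the authors' implementation. The function names, the smoothed idf variant, the use of cosine similarity, and the toy English token lists are all assumptions made for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def relevance_weights(in_domain, out_domain, n_max=2):
    """Score each out-of-domain document against the pooled in-domain data.

    Assumption: the in-domain data is treated as one pseudo-document and
    the idf is smoothed as log((1 + N) / (1 + df)) + 1.
    """
    in_counts = Counter()
    for doc in in_domain:
        in_counts.update(ngrams(doc, n_max))
    out_counts = [Counter(ngrams(doc, n_max)) for doc in out_domain]

    all_counts = [in_counts] + out_counts
    n_docs = len(all_counts)
    df = Counter(g for counts in all_counts for g in counts)

    def tfidf(counts):
        return {g: tf * (math.log((1 + n_docs) / (1 + df[g])) + 1.0)
                for g, tf in counts.items()}

    target = tfidf(in_counts)
    return [cosine(target, tfidf(counts)) for counts in out_counts]

def weighted_counts(out_domain, weights, n_max=2):
    """Accumulate fractional n-gram counts, one weight per document."""
    total = Counter()
    for doc, weight in zip(out_domain, weights):
        for g in ngrams(doc, n_max):
            total[g] += weight
    return total

# Hypothetical toy data: conversational in-domain text with filled pauses,
# one conversational and one written-style out-of-domain document.
in_domain = [["uh", "i", "wanna", "go"], ["um", "i", "wanna", "eat"]]
out_domain = [["uh", "i", "wanna", "sleep"],
              ["the", "committee", "approved", "the", "budget"]]

weights = relevance_weights(in_domain, out_domain)
counts = weighted_counts(out_domain, weights)
```

In this sketch the conversational out-of-domain document shares unigrams and bigrams with the in-domain data and receives a positive weight, while the written-style document shares none and receives zero. The resulting fractional counts could then be passed to any n-gram estimator that accepts non-integer counts.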
