Conference: WRI World Congress on Computer Science and Information Engineering

Language Model Based on Word Order Sensitive Matrix Representation in Latent Semantic Analysis for Speech Recognition


Abstract

This paper investigates matrix representations in the latent semantic analysis (LSA) framework for language modeling. In LSA, a word-document matrix is usually used to represent a corpus. However, this matrix ignores the order of words within a sentence. We propose several word co-occurrence matrices that preserve word order for use in LSA. To support such a matrix, we define a context dependent class (CDC) language model, which distinguishes classes according to their context in the sentences. Experiments on the Wall Street Journal (WSJ) corpus show that the proposed method achieves better performance than the original LSA with a word-document matrix.
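The contrast between the two representations can be illustrated with a small sketch. It is not the paper's method: the abstract does not specify the proposed matrices or the CDC model, so the "previous word" co-occurrence matrix below is an assumed, simplified form, and the function names and toy sentences are hypothetical. The SVD step that LSA would apply to either matrix is omitted.

```python
from collections import Counter

def word_document_matrix(docs, vocab):
    """Rows are words, columns are documents; entries are raw counts."""
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(docs) for _ in vocab]
    for j, doc in enumerate(docs):
        for word, count in Counter(doc).items():
            matrix[index[word]][j] = count
    return matrix

def ordered_cooccurrence_matrix(docs, vocab):
    """Entry [i][j] counts how often vocab[j] immediately precedes vocab[i].

    Assumption: adjacency to the previous word stands in for the
    word-order-sensitive matrices the abstract mentions.
    """
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for doc in docs:
        for prev, cur in zip(doc, doc[1:]):
            matrix[index[cur]][index[prev]] += 1
    return matrix

# Two toy sentences with the same words in a different order.
doc_a = "stocks rose after rates fell".split()
doc_b = "rates rose after stocks fell".split()
vocab = sorted(set(doc_a) | set(doc_b))

# The word-document matrix cannot tell the two sentences apart.
wd = word_document_matrix([doc_a, doc_b], vocab)
same_columns = all(row[0] == row[1] for row in wd)

# The ordered co-occurrence matrices differ.
co_a = ordered_cooccurrence_matrix([doc_a], vocab)
co_b = ordered_cooccurrence_matrix([doc_b], vocab)

print(same_columns)   # True
print(co_a == co_b)   # False
```

In this toy case both sentences produce identical columns in the word-document matrix, while the ordered co-occurrence counts separate them, which is the property the abstract attributes to the word-document representation's handling of word order.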
