Apparatus and method for building domain-specific language models

Abstract

Disclosed is a method and apparatus for building a domain-specific language model for use in language processing applications, e.g., speech recognition. A reference language model is generated based on a relatively small seed corpus containing linguistic units relevant to the domain. An external corpus containing a large number of linguistic units is accessed. Using the reference language model, linguistic units which have a sufficient degree of relevance to the domain are extracted from the external corpus. The reference language model is then updated based on the seed corpus and the extracted linguistic units. The process may be repeated iteratively until the language model is of satisfactory quality. The language building technique may be further enhanced by combining it with mixture modeling or class-based modeling.
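The iterative procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not state the model type or the relevance measure, so the add-one-smoothed unigram model, the per-word perplexity threshold, and all names (`train_unigram`, `perplexity`, `build_domain_model`, `threshold`, `max_iters`) are assumptions made here for the example. The toy corpora are invented for demonstration only.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count words over tokenized sentences; returns (counts, total)."""
    counts = Counter(word for sent in sentences for word in sent)
    return counts, sum(counts.values())


def perplexity(sentence, model, vocab_size):
    """Per-word perplexity under an add-one-smoothed unigram model."""
    counts, total = model
    log_prob = sum(
        math.log((counts[word] + 1) / (total + vocab_size)) for word in sentence
    )
    return math.exp(-log_prob / len(sentence))


def build_domain_model(seed, external, threshold, max_iters=5):
    """Grow the training data with external sentences judged relevant.

    Assumption: a sentence is "sufficiently relevant" when its perplexity
    under the current reference model is at or below `threshold`.
    """
    vocab_size = len({w for sent in seed + external for w in sent})
    model = train_unigram(seed)  # reference model from the seed corpus
    selected = []
    for _ in range(max_iters):
        picked = [
            s for s in external
            if s and perplexity(s, model, vocab_size) <= threshold
        ]
        if picked == selected:  # selection stopped changing
            break
        selected = picked
        # Update the reference model from the seed plus extracted sentences.
        model = train_unigram(seed + selected)
    return model, selected


# Hypothetical toy data: a travel-domain seed and a mixed external corpus.
seed = [
    ["book", "a", "flight", "to", "boston"],
    ["show", "flights", "to", "denver"],
]
external = [
    ["book", "a", "flight", "to", "denver"],
    ["the", "stock", "market", "fell", "today"],
    ["show", "a", "flight", "to", "boston"],
]

model, selected = build_domain_model(seed, external, threshold=15.0)
for sentence in selected:
    print(" ".join(sentence))
```

In this sketch the loop stops when the set of extracted sentences no longer changes, which stands in for the abstract's "satisfactory quality" criterion. A real system would more plausibly use an n-gram model and measure quality on held-out domain data.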

Bibliographic data

Publication number: US6188976B1

Publication date: 2001-02-13

Original format: PDF
Applicant/Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Application number: US19980178026
Inventors: GANESH N. RAMASWAMY; HARRY W. PRINTZ; PONANI S. GOPALAKRISHNAN

Filing date: 1998-10-23
Classification: G06F172/00; G06F172/70; G10L150/00
Country: US
Date added to database: 2022-08-22 01:05:12
