Automatic Language Identification Using Mixed-order HMMs and Untranscribed Corpora

Abstract

State-of-the-art language identification (LID) systems are based on phone recognisers and n-gram language models, which require transcribed speech databases for training. An alternative solution to the LID problem applies mixed-order hidden Markov models (HMMs) directly to untranscribed speech. The competitive performance of these mixed-order HMMs on the NIST 1996 evaluation set is very promising, given their ease of implementation and the many improvements still possible. This result validates a novel mixed-order HMM training procedure and extends previous results obtained with high-order HMMs to take advantage of larger datasets.
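
The general idea of scoring untranscribed input against one HMM per language can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not describe the mixed-order training procedure, so the example assumes plain first-order HMMs with discrete observation symbols and hand-picked toy parameters (a real system would model acoustic feature vectors). The names `log_likelihood`, `identify`, `lang_a`, and `lang_b` are hypothetical.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space for a first-order discrete HMM.

    obs   : non-empty list of observation symbol indices
    start : start[s] is the initial probability of state s
    trans : trans[r][s] is the probability of moving from state r to s
    emit  : emit[s][o] is the probability that state s emits symbol o
    """
    n = len(start)
    alpha = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            _log_sum_exp([alpha[r] + _log(trans[r][s]) for r in range(n)])
            + _log(emit[s][o])
            for s in range(n)
        ]
    return _log_sum_exp(alpha)


def identify(obs, models):
    """Return the name of the language model that scores obs highest."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Toy parameters (assumed for illustration only): two states, two symbols.
# lang_a tends to emit symbol 0, lang_b tends to emit symbol 1.
models = {
    "lang_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.6, 0.4]]),
    "lang_b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.4, 0.6]]),
}

print(identify([0, 0, 0, 1, 0], models))
```

No transcription is needed at any point in this sketch: each model is scored directly on the observation sequence, and the decision is an argmax over per-language log-likelihoods.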

Bibliographic details

Source
6th International Conference on Spoken Language Processing (ICSLP 2000), Oct. 16-20, 2000, Beijing International Convention Center, Beijing, China | 2000 | pp. 254-257 | 4 pages
Authors
Ludwig Schwardt; Johan du Preez
Original format: PDF
CLC classification: World cultures and cultural affairs

