
Impact of Word Classing on Shrinkage-Based Language Models

Abstract

This paper investigates the impact of word classing on a recently proposed shrinkage-based language model, Model M [5]. Model M, a class-based n-gram model, has been shown to significantly outperform word-based n-gram models on a variety of domains. In past work, word classes for Model M were induced automatically from unlabeled text using the algorithm of [2]. We take a closer look at the classing and attempt to find out whether improved classing would also translate to improved performance. In particular, we explore the use of manually-assigned classes, part-of-speech (POS) tags, and dialog state information, considering both hard classing and soft classing. In experiments with a conversational dialog system (human-machine dialog) and a speech-to-speech translation system (human-human dialog), we find that better classing can improve Model M performance by up to 3% absolute in word-error rate.
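To make the class-based n-gram idea concrete, here is a minimal toy sketch of the standard factorization p(w_i | history) ≈ p(c_i | c_{i-1}) · p(w_i | c_i), where each word w belongs to a class c. The corpus, the hand-assigned classes, and all function names below are illustrative assumptions only; this is not the Model M parameterization or the class-induction algorithm from the paper.

```python
from collections import defaultdict

# Toy corpus of (word, class) pairs; classes are hand-assigned here
# purely for illustration (cf. the paper's manually-assigned classes
# and POS tags), not induced automatically.
corpus = [("i", "PRON"), ("want", "VERB"), ("coffee", "NOUN"),
          ("i", "PRON"), ("want", "VERB"), ("tea", "NOUN")]

bigram = defaultdict(int)     # counts of (previous class, class)
context = defaultdict(int)    # counts of class contexts
emission = defaultdict(int)   # counts of (class, word)
cls_total = defaultdict(int)  # counts of each class (emission denominator)

prev = "<s>"
for word, cls in corpus:
    bigram[(prev, cls)] += 1
    context[prev] += 1
    emission[(cls, word)] += 1
    cls_total[cls] += 1
    prev = cls

def prob(word, cls, prev_cls):
    """Class-based bigram probability: p(c | c_prev) * p(w | c)."""
    p_class = bigram[(prev_cls, cls)] / context[prev_cls]
    p_word = emission[(cls, word)] / cls_total[cls]
    return p_class * p_word

# p(NOUN | VERB) = 2/2 = 1.0; p(coffee | NOUN) = 1/2 = 0.5
print(prob("coffee", "NOUN", "VERB"))  # → 0.5
```

Because the class vocabulary is much smaller than the word vocabulary, the class-transition counts are far less sparse than word-bigram counts, which is what makes the quality of the classing matter: this is a hard-classing sketch (one class per word); soft classing would sum over multiple candidate classes per word.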
