A comparison of part of speech taggers in the task of changing to a new domain

Abstract

Part-of-speech tagging in real-world applications is performed on text in domains which are different from the publicly available large training data sets. The two most successful part-of-speech taggers are trained on the Wall Street Journal corpus, a corpus of millions of words. We compare their performance on a test set from a different domain (astronomy), drawn from documents that are available on the World Wide Web. The Maximum Entropy Part of Speech Tagger (MXPOST) and the Transformation-Based Learning Tagger are well-known and widely used in language research and development systems. The two taggers were tested in several modes: (1) after training on the Wall Street Journal corpus only, (2) after training on only a small body of text from our astronomy domain, (3) with and without an auxiliary lexicon derived from many astronomy-related Web documents, and (4) after incremental training, that is, having been trained on the Wall Street Journal, with additional training from the specific domain. One conclusion from the experiment is that different taggers exhibit different biases when trained on the same data.
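
The test modes listed in the abstract can be illustrated with a minimal sketch. It assumes a most-frequent-tag baseline as a stand-in for the taggers (it is neither MXPOST nor a transformation-based learner), and the tiny corpora, the lexicon, and all function names are hypothetical. The accuracies it prints are properties of the toy data only, not results from the paper.

```python
from collections import Counter, defaultdict

def train(tagged_sents, counts=None):
    """Accumulate word -> tag counts. Passing existing counts continues
    training, which is how incremental training is imitated here."""
    if counts is None:
        counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, t in sent:
            counts[word.lower()][t] += 1
    return counts

def tag_words(words, counts, lexicon=None, default="NN"):
    """Tag each word with its most frequent training tag; fall back to the
    auxiliary lexicon, then to a default tag, for unseen words."""
    tags = []
    for word in words:
        key = word.lower()
        if key in counts:
            tags.append(counts[key].most_common(1)[0][0])
        elif lexicon and key in lexicon:
            tags.append(lexicon[key])
        else:
            tags.append(default)
    return tags

def accuracy(test_sents, counts, lexicon=None):
    """Fraction of test tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sent in test_sents:
        words = [w for w, _ in sent]
        gold = [t for _, t in sent]
        pred = tag_words(words, counts, lexicon)
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / total

# Hypothetical toy data: "dwarf" is a verb in the news text, a noun in astronomy.
wsj = [
    [("profits", "NNS"), ("dwarf", "VBP"), ("the", "DT"), ("losses", "NNS")],
    [("the", "DT"), ("market", "NN"), ("rose", "VBD")],
]
astro = [
    [("the", "DT"), ("dwarf", "NN"), ("cooled", "VBD")],
    [("the", "DT"), ("dwarf", "NN"), ("dimmed", "VBD")],
]
test = [[("the", "DT"), ("dwarf", "NN"), ("outshines", "VBZ"), ("quasars", "NNS")]]
lexicon = {"quasars": "NNS", "outshines": "VBZ"}  # assumed domain lexicon

results = {
    "wsj_only": accuracy(test, train(wsj)),
    "astro_only": accuracy(test, train(astro)),
    "wsj_plus_lexicon": accuracy(test, train(wsj), lexicon),
    "incremental": accuracy(test, train(astro, counts=train(wsj))),
    "incremental_plus_lexicon": accuracy(test, train(astro, counts=train(wsj)), lexicon),
}
for mode, acc in results.items():
    print(f"{mode}: {acc:.2f}")
```

Because this baseline only counts tags, incremental training is equivalent to training on both corpora at once; for the statistical and rule-based taggers the abstract compares, the order and weighting of the additional domain data may matter.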

Bibliographic details

Source
Information Intelligence and Systems, 1999. Proceedings. 1999 International Conference on, pp. 574-578 (5 pages)
Authors
Boggess L.; Hamaker J.S.; Duncan R.; Klimek L.; Yufeng Wu; Yu Zeng
Original format: PDF

Similar literature

1. A Speech Translation System Applied to a Real-World Task/Domain and Its Evaluation Using Real-World Speech Data [J]. Atsushi Nakamura, Masaki Naito, Hajime Tsukada. IEICE Transactions on Information and Systems. 2001, Issue 1
2. Irrelevant speech effect in open plan offices: Comparison of two models explaining the decrease in performance by speech intelligibility and attempt to reduce interindividual differences of the mental workload by task customisation [J]. Applied Acoustics. 2020, Issue Apra
3. Comparison of Speech Features on the Speech Recognition Task | Science Publications [J]. Iosif Mporas, Mihalis Siafarikas, Nikos Fakotakis. Journal of computer sciences. 2007, Issue 8
4. A comparison of part of speech taggers in the task of changing to a new domain [C]. Boggess, L., Hamaker, . 1999
5. #MPLP: A comparison of domain novice and expert user-generated tags in a minimally processed digital archive [D]. Benoit, Edward, III. 2014
6. Frequency of speech disruptions in Parkinson's Disease and developmental stuttering: A comparison among speech tasks [O]. Fabiola Staróbole Juste, Fernanda Chiarion Sassi, Julia Biancalana Costa. 2012
7. Mining for unambiguous instances to adapt part-of-speech taggers to new domains [O]. Dirk Hovy, Barbara Plank, Héctor Martínez Alonso. 2015