
Minimizing Training Corpus for Parser Acquisition


Abstract

Many corpus-based natural language processing systems rely on large quantities of annotated text as their training examples. Building this kind of resource is an expensive and labor-intensive project. To minimize the effort spent on annotating examples that are not helpful to the training process, recent research efforts have begun to apply active learning techniques to selectively choose data to be annotated. In this work, we consider selecting training examples with the tree-entropy metric. Our goal is to assess how well this selection technique can be applied to training different types of parsers. We find that tree-entropy can significantly reduce the amount of training annotation for both a history-based parser and an EM-based parser. Moreover, the examples selected for the history-based parser are also good for training the EM-based parser, suggesting that the technique is parser-independent.
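
The abstract describes ranking unlabeled sentences by tree entropy so that only the most informative ones are sent to annotators. The sketch below is a minimal illustration of that idea, assuming tree entropy is the entropy of the parser's probability distribution over candidate parse trees for a sentence; the names tree_entropy, select_for_annotation, and pool are illustrative and not taken from the report.

import math
from typing import Dict, List

def tree_entropy(parse_probs: List[float]) -> float:
    # Entropy of the parser's distribution over candidate parses for one
    # sentence: H(s) = -sum over trees t of P(t|s) * log2 P(t|s).
    total = sum(parse_probs)
    entropy = 0.0
    for p in parse_probs:
        p = p / total            # normalise in case the parser emits raw scores
        if p > 0.0:
            entropy -= p * math.log2(p)
    return entropy

def select_for_annotation(pool: Dict[str, List[float]], k: int) -> List[str]:
    # Rank unlabeled sentences by tree entropy and return the k sentences the
    # current parser is least certain about; these are handed to the annotator.
    ranked = sorted(pool, key=lambda s: tree_entropy(pool[s]), reverse=True)
    return ranked[:k]

# Toy pool: each sentence maps to the parser's scores for its candidate parses.
pool = {
    "sentence A": [0.95, 0.03, 0.02],   # near-certain parse, little to gain
    "sentence B": [0.40, 0.35, 0.25],   # flat distribution, high entropy
    "sentence C": [0.70, 0.20, 0.10],
}
print(select_for_annotation(pool, k=1))  # -> ['sentence B']

In an active-learning loop this selection step would alternate with retraining: the parser is retrained on the newly annotated sentences, the remaining pool is re-scored, and the highest-entropy sentences are selected again.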
