Facilitating Treebank Annotation Using a Statistical Parser

Abstract

Corpora of phrase-structure-annotated text, or treebanks, are useful for supervised training of statistical models for natural language processing, as well as for corpus linguistics. Their primary drawback, however, is that they are very time-consuming to produce. To alleviate this problem, the standard approach is to make two passes over the text: first, parse the text automatically; then, correct the parser output by hand. In this paper we explore three questions: How much does an automatic first pass speed up annotation? Does this automatic first pass affect the reliability of the final product? What kind of parser is best suited for such an automatic first pass? We investigate these questions through an experiment to augment the Penn Chinese Treebank using a statistical parser developed by Chiang for English. This experiment differs from previous efforts in two ways: first, we quantify the increase in annotation speed provided by the automatic first pass (70-100%); second, we use a parser developed on one language to augment a corpus in an unrelated language.
