
Comparing Evaluation Metrics for Sentence Boundary Detection


Abstract

In recent NIST evaluations on sentence boundary detection, a single error metric was used to describe performance. Additional metrics, however, are available for such tasks, in which a word stream is partitioned into subunits. This paper compares alternative evaluation metrics, including the NIST error rate, the classification error rate per word boundary, precision and recall, ROC curves, DET curves, precision-recall curves, and the area under these curves, and discusses the advantages and disadvantages of each. Unlike many studies in machine learning, we use real data for a real task. We find benefit in using curves in addition to a single metric. Furthermore, we find that data skew has an impact on the metrics, and that differences among system outputs are more visible in precision-recall curves. These results should help us better understand evaluation metrics and should generalize to similar language processing tasks.
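As a rough illustration of how the metrics named in the abstract relate to one another, the Python sketch below computes a NIST-style error rate, a per-boundary classification error rate, precision and recall at a fixed decision threshold, and threshold-free ROC and precision-recall curves with their areas. The boundary labels, posterior scores, and threshold are hypothetical, and the NIST error rate here follows the commonly used definition (insertions plus deletions divided by the number of reference sentence boundaries); the paper itself is not the source of this code.

import numpy as np
from sklearn.metrics import precision_recall_curve, roc_curve, auc

# Hypothetical data: one binary label per inter-word boundary
# (1 = sentence boundary, 0 = no boundary) and a system posterior score.
ref    = np.array([0, 1, 0, 0, 1, 0, 0, 0, 1, 0])
scores = np.array([0.1, 0.8, 0.3, 0.2, 0.4, 0.1, 0.6, 0.2, 0.9, 0.1])
hyp    = (scores >= 0.5).astype(int)   # hard decisions at an assumed threshold

# NIST-style error rate: (insertions + deletions) / number of reference boundaries.
insertions = int(np.sum((hyp == 1) & (ref == 0)))
deletions  = int(np.sum((hyp == 0) & (ref == 1)))
nist_error = (insertions + deletions) / ref.sum()

# Classification error rate per word boundary: misclassified boundaries / all boundaries.
cer = float(np.mean(hyp != ref))

# Precision and recall on the sentence-boundary class.
tp = int(np.sum((hyp == 1) & (ref == 1)))
precision = tp / max(int(hyp.sum()), 1)
recall = tp / ref.sum()

# Threshold-free views: ROC and precision-recall curves, with area under each.
fpr, tpr, _ = roc_curve(ref, scores)
p, r, _ = precision_recall_curve(ref, scores)
roc_auc = auc(fpr, tpr)
pr_auc = auc(r, p)

print(f"NIST error rate: {nist_error:.3f}")
print(f"Per-boundary classification error rate: {cer:.3f}")
print(f"Precision: {precision:.3f}  Recall: {recall:.3f}")
print(f"ROC AUC: {roc_auc:.3f}  PR AUC: {pr_auc:.3f}")

The single-number metrics above depend on the chosen threshold and on class skew (true sentence boundaries are far rarer than non-boundaries), whereas the ROC and precision-recall curves characterize a system across all thresholds, which is consistent with the abstract's observation that curves add value beyond a single metric and that differences between systems show up most clearly in precision-recall curves.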
