Venue: Annual Meeting of the Association for Computational Linguistics (ACL 2012)

Labeling Documents with Timestamps: Learning from their Time Expressions

Abstract

Temporal reasoners for document understanding typically assume that a document's creation date is known. Algorithms to ground relative time expressions and order events often rely on this timestamp to assist the learner. Unfortunately, the timestamp is not always known, particularly on the Web. This paper addresses the task of automatic document timestamping, presenting two new models that incorporate rich linguistic features about time. The first is a discriminative classifier with new features extracted from the text's time expressions (e.g., 'since 1999'). This model alone improves on previous generative models by 77%. The second model learns probabilistic constraints between time expressions and the unknown document time. Imposing these learned constraints on the discriminative model further improves its accuracy. Finally, we present a new experiment design that facilitates easier comparison by future work.
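
To make the two ideas in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes that time-expression features can be approximated by pairing each four-digit year with the word before it (e.g., `since_1999`), and it replaces the paper's learned probabilistic constraints with a single hard rule: a document cannot predate a year it mentions after "since". The names `YEAR_PATTERN`, `time_expression_features`, and `plausible_years` are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Assumption: a time expression is an optional preceding word plus a
# four-digit year between 1800 and 2099. Real systems use a full
# time-expression tagger instead of a regular expression.
YEAR_PATTERN = re.compile(r"(?:\b([A-Za-z]+)\s+)?\b(1[89]\d{2}|20\d{2})\b")


def time_expression_features(text):
    """Count bag-of-features entries such as 'year_1999' and 'since_1999'."""
    features = Counter()
    for match in YEAR_PATTERN.finditer(text):
        word, year = match.group(1), match.group(2)
        features[f"year_{year}"] += 1
        if word:
            features[f"{word.lower()}_{year}"] += 1
    return features


def plausible_years(text, candidates):
    """Keep candidate document years that are not earlier than any
    'since YEAR' mention (a hard-rule stand-in for learned constraints)."""
    since_years = [
        int(year)
        for word, year in YEAR_PATTERN.findall(text)
        if word.lower() == "since"
    ]
    if not since_years:
        return list(candidates)
    floor = max(since_years)
    return [year for year in candidates if year >= floor]


if __name__ == "__main__":
    sample = "The company has grown since 1999 and expanded in 2003."
    print(time_expression_features(sample))
    print(plausible_years(sample, range(1997, 2006)))
```

In a fuller pipeline the feature counts would feed a discriminative classifier over candidate years, and the constraint step would rescale the classifier's scores rather than discard candidates outright.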
