A Two-stage Sieve Approach for Quote Attribution


Abstract

We present a deterministic sieve-based system for attributing quotations in literary text and a new dataset: QuoteLi3. Quote attribution, determining who said what in a given text, is important for tasks like creating dialogue systems, and in newer areas like computational literary studies, where it creates opportunities to analyze novels at scale rather than only a few at a time. We release QuoteLi3, which contains more than 6,000 annotations linking quotes to speaker mentions and quotes to speaker entities, and introduce a new algorithm for quote attribution. Our two-stage algorithm first links quotes to mentions, then mentions to entities. Using two stages encapsulates difficult sub-problems and improves system performance. The modular design allows us to tune either for overall performance or for the high precision appropriate for many use cases. Our system achieves an average F-score of 87.5% across three novels, outperforming previous systems, and can be tuned for precision of 90.4% at a recall of 65.1%.
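The two-stage idea in the abstract (quotes to mentions, then mentions to entities) can be sketched as follows. This is only an illustrative toy under stated assumptions, not the authors' system: the two sieves (`pattern_sieve`, `nearest_mention_sieve`), the speech-verb list, the alias table, and the sample text are all hypothetical, and mention detection is reduced to exact alias matching.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

@dataclass
class Span:
    text: str
    start: int  # offset of the first character
    end: int    # offset just past the last character

# Hypothetical speech verbs; a real system would use a much larger list.
SPEECH_VERBS = r"(?:said|replied|asked|cried)"

def find_quotes(doc: str) -> List[Span]:
    # Assumes straight double quotes that are always balanced.
    return [Span(m.group(0), m.start(), m.end())
            for m in re.finditer(r'"[^"]*"', doc)]

def find_mentions(doc: str, aliases: Dict[str, str]) -> List[Span]:
    # Toy mention detection: exact matches of known aliases.
    found = [Span(alias, m.start(), m.end())
             for alias in aliases
             for m in re.finditer(r"\b" + re.escape(alias) + r"\b", doc)]
    return sorted(found, key=lambda s: s.start)

def pattern_sieve(doc: str, quote: Span, mentions: List[Span]) -> Optional[Span]:
    # High-precision sieve: '"..." said X' or '"..." X said'.
    for m in mentions:
        if m.start < quote.end:
            continue  # only consider mentions that follow the quote
        between = doc[quote.end:m.start]
        if re.fullmatch(r"\s*" + SPEECH_VERBS + r"\s+", between):
            return m
        if re.fullmatch(r"\s*", between) and re.match(
                r"\s+" + SPEECH_VERBS + r"\b", doc[m.end:]):
            return m
    return None

def nearest_mention_sieve(doc: str, quote: Span, mentions: List[Span]) -> Optional[Span]:
    # Lower-precision fallback: closest mention outside the quote.
    outside = [m for m in mentions
               if m.end <= quote.start or m.start >= quote.end]
    if not outside:
        return None
    def distance(m: Span) -> int:
        return quote.start - m.end if m.end <= quote.start else m.start - quote.end
    return min(outside, key=distance)

Sieve = Callable[[str, Span, List[Span]], Optional[Span]]

def attribute(doc: str, aliases: Dict[str, str],
              sieves: List[Sieve]) -> List[Tuple[str, Optional[str]]]:
    mentions = find_mentions(doc, aliases)
    results = []
    for quote in find_quotes(doc):
        # Stage 1: quote -> mention. Sieves run in order; the first hit wins.
        mention = None
        for sieve in sieves:
            mention = sieve(doc, quote, mentions)
            if mention is not None:
                break
        # Stage 2: mention -> entity, here a plain alias lookup.
        entity = aliases[mention.text] if mention else None
        results.append((quote.text, entity))
    return results

DOC = '"I am tired," said Elizabeth. Darcy nodded. "So am I."'
ALIASES = {"Elizabeth": "Elizabeth Bennet", "Lizzy": "Elizabeth Bennet",
           "Darcy": "Fitzwilliam Darcy"}

print(attribute(DOC, ALIASES, [pattern_sieve, nearest_mention_sieve]))
print(attribute(DOC, ALIASES, [pattern_sieve]))  # precision-oriented setting
```

With both sieves, the second quote is attributed to the nearest mention (Darcy). With only `pattern_sieve`, it is left unattributed. In this toy, that is how dropping the less reliable sieves gives up recall in exchange for precision.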
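To put the precision-tuned operating point in context, the short calculation below combines the reported precision and recall into one number. It assumes the balanced F1 definition (the harmonic mean of precision and recall). The abstract does not say which F-score variant or averaging scheme lies behind the 87.5% figure, so the two values are not directly comparable.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Precision-tuned setting reported in the abstract: P = 90.4%, R = 65.1%.
print(round(f_score(0.904, 0.651), 3))  # 0.757
```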
