Conference: ACM Conference on Information and Knowledge Management

Online Annotation of Text Streams With Structured Entities

Abstract

We propose a framework and algorithm for annotating unbounded text streams with entities from a structured database. The algorithm allows one to correlate unstructured and dirty text streams from sources such as emails, chats, and blogs with entities stored in structured databases. In contrast to previous work on entity extraction, our emphasis is on performing entity annotation in a completely online fashion. The algorithm continuously extracts important phrases and assigns to them the top-k relevant entities. It does so with a guarantee of constant time and space complexity for each additional word in the text stream, so infinite text streams can be annotated. Our framework allows the online annotation algorithm to adapt to a changing stream rate by self-adjusting multiple run-time parameters, reducing the quality of annotation for fast streams and improving it for slow streams. The framework also allows the online annotation algorithm to incorporate query feedback to learn user preferences and personalize the annotation for individual users.
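
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of bounded per-word work, not the authors' method. It assumes a hypothetical in-memory phrase index (`ENTITY_INDEX`), a fixed maximum phrase length, and short candidate lists per phrase; under those assumptions each incoming word triggers at most `max_phrase_len` lookups and the memory held is a fixed-size window. All names and scores are illustrative.

```python
from collections import deque
import heapq

# Hypothetical toy "database": phrase -> [(entity_id, relevance_score)].
# Names and scores are made up for illustration only.
ENTITY_INDEX = {
    "new york": [("city:NYC", 0.9), ("state:NY", 0.6)],
    "york": [("city:York_UK", 0.5)],
    "new york times": [("org:NYT", 0.95)],
}

def annotate_stream(tokens, index, max_phrase_len=3, k=2):
    """Yield (position, phrase, top-k entities) for phrases ending at each token.

    Assumption: candidate lists in `index` are short, so the work per token
    is bounded by max_phrase_len lookups plus a small top-k selection.
    """
    window = deque(maxlen=max_phrase_len)  # bounded memory: last few tokens only
    for pos, token in enumerate(tokens):
        window.append(token.lower())
        words = list(window)
        # At most max_phrase_len candidate phrases end at the current token.
        for start in range(len(words)):
            phrase = " ".join(words[start:])
            candidates = index.get(phrase)
            if candidates:
                top = heapq.nlargest(k, candidates, key=lambda c: c[1])
                yield pos, phrase, top

# The input can be any iterator, including one that never ends.
annotations = list(annotate_stream("I read the New York Times".split(), ENTITY_INDEX))
```

In a sketch like this, `max_phrase_len` and `k` are the kind of run-time parameters that could be lowered for a fast stream and raised for a slow one, though the abstract does not say which parameters the framework actually adjusts.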
