Conference: IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference

Research on Open Domain Named Entity Recognition Based on Chinese Query Logs


Abstract

Search engine query logs contain large numbers of named entities. Named-entity extraction is a basic task in information extraction, but traditional methods can extract only specific categories of entities, and this limitation makes them difficult to apply directly to named-entity recognition in query logs. This paper proposes a novel approach to extracting named entities from user query logs. To avoid depending on a large-scale annotated corpus, we annotate the data automatically using distant supervision, which removes the need for human annotation of the training data. Open-domain named entities are then extracted from user query logs with a conditional random field model. Evaluation on user query logs shows that the approach is effective at extracting named entities in the open domain.
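
The abstract does not say which knowledge source or tag scheme the distant supervision step uses, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical gazetteer of known entity names, a single open-domain tag `ENT`, BIO labels, and pre-tokenized queries; the function name `distant_label` and the parameter `max_len` are illustrative. Sequences labeled this way could then serve as training data for a conditional random field tagger.

```python
from typing import List, Set, Tuple


def distant_label(
    tokens: List[str],
    gazetteer: Set[Tuple[str, ...]],
    max_len: int = 4,
) -> List[str]:
    """Assign BIO labels by greedy longest match against a gazetteer.

    Assumption: any token span found in the gazetteer is an entity.
    Everything else is labeled "O". No human annotation is involved.
    """
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in gazetteer:
                labels[i] = "B-ENT"
                for j in range(i + 1, i + n):
                    labels[j] = "I-ENT"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical query and gazetteer, for illustration only.
query = ["download", "angry", "birds", "free"]
known_entities = {("angry", "birds")}
print(distant_label(query, known_entities))
```

Greedy longest match is one simple choice; it produces noisy labels when a gazetteer entry is ambiguous in context, which is the usual trade-off of distant supervision against manual annotation.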
