Automatic prediction of news intent for search queries: An exploration of contextual and temporal features

Abstract

Purpose - The purpose of this paper is to predict news intent by exploring contextual and temporal features mined directly from a general search engine query log.

Design/methodology/approach - First, a ground-truth data set with correctly marked news and non-news queries was built. Second, a detailed analysis of the search goals and topic distribution of news/non-news queries was conducted. Third, three news features were obtained: the relationship between entity and contextual words extended from query sessions, the topical similarity among clicked results, and the temporal burst point. Finally, to understand the utility of the new features and prior features, extensive prediction experiments were conducted on SogouQ (a Chinese search engine query log).

Findings - News intent can be predicted with high accuracy by using the proposed contextual and temporal features; the macro-average F1 of classification is around 0.8677. Contextual features are more effective than temporal features. All three new features are useful and significant in improving the accuracy of news intent prediction.

Originality/value - This paper provides a new and different perspective on recognizing queries with news intent without the use of such large corpora as social media (e.g. Wikipedia, Twitter and blogs) and news data sets. The research will be helpful for general-purpose search engines in addressing search intents for news events. In addition, the authors believe that the approaches described in this paper are general enough to apply to other verticals with dynamic content and interest, such as blog or financial data.
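The macro-average F1 reported in the findings is the unweighted mean of the per-class F1 scores, so the news and non-news classes count equally regardless of how many queries each contains. The sketch below shows that computation only; the function name `macro_f1` and the toy labels are assumptions made for illustration and are not taken from the paper or from SogouQ.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only.
gold = ["news", "news", "news", "other", "other", "other", "other", "other"]
pred = ["news", "news", "other", "other", "other", "other", "other", "news"]
print(round(macro_f1(gold, pred), 4))
```

Because each class contributes equally, a classifier that ignores the smaller class is penalized more heavily under this measure than under plain accuracy.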

Bibliographic details

  • Source
    The Electronic Library | 2018, Issue 5 | pp. 938-958 | 21 pages
  • Author affiliations

    Department of Computer and Information Science, Southwest University, Chongqing, China;

    Department of Information Science, University of Pittsburgh, Pittsburgh, PA, USA;

    Department of Information Management, Wuhan University, Hubei, China;

  • Indexing information
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification
  • Keywords

    Query classification; News intent; News queries; Query intent;

  • Date indexed: 2022-08-18 04:10:57