Conference: International Conference on Computational Linguistics

Identification of Implicit Topics in Twitter Data Not Containing Explicit Search Queries

Abstract

This study aims to retrieve tweets with an implicit topic, which the query-matching system currently employed by Twitter cannot identify. Such tweets are relevant to a given query but do not explicitly contain the query term. When such a tweet is combined with a relevant tweet that contains the overt keyword, the "serialized" tweets can be integrated into the same discourse context. To this end, features such as reply relation, authorship, temporal proximity, continuation markers, and discourse markers were used to build models for detecting serialization. In our experiments, each of the proposed serialization methods achieves a higher mean average precision than baselines such as the query-matching model and the tf-idf weighting model, which indicates that considering an individual tweet within its discourse context helps in judging its relevance to a given topic.
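
The five serialization cues and the evaluation metric named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Tweet` fields, the marker lists, the 10-minute window, and the function names are all hypothetical choices made for illustration, since the abstract does not specify them. The average precision helper assumes each ranked list contains every relevant tweet for its query.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical marker lists; the abstract does not enumerate the real ones.
CONTINUATION_MARKERS = ("...", "(cont)")
DISCOURSE_MARKERS = ("but", "so", "and", "also", "anyway")


@dataclass
class Tweet:
    tweet_id: int
    author: str
    text: str
    created_at: datetime
    in_reply_to: Optional[int] = None  # id of the tweet being replied to, if any


def serialization_features(anchor: Tweet, candidate: Tweet,
                           window: timedelta = timedelta(minutes=10)) -> dict:
    """Binary cues that `candidate` belongs to the same discourse as `anchor`.

    `anchor` is a tweet that contains the overt query term; `candidate` does not.
    The 10-minute window is an assumed value, not one taken from the paper.
    """
    text = candidate.text.lower()
    words = text.split()
    first_word = words[0].strip(",.!?:;") if words else ""
    gap = abs(candidate.created_at - anchor.created_at)
    return {
        "reply_relation": candidate.in_reply_to == anchor.tweet_id,
        "same_author": candidate.author == anchor.author,
        "temporal_proximity": gap <= window,
        "continuation_marker": any(m in text for m in CONTINUATION_MARKERS),
        "discourse_marker": first_word in DISCOURSE_MARKERS,
    }


def average_precision(ranked_relevance: list) -> float:
    """Average precision of one ranked list of True/False relevance labels."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def mean_average_precision(runs: list) -> float:
    """Mean of per-query average precision over all queries."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0
```

A model for detecting serialization could take these feature dictionaries as input, and competing rankings could then be compared with `mean_average_precision`. How the paper actually combines the features is not stated in the abstract.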
