Conference: International Conference on Database and Expert Systems Applications

A Linguistic Graph-Based Approach for Web News Sentence Searching

Abstract

With an ever-increasing amount of news published every day, the ability to search this information effectively is of primary interest to many Web ventures. Because word-based approaches ignore much of the information in texts, we present Destiny, a linguistic approach in which news item sentences are represented as graphs, with disambiguated words as nodes and grammatical relations between words as edges. Searching then resembles finding an approximate sub-graph isomorphism between the query sentence graph and the graphs representing the news item sentences, exploiting word synonymy, word hypernymy, and sentence grammar. Using a custom corpus of user-rated queries and sentences, the search algorithm is evaluated on Mean Average Precision, Spearman's Rho, and normalized Discounted Cumulative Gain. Compared to the TF-IDF baseline, the Destiny algorithm performs significantly better on these metrics.
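
The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names are hypothetical, and the formula variants are assumptions, since the abstract does not specify them. Average precision is normalized by the number of relevant items found in the ranked list, DCG uses a linear gain with a log2(rank + 1) discount, and Spearman's Rho uses the closed form that is valid only for rankings without ties.

```python
import math


def average_precision(relevant_flags):
    """AP for one ranked result list.

    relevant_flags[i] is True if the result at rank i+1 is relevant.
    Assumption: every relevant item appears somewhere in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant_flags, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in runs) / len(runs)


def dcg(gains):
    """Discounted cumulative gain with linear gain (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def spearman_rho(ranks_a, ranks_b):
    """Spearman's Rho for two rankings of the same items, assuming no ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

In this sketch, `gains` would hold the user ratings of the returned sentences in the order the search algorithm ranked them, and `ranks_a` and `ranks_b` would hold the algorithm's and the users' rank positions for the same sentences.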