
Measuring the interestingness of articles in a limited user environment.


Abstract

Search engines, such as Google, assign scores to news articles based on their relevancy to a query. However, not every article that is relevant to the query is interesting to a user. For example, an article that is old or yields little new information would be uninteresting. Relevancy scores do not take into account what makes an article interesting, which varies from user to user. Although methods such as collaborative filtering have been shown to be effective in recommendation systems, a limited user environment does not have enough users to make collaborative filtering effective.

A general framework, called iScore, is presented for defining and measuring the "interestingness" of articles, incorporating user feedback. iScore addresses various aspects of what makes an article interesting, such as topic relevancy, uniqueness, freshness, source reputation, and writing style. It employs various methods to measure these features and uses a classifier operating on these features to recommend articles. The basic iScore configuration is shown to improve recommendation results by as much as 20%. Beyond the basic iScore features, additional features are presented to address the deficiencies of existing feature extractors: one that tracks multiple topics, called MTT, and a version of the Rocchio algorithm that learns its parameters online as it processes documents, called eRocchio. The inclusion of MTT and eRocchio in iScore is shown to improve iScore recommendation results by as much as 3.1% and 5.6%, respectively. Additionally, in the TREC11 Adaptive Filter Task, eRocchio is shown to be 10% better than the best filter in the last run of the task.

In addition to these two major topic relevancy measures, other features are introduced that employ language models, phrases, clustering, and changes in topics to improve recommendation results. These additional features are shown to improve iScore recommendation results by up to 14%. Because users differ in their reasons for finding an article interesting, an online feature selection method in naive Bayes is also introduced. Online feature selection can improve recommendation results in iScore by up to 18.9%.

In summary, iScore in its best configuration can outperform traditional IR techniques by as much as 50.7%. iScore and its components are evaluated in the news recommendation task using three datasets from Yahoo! News, actual users, and Digg. iScore and its components are also evaluated in the TREC Adaptive Filter task using the Reuters RCV1 corpus.
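
For readers unfamiliar with the Rocchio algorithm mentioned in the abstract, the following is a minimal sketch of the classic, fixed-parameter form used as a topic relevancy scorer. It is not eRocchio: the abstract does not describe how eRocchio learns its parameters online, so that part is not reproduced here. The class name `RocchioProfile`, the whitespace tokenizer, the default weights (1.0, 0.75, 0.15), and the clipping of negative term weights are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector from lowercased whitespace tokens (assumed tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class RocchioProfile:
    """Classic Rocchio profile with fixed weights (hypothetical helper, not eRocchio)."""

    def __init__(self, alpha=1.0, beta=0.75, gamma=0.15):
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.query = Counter()
        self.pos_sum, self.neg_sum = Counter(), Counter()
        self.n_pos = self.n_neg = 0

    def set_query(self, text):
        self.query = tf_vector(text)

    def update(self, text, interesting):
        """Fold one piece of user feedback into the running sums."""
        if interesting:
            self.pos_sum.update(tf_vector(text))
            self.n_pos += 1
        else:
            self.neg_sum.update(tf_vector(text))
            self.n_neg += 1

    def profile(self):
        """alpha * query + beta * mean(positives) - gamma * mean(negatives)."""
        p = {t: self.alpha * w for t, w in self.query.items()}
        if self.n_pos:
            for t, w in self.pos_sum.items():
                p[t] = p.get(t, 0.0) + self.beta * w / self.n_pos
        if self.n_neg:
            for t, w in self.neg_sum.items():
                p[t] = p.get(t, 0.0) - self.gamma * w / self.n_neg
        # Drop non-positive weights (an assumed, commonly used simplification).
        return {t: w for t, w in p.items() if w > 0}

    def score(self, text):
        """Topic relevancy of an article to the current profile."""
        return cosine(self.profile(), tf_vector(text))


profile = RocchioProfile()
profile.set_query("python programming")
profile.update("python programming language tutorial", interesting=True)
profile.update("python snake habitat", interesting=False)
on_topic = profile.score("python programming tutorial")
off_topic = profile.score("snake habitat zoo")
```

In a framework like the one described, a score of this kind would be only one feature among several (uniqueness, freshness, source reputation, and so on) passed to the classifier.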
