Journal: Information Retrieval

Mining unstructured content for recommender systems: an ensemble approach



Abstract

Recommending textual documents requires indexing mechanisms that extract structured metadata for attribute-aware recommender systems. Applying a variety of text mining algorithms has the advantage of capturing different aspects of unstructured content, resulting in richer descriptions. However, it is difficult to integrate these algorithms into a single model so that the descriptions they produce can efficiently improve recommendation accuracy. This article proposes a generic model based on ensemble learning that combines simple text mining methods in a post-processing approach. After each text mining technique is executed, each set of metadata of a particular type is supplied to the recommender module, which generates attribute-specific rankings. The resulting recommendations are then combined into a final personalized ranking for the user. We evaluated our ensemble technique with two attribute-aware collaborative recommenders (k-Nearest Neighbors and BPR-Mapping), and we demonstrate its generality by comparing different types of ensembles. We used two datasets from different domains. The first is from the Brazilian Embrapa Agency of Technology Information website, whose documents are written in Portuguese. The second is HetRec MovieLens 2k, published by the GroupLens Research Group, whose movie storylines are written in English. The experiments show that, particularly for the k-NN recommender, better accuracy can be obtained when multiple metadata types are combined. The proposed approach is extensible and can accommodate new indexing and recommendation techniques.
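
The post-processing step described in the abstract, where attribute-specific rankings are merged into one final ranking, can be illustrated with a minimal sketch. The abstract does not state which combination rule the authors use, so the weighted Borda-style count below is an assumption chosen only for illustration. The function name `ensemble_rankings`, the metadata type names, and the document ids are all hypothetical.

```python
from collections import defaultdict


def ensemble_rankings(rankings, weights=None, top_n=5):
    """Merge attribute-specific rankings into one ranked list.

    ASSUMPTION: a weighted Borda-style count is used here for illustration;
    the abstract does not specify the authors' actual combination rule.

    rankings: dict mapping a metadata type to a best-first list of item ids.
    weights:  optional dict mapping a metadata type to a non-negative weight.
    """
    if weights is None:
        weights = {name: 1.0 for name in rankings}

    scores = defaultdict(float)
    for name, ranked_items in rankings.items():
        n = len(ranked_items)
        for position, item in enumerate(ranked_items):
            # The top item of a list of length n earns n points, the last earns 1.
            scores[item] += weights[name] * (n - position)

    # Sort by descending score; break ties by item id so output is deterministic.
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return ordered[:top_n]


# Hypothetical rankings produced by one recommender run per metadata type.
rankings = {
    "terms": ["d1", "d2", "d3"],
    "entities": ["d2", "d1", "d4"],
    "topics": ["d2", "d3", "d1"],
}

final = ensemble_rankings(rankings, top_n=3)
print(final)
```

Because the merge happens after each recommender run, a new text mining technique only adds one more entry to `rankings`; the recommender itself does not change, which is the kind of extensibility the abstract claims.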
