Mining unstructured content for recommender systems: an ensemble approach

Manzato Marcelo G.; Domingues Marcos A.; Fortes Arthur C.; Sundermann Camila V.; D'Addio Rafael M.; Conrado Merley S.; Rezende Solange O.; Pimentel Maria G. C.


Abstract

Recommending textual documents requires indexing mechanisms that extract structured metadata for attribute-aware recommender systems. Applying a variety of text mining algorithms has the advantage of capturing different aspects of unstructured content, resulting in richer descriptions. However, it is difficult to integrate them into a single model so that these descriptions can efficiently improve recommendation accuracy. This article proposes a generic model based on ensemble learning that combines simple text mining methods in a post-processing approach. After each text mining technique is executed, each set of metadata of a particular type is fed to the recommender module, which generates attribute-specific rankings. The resulting recommendations are then ensembled to generate a final personalized ranking for the user. We evaluated our ensemble technique with two attribute-aware collaborative recommenders (k-Nearest Neighbors and BPR-Mapping), and we demonstrate its generality by comparing different types of ensembles. We used two datasets from different domains: the first is from the Brazilian Embrapa Agency of Technology Information website, whose documents are written in Portuguese, and the second is HetRec MovieLens 2k, published by the GroupLens Research Group, whose movie storylines are written in English. The experiments show that, particularly for the k-NN recommender, better accuracy can be obtained when multiple metadata types are combined. The proposed approach is extensible and flexible with respect to new indexing and recommendation techniques.
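The post-processing idea in the abstract (one ranking per metadata type, then a combination step) can be illustrated with a minimal sketch. The abstract does not say which combination strategies the article uses, so the Borda-style count below is a stand-in chosen for illustration only. The function name `ensemble_rankings`, the metadata type names, and the document identifiers are all hypothetical.

```python
from collections import defaultdict


def ensemble_rankings(rankings, top_n=5):
    """Merge attribute-specific rankings into one final ranking.

    Assumption: a Borda-style count is used as the combination rule.
    The article may use different ensemble strategies.
    """
    scores = defaultdict(float)
    for ranked_items in rankings.values():
        n = len(ranked_items)
        for position, item in enumerate(ranked_items):
            # The top item of a list of length n earns n points, the last earns 1.
            scores[item] += n - position
    # Sort by score (descending), then by item id so ties resolve deterministically.
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return ordered[:top_n]


# Hypothetical rankings for one user, one per metadata type.
rankings = {
    "terms": ["doc3", "doc1", "doc2"],
    "named_entities": ["doc1", "doc3", "doc4"],
    "topics": ["doc1", "doc2", "doc3"],
}
final_ranking = ensemble_rankings(rankings, top_n=3)
print(final_ranking)
```

In this sketch each attribute-specific ranking is assumed to come from the same recommender run on a different metadata type, so supporting a new text mining technique only means adding one more entry to `rankings`.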


Bibliographic record

Source
Information Retrieval, 2016, No. 4, pp. 378-415 (38 pages)
Authors
Manzato Marcelo G.; Domingues Marcos A.; Fortes Arthur C.; Sundermann Camila V.; D'Addio Rafael M.; Conrado Merley S.; Rezende Solange O.; Pimentel Maria G. C.;
Author affiliations

Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil;

Univ Estadual Maringa, Dept Informat, Maringa, Parana, Brazil;

Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil;

Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil;

Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil;

Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil;

Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil;

Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil;

Original format: PDF
Language: English
Keywords
Recommender systems; Ensemble learning; Personalized ranking; Metadata awareness; Unstructured content;


Similar literature

1. Semantic Web mining for Content-Based Online Shopping Recommender Systems [J] . International Journal of Intelligent Information Technologies . 2019, No. 4

2. Web usage mining based recommender systems using implicit heterogeneous data: A Particle Swarm Optimization based clustering approach [J] . Shafiq Alam, Gillian Dobbie, Yun Sing Koh, Web Intelligence and Agent Systems . 2014, No. 4

3. Data quality in recommender systems: the impact of completeness of item content data on prediction accuracy of recommender systems [J] . Heinrich Bernd, Hopf Marcus, Lohninger Daniel, Electronic Markets . 2021, No. 2

4. A System for Unstructured Data Mining using Dynamic Ensemble Selection [C] . Raquel Bezerra Calado, Leandro Sigfredo Rodriguez Torres, Alexandre M. A. Maciel, IEEE International Conference on Systems, Man, and Cybernetics . 2020

5. Mining for meaning: The use of unstructured textual data in information systems research. [D] . Triplett, Janea Lynne. 2012

6. The 3 Cs of Content Context and Concepts: A Practical Approach to Recording Unstructured Field Observations [O] . Michael D. Fetters, Ellen B. Rubinstein 2019

7. Semantic Web mining for Content-Based Online Shopping Recommender Systems [O] . Ibukun Tolulope Afolabi, Opeyemi Samuel Makinde, Olufunke Oyejoke Oladipupo 2019
