The crowd can be a remarkable source of information. This is particularly true of product reviews, which customers freely provide through specialized web sites. Such reviews are a form of social knowledge that other customers can exploit. The Hints From the Crowd (HFC) prototype, presented in this paper, is a NoSQL database system for large collections of product reviews. The database is queried with a natural language sentence, and the result is a list of products ranked by the relevance of their reviews with respect to that sentence. The top-ranked products in the result list can be seen as the best hints for the user based on crowd opinions (the reviews). In this paper, we mainly describe the query engine, and we show that the prototype achieves good execution times, demonstrating that the approach is feasible. Performance is evaluated on the IMDb dataset, which includes more than 2 million reviews for more than 100,000 movies.
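The query model described above (a natural language sentence in, a list of products ranked by review relevance out) can be illustrated with a minimal sketch. The abstract does not specify how HFC measures relevance, so the term-overlap score used here is purely an assumption, and the names `tokenize` and `rank_products` are hypothetical, not part of the prototype.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_products(reviews, query):
    """Rank products by how well their reviews match the query.

    reviews: iterable of (product_name, review_text) pairs.
    query:   a natural language sentence.

    Assumed scoring (NOT the HFC measure): each review contributes the
    fraction of query terms it contains; a product's score is the sum
    over its reviews. Products with a zero score are omitted.
    """
    query_terms = tokenize(query)
    if not query_terms:
        return []

    scores = defaultdict(float)
    for product, text in reviews:
        overlap = len(query_terms & tokenize(text))
        scores[product] += overlap / len(query_terms)

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(product, score) for product, score in ranked if score > 0]
```

Calling `rank_products(reviews, "space adventure")` on a list of `(movie, review)` pairs returns the movies whose reviews mention those terms, best match first. A real engine over millions of reviews would rely on an index rather than a linear scan; the sketch only shows the shape of the input and output.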