Query-oriented Clustering: A Multi-objective Approach

Abstract

Document clustering techniques have been widely applied in Information Retrieval to reorganize the results returned in response to a user's query. According to the Cluster Hypothesis, which states that relevant documents tend to be more similar to one another than to non-relevant ones, most relevant documents are likely to be gathered in a single cluster. Systems that organize search results as a set of clusters usually treat this tendency as an advantage, since it allows the results of the initial search to be filtered. Adopting a different point of view, we instead consider the Cluster Hypothesis a hindrance to information access, since it prevents the various aspects of the query from emerging. The resulting risk is that the user's perception of the subject is restricted to a single point of view. We therefore propose to distribute the relevant documents over several clusters by orienting the organization of the clusters according to the user's topic. The aim is to draw the clusters toward the query in order to highlight the thematic differences between documents that are strongly connected to it. Rather than modifying the inter-document similarity computation, as is the case in several studies, we propose to act directly on the organization of the clusters by using a multi-objective evolutionary clustering algorithm which, besides the classical internal cohesion, also optimizes the query proximity of the clusters. First experimental results highlight the benefit that may be gained from this way of taking the query into account.
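
The abstract names two objectives, internal cohesion and query proximity, without giving their formulas. The following Python sketch shows one way such a pair of objectives could be scored and compared by Pareto dominance. It is an illustration under stated assumptions, not the authors' algorithm: documents and the query are assumed to be dense vectors, cosine similarity is an assumed measure, the function names (`objectives`, `dominates`, `pareto_front`) are hypothetical, and the evolutionary search that would generate candidate partitions is left out.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def objectives(docs: List[Vector], labels: List[int],
               query: Vector) -> Tuple[float, float]:
    """Score one partition as (cohesion, query proximity); both maximized.

    Assumed definitions (not taken from the paper):
    cohesion  = mean similarity of each document to its cluster centroid
    proximity = mean similarity of each cluster centroid to the query
    """
    clusters: Dict[int, List[Vector]] = {}
    for doc, label in zip(docs, labels):
        clusters.setdefault(label, []).append(doc)
    centroids = {label: centroid(members) for label, members in clusters.items()}
    cohesion = sum(cosine(doc, centroids[label])
                   for doc, label in zip(docs, labels)) / len(docs)
    proximity = sum(cosine(c, query) for c in centroids.values()) / len(centroids)
    return cohesion, proximity


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if score a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates: List[List[int]], docs: List[Vector],
                 query: Vector) -> List[List[int]]:
    """Keep the candidate partitions that no other candidate dominates."""
    scored = [(labels, objectives(docs, labels, query)) for labels in candidates]
    return [labels for labels, score in scored
            if not any(dominates(other, score) for _, other in scored)]


if __name__ == "__main__":
    # Toy data: four 2-D "documents" and a query aligned with the first axis.
    toy_docs = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
    toy_query = (1.0, 0.0)
    toy_candidates = [[0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 0, 1]]
    for labels in pareto_front(toy_candidates, toy_docs, toy_query):
        print(labels, objectives(toy_docs, labels, toy_query))
```

In a multi-objective evolutionary setting, a filter of this kind would typically be applied to each generation of candidate partitions, so that the search keeps a set of trade-offs between cohesion and query proximity rather than a single weighted score.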
