Journal: Knowledge-Based Systems

Beyond cluster labeling: Semantic interpretation of clusters' contents using a graph representation

Abstract

Efficient clustering algorithms have been developed to automatically group documents into subgroups (clusters). Once clustering has been performed, an important additional step is to help users make sense of the obtained clusters. Existing methods address this issue by assigning to each cluster a flat list of descriptive terms (labels) that are extracted from the documents, most often using statistical techniques borrowed from the field of feature selection or reduction. A limitation of these unstructured descriptions of clusters' contents is that they do not account for the meaningful relationships between the terms. In contrast, we propose a graph representation, which makes the clusters easier to interpret by putting the descriptive terms in context, and by performing some simple network analysis. Our experiments reveal that the proposed method allows for a deeper level of interpretation, both when the clusters under study are homogeneous and when they are heterogeneous. In addition, evaluation procedures presented in the paper show that the graph-based representation of each cluster, while being very synthetic, still quite faithfully reflects the original content of the cluster.
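To make the idea of a graph-based cluster description more concrete, the following is a minimal sketch, not the authors' actual procedure. It assumes that descriptive terms are ranked by document frequency within the cluster, that two terms are linked when they co-occur in the same document, and that "simple network analysis" can be illustrated by weighted degree. The function names `build_term_graph` and `weighted_degree`, the parameter `top_k`, and the toy documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_term_graph(docs, top_k=5):
    """Build a term co-occurrence graph for ONE cluster (illustrative only).

    docs: list of token lists, all belonging to the same cluster.
    Assumption: terms are ranked by document frequency, a simplistic
    stand-in for the statistical label-selection step in the abstract.
    Returns (top_terms, edges), where edges maps a sorted term pair to
    the number of documents in which both terms appear.
    """
    doc_sets = [set(d) for d in docs]
    df = Counter(t for s in doc_sets for t in s)
    # Rank by descending document frequency; break ties alphabetically.
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    top_terms = [t for t, _ in ranked[:top_k]]
    keep = set(top_terms)

    edges = Counter()
    for s in doc_sets:
        present = sorted(s & keep)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return top_terms, dict(edges)


def weighted_degree(edges):
    """Sum of incident edge weights per term (a basic centrality measure)."""
    deg = Counter()
    for (a, b), w in edges.items():
        deg[a] += w
        deg[b] += w
    return dict(deg)


# Hypothetical toy cluster of three tokenized documents.
cluster_docs = [
    ["graph", "cluster", "label"],
    ["graph", "cluster", "term"],
    ["graph", "term"],
]
terms, edges = build_term_graph(cluster_docs, top_k=3)
degree = weighted_degree(edges)
```

Unlike a flat label list, the edge weights show which descriptive terms tend to appear together, and the degree values hint at which terms are most central to the cluster. A real pipeline would likely use a stronger term-weighting scheme and a dedicated graph library.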