Journal: Knowledge and Data Engineering, IEEE Transactions on

Clustering Sentence-Level Text Using a Novel Fuzzy Relational Clustering Algorithm


Abstract

In comparison with hard clustering methods, in which a pattern belongs to a single cluster, fuzzy clustering algorithms allow patterns to belong to all clusters with differing degrees of membership. This is important in domains such as sentence clustering, since a sentence is likely to be related to more than one theme or topic present within a document or set of documents. However, because most sentence similarity measures do not represent sentences in a common metric space, conventional fuzzy clustering approaches based on prototypes or mixtures of Gaussians are generally not applicable to sentence clustering. This paper presents a novel fuzzy clustering algorithm that operates on relational input data; i.e., data in the form of a square matrix of pairwise similarities between data objects. The algorithm uses a graph representation of the data, and operates in an Expectation-Maximization framework in which the graph centrality of an object in the graph is interpreted as a likelihood. Results of applying the algorithm to sentence clustering tasks demonstrate that the algorithm is capable of identifying overlapping clusters of semantically related sentences, and that it is therefore of potential use in a variety of text mining tasks. We also include results of applying the algorithm to benchmark data sets in several other domains.
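The Expectation-Maximization loop outlined in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say which centrality measure is used, so a PageRank-style score is assumed here, and the function names (`pagerank`, `fuzzy_relational_cluster`), the damping factor, the random initialization, and the toy similarity matrix are all hypothetical choices made for the example.

```python
import numpy as np

def pagerank(W, damping=0.85, iters=100, tol=1e-10):
    """PageRank-style centrality on a non-negative weighted adjacency matrix W.
    Assumption: the abstract only says "graph centrality"; PageRank is one option."""
    n = W.shape[0]
    col_sums = W.sum(axis=0)
    safe = np.where(col_sums > 0, col_sums, 1.0)
    # Column-stochastic transition matrix; columns with no weight become uniform.
    P = np.where(col_sums > 0, W / safe, 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r_new = (1.0 - damping) / n + damping * (P @ r)
        converged = np.abs(r_new - r).sum() < tol
        r = r_new
        if converged:
            break
    return r / r.sum()

def fuzzy_relational_cluster(S, k, n_iter=50, seed=0):
    """Soft clustering from a square pairwise-similarity matrix S (n x n).
    Returns (memberships of shape n x k, mixing coefficients of shape k)."""
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    # Random initial memberships, each row normalized to sum to 1.
    memb = rng.random((n, k))
    memb /= memb.sum(axis=1, keepdims=True)
    mix = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: per cluster, reweight edges by the memberships of both
        # endpoints and treat the resulting centrality as a likelihood.
        like = np.empty((n, k))
        for m in range(k):
            W = S * np.outer(memb[:, m], memb[:, m])
            like[:, m] = pagerank(W)
        post = like * mix
        memb = post / post.sum(axis=1, keepdims=True)
        # M-step: mixing coefficients are the mean memberships.
        mix = memb.mean(axis=0)
    return memb, mix

# Toy similarity matrix (made up for illustration): two loose groups of sentences.
S = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.1],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
])
memberships, mixing = fuzzy_relational_cluster(S, k=2)
print(memberships)
```

Each row of `memberships` sums to 1 and gives one object's degree of membership in every cluster, which is what lets a sentence be associated with more than one theme. Only the similarity matrix is required as input, so no vector representation of the sentences is needed.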
