Journal: Procedia Computer Science

An Enhanced Fuzzy Clustering and Expectation Maximization Framework based Matching Semantically Similar Sentences



Abstract

A statistical framework for finding similar sentences is developed, based on a novel fuzzy clustering algorithm that organizes text from one or more documents into clusters. Traditional fuzzy clustering approaches are not applicable to sentence clustering because most sentence similarity measures do not represent sentences in a common metric space. An enhanced fuzzy clustering algorithm is applied to the sentences of the datasets to group related sentences. The PageRank algorithm highlights the more relevant clusters through the PageRank score of each object. An Expectation-Maximization (EM) framework is developed to predict the overlapping clusters of semantically related sentences. Experiments on a quotations dataset and a news article dataset evaluate the similarity measure for matching semantically similar sentences, and our system outperforms the baseline and projection methods. The proposed method scores 34% higher in similarity scoring of related sentences. Clustering performance is also analyzed in terms of entropy and purity, and the method yields higher purity and lower entropy. Our experimental results demonstrate that the method is capable of identifying overlapping clusters of semantically related sentences and can be used in a variety of text mining tasks.
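The abstract reports clustering quality in terms of purity and entropy without giving the formulas. As a minimal sketch of how these two measures are commonly computed, the Python below assumes a hard assignment (for example, each sentence placed in its highest-membership cluster) and a known class label per sentence. The function name `purity_and_entropy`, the normalization of entropy by the log of the number of classes, and the toy data are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def purity_and_entropy(cluster_ids, class_labels):
    """Return (purity, entropy) for a hard clustering.

    Assumption: each item belongs to exactly one cluster. For fuzzy
    memberships, one option is to assign each item to its
    highest-membership cluster before calling this function.
    """
    n = len(cluster_ids)
    num_classes = len(set(class_labels))

    # Group the true class labels by cluster.
    clusters = {}
    for cid, label in zip(cluster_ids, class_labels):
        clusters.setdefault(cid, []).append(label)

    purity = 0.0
    entropy = 0.0
    for members in clusters.values():
        counts = Counter(members)
        size = len(members)
        # Purity: fraction of all items that fall in their cluster's majority class.
        purity += max(counts.values()) / n
        # Entropy of the class distribution inside this cluster.
        h = -sum((c / size) * math.log(c / size) for c in counts.values())
        if num_classes > 1:
            h /= math.log(num_classes)  # normalize to [0, 1] (assumed convention)
        # Weight each cluster's entropy by its share of the items.
        entropy += (size / n) * h
    return purity, entropy


# Toy data (hypothetical): two clusters, two classes.
cluster_ids = [0, 0, 0, 1, 1, 1]
class_labels = ["a", "a", "b", "b", "b", "b"]
purity, entropy = purity_and_entropy(cluster_ids, class_labels)
```

Under these definitions a better clustering has purity closer to 1 and entropy closer to 0, which matches the direction of improvement the abstract describes.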
