Semantically Document Clustering Using Contextual Similarities

R. Nagaraj; X. Agnise Kalarani

Abstract

Efficient document clustering in a high-dimensional document space can be performed with term-level, sentence-level and concept-level techniques. Most existing techniques suffer from problems such as the two-variable problem, high computational time and low similarity relatedness, which reduce clustering efficiency. To overcome these drawbacks, this paper proposes a hybrid clustering algorithm called the Semantically Document Clustering algorithm. It combines the features of the Directed Ridge Regression (DRR), Fuzzy relational Hierarchical clustering (FHC) and Conceptual clustering methods presented in our previous research. The proposed algorithm uses the semantic weight of terms related to concepts from Wikipedia and WordNet to categorize the texts in the documents. The similarity between sentences is then calculated using the Jiang and Conrath measure, which considers the concept weight and the similarity measure for effective clustering. Directed ridge regression is applied to build a Laplacian matrix, and the diagonal elements of the normalized Laplacian matrix are varied to solve the two-variable problem. Fuzzy hierarchical rules are then employed to classify the rows of the normalized Laplacian matrix into classes, from which the membership of the observations and the center vectors are calculated. In this way, term relatedness, sentence relatedness and concept relatedness can be calculated and the documents can be clustered efficiently. Experimental results show that the proposed hybrid Semantically Document Clustering method provides more accurate document clustering than state-of-the-art clustering methods.
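The Jiang and Conrath measure named in the abstract is commonly defined as a distance over information content (IC): dist(c1, c2) = IC(c1) + IC(c2) - 2 * IC(lcs(c1, c2)), where lcs is the lowest common subsumer of the two concepts, and similarity is taken as the reciprocal of that distance. The following is a minimal sketch of that standard definition only. The toy taxonomy, the probabilities and all function names are hypothetical illustrations, not the authors' implementation, and the sketch does not include the concept weighting, the ridge regression step or the fuzzy clustering step described above.

```python
import math

# Hypothetical toy taxonomy (child -> parent); the root has parent None.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical probabilities p(c) of meeting concept c (or a descendant)
# in a corpus. Real values would come from corpus counts.
PROB = {
    "entity": 1.0,
    "animal": 0.5,
    "vehicle": 0.5,
    "dog": 0.25,
    "cat": 0.25,
    "car": 0.25,
}


def information_content(concept):
    """IC(c) = -log p(c); rarer concepts carry more information."""
    return -math.log(PROB[concept])


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def lowest_common_subsumer(a, b):
    """Return the deepest concept that subsumes both a and b."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    raise ValueError("concepts share no ancestor")


def jcn_distance(a, b):
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(lcs(a, b))."""
    lcs = lowest_common_subsumer(a, b)
    return (information_content(a) + information_content(b)
            - 2 * information_content(lcs))


def jcn_similarity(a, b):
    """Reciprocal of the distance; identical concepts give infinity."""
    distance = jcn_distance(a, b)
    return math.inf if distance == 0 else 1.0 / distance
```

With these toy values, jcn_similarity("dog", "cat") is larger than jcn_similarity("dog", "car"), because "dog" and "cat" share the more specific subsumer "animal". A sentence-level score could then be built by aggregating such concept-level scores, though how the paper does this is not stated in the abstract.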

Bibliographic details

Source: International Journal of Applied Engineering Research | 2016, Issue 1 | 6 pages
Authors: R. Nagaraj; X. Agnise Kalarani
Original format: PDF
Language: English
CLC classification: Basic Engineering Sciences
Keywords: Directed Ridge Regression; Fuzzy relational Hierarchical clustering; Conceptual clustering

Similar literature

1. Semantically Document Clustering Using Contextual Similarities [J]. R. Nagaraj, X. Agnise Kalarani. International Journal of Applied Engineering Research. 2016, Issue 1
2. Survey on Semantic Similarity Based on Document Clustering [J]. Rowaida Khalil Ibrahim, Subhi Rafeeq Mohammed Zeebaree, Karwan Fahmi Sami Jacksi. Advances in Science, Technology and Engineering Systems. 2019, Issue 5
3. Fine-Tuning an Algorithm for Semantic Document Clustering Using a Similarity Graph [J]. Lubomir Stanchev. International Journal of Semantic Computing. 2016, Issue 4
4. Document Clustering Method Using Weighted Semantic Features and Cluster Similarity [C]. Sun Park, Dong Un An, Choi Im Cheon. Third IEEE International Conference on Digital Game and Intelligent Toy Enhanced Learning (DIGITEL 2010). 2010
5. Incorporating semantic and syntactic information into document representation for document clustering [D]. Wang, Yong. 2005
6. Bridging the gap: incorporating a semantic similarity measure for effectively mapping PubMed queries to documents [O]. Sun Kim, Nicolas Fiorini, W. John Wilbur
7. Semantic Document Clustering Using a Similarity Graph [O]. Lubomir Stanchev. 2016
