Using ontology semantics to improve text documents clustering

Abstract

To improve clustering results and selection among those results, ontology semantics are combined with document clustering. A new WordNet-based document clustering algorithm is proposed for the document-processing phase. First, the documents are represented by tf-idf, and every word vector is then extended with new entities. Next, a feature-extraction algorithm is applied to the documents. Finally, an ontology aggregation clustering (OAC) algorithm is proposed to improve the document clustering result. Experiments use the Reuters 20 News Group data set, and the results are compared with those obtained by mutual information (MI). The results indicate that the proposed ontology-based document clustering algorithm outperforms other existing clustering algorithms such as MNB, CLUTO, and co-clustering.
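The pipeline outlined in the abstract (tf-idf representation, extension of each vector with ontology entities, then agglomerative clustering) can be illustrated with a minimal sketch. The abstract does not give the details of OAC or of the feature-extraction step, so this is not the authors' algorithm: the `CONCEPTS` table is a hypothetical stand-in for WordNet lookups, average-link agglomerative clustering stands in for OAC, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical stand-in for WordNet: maps a term to a broader concept.
# A real system would look these up in WordNet; this table is an assumption.
CONCEPTS = {
    "car": "vehicle", "truck": "vehicle",
    "apple": "fruit", "banana": "fruit",
}

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed idf keeps every weight strictly positive.
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors

def extend_with_concepts(vec, concepts=CONCEPTS):
    """Add one 'concept:<name>' entry per concept found in the vector."""
    extended = dict(vec)
    for term, weight in vec.items():
        concept = concepts.get(term)
        if concept is not None:
            key = "concept:" + concept
            extended[key] = extended.get(key, 0.0) + weight
    return extended

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def agglomerate(vectors, k):
    """Average-link agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(vectors))]

    def link(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the pair of clusters with the highest average similarity.
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        a, b = max(pairs, key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

docs = [["car", "road"], ["truck", "highway"],
        ["apple", "juice"], ["banana", "smoothie"]]
plain = tfidf(docs)
extended = [extend_with_concepts(v) for v in plain]
clusters = agglomerate(extended, 2)
print(clusters)  # [[0, 1], [2, 3]]
```

In this toy corpus no two documents share a term, so plain tf-idf vectors have zero cosine similarity; only the added concept entries let the two vehicle documents and the two fruit documents group together.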

Bibliographic details

Source
Annual Workshop on Semantic Web and Ontology (SWON2006) | 2006 | 4 pages
Authors
Luo Na; Zuo Wanli; Yuan Fuyu; Zhang Jingbo; Zhang Huijie
Original format: PDF
Chinese Library Classification: Computer networks
Keywords
ontology; text clustering; lexicon; WordNet

Similar literature

1. Using ontology semantics to improve text clustering [J]. Luo Na, Zuo Wanli, Yuan Fuyu. Journal of Southeast University (English Edition). 2006, No. 3
2. Using ontology semantics to improve text documents clustering [J]. Luo Na, Zuo Wanli, Yuan Fuyu. Journal of Southeast University. 2006, No. 3
3. DIC-DOC-K-means: Dissimilarity-based Initial Centroid selection for DOCument clustering using K-means for improving the effectiveness of text document clustering [J]. Lakshmi R., Baskar S. Journal of Information Science. 2019, No. 6
4. Ontology Based Text Document Clustering for Sports [J]. A. Sudha Ramkumar, B. Poorna, B. Saleena. Journal of Engineering & Applied Sciences. 2018, No. 11
5. Using ontology semantics to improve text documents clustering [C]. Proceedings of the First Workshop on Semantic Web and Ontology (SWON2006). 2006
6. A comparative study on ontology generation and text clustering using VSM, LSI, and document ontology models [D]. Taylor, William P., II. 2007
7. iSMART: Ontology-based Semantic Query of CDA Documents [O]. Shengping Liu, Yuan Ni, Jing Mei. 2009
8. Performance Evaluation of Semantic Based and Ontology Based Text Document Clustering Techniques [O]. Punitha S.C., Punithavalli M. 2012
