Semantic based clustering of Web documents


Abstract

A new methodology is developed that structures the semantics of a collection of documents into the geometry of a simplicial complex: a primitive concept is represented by a top-dimensional simplex, and a concept is represented by a connected component. Based on these structures, documents can be clustered into meaningful classes. Experiments with three different data sets from web pages and medical literature show that the proposed unsupervised clustering approach performs significantly better than traditional clustering algorithms such as k-means, AutoClass, and hierarchical agglomerative clustering (HAC). This abstract geometric model seems to have captured the intrinsic semantics of the documents.
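The abstract does not say how the simplices are built, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that a set of terms forms a simplex when the terms co-occur in at least `min_support` documents, that maximal simplices play the role of primitive concepts, and that each document is assigned to the connected component sharing the most terms with it. All function names, the `min_support` parameter, and the toy documents are hypothetical.

```python
from itertools import combinations


def frequent_simplices(docs, min_support):
    """Return every term set that co-occurs in at least min_support documents.

    Any subset of a frequent term set is also frequent, so the result is
    closed under taking faces and can be read as a simplicial complex.
    (Assumption: co-occurrence support is the criterion for a simplex.)
    """
    def support(terms):
        return sum(1 for d in docs if terms <= d)

    vocab = sorted(set().union(*docs))
    level = [frozenset([t]) for t in vocab
             if support(frozenset([t])) >= min_support]
    simplices = set(level)
    while level:
        # Join simplices of the current dimension to propose the next one.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c) >= min_support]
        simplices.update(level)
    return simplices


def maximal_simplices(simplices):
    """Simplices that are not a proper face of another simplex."""
    return [s for s in simplices if not any(s < t for t in simplices)]


def connected_components(simplices):
    """Group vertices (terms) that are linked through shared simplices."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for simplex in simplices:
        verts = list(simplex)
        for v in verts:
            find(v)
        for v in verts[1:]:
            parent[find(v)] = find(verts[0])

    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


def cluster_documents(docs, min_support=2):
    """Label each document with the index of its best-matching component.

    Assumption: best match means largest term overlap; a document that
    shares no term with any component gets the label None.
    """
    components = connected_components(frequent_simplices(docs, min_support))
    labels = []
    for d in docs:
        overlaps = [len(d & c) for c in components]
        if overlaps and max(overlaps) > 0:
            labels.append(overlaps.index(max(overlaps)))
        else:
            labels.append(None)
    return components, labels


# Toy collection: two documents about topology, two about biology.
docs = [
    {"simplex", "complex", "topology"},
    {"simplex", "complex", "homology"},
    {"gene", "protein", "cell"},
    {"gene", "protein", "disease"},
]
primitive_concepts = maximal_simplices(frequent_simplices(docs, 2))
components, labels = cluster_documents(docs, min_support=2)
```

On this toy input the two frequent term pairs are the only maximal simplices, they fall into separate connected components, and the documents split into two clusters accordingly. The level-wise enumeration is exponential in the worst case, so a real collection would need a proper frequent-itemset miner and a tuned support threshold.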


Bibliographic record

Source
Granular Computing, 2005 IEEE International Conference on | 2005 | pp. 189-192 | 4 pages
Authors
Lin, T.Y.; I-Jen Chiang
Original format: PDF
Chinese Library Classification: Industrial technology
Keywords
document handling; geometry; pattern clustering; semantic Web; Web document; Web page; abstract geometric model; data set; semantic document collection; simplicial complex geometry; unsupervised clustering; clustering; document; polyhedron; semantics; web


