A Clustering Algorithm for Asymmetrically Related Data with Applications to Text Mining


Abstract

Clustering techniques find a collection of subsets of a data set such that the collection satisfies a criterion that is dependent on a relation defined on the data set. The underlying relation is traditionally assumed to be symmetric. However, there exist many practical scenarios where the underlying relation is asymmetric. One example of an asymmetric relation in text analysis is the inclusion relation, i.e., the inclusion of the meaning of a block of text in the meaning of another block. In this paper, we consider the general problem of clustering of asymmetrically related data and propose an algorithm to cluster such data. To demonstrate its usefulness, we consider two applications in text mining: (1) summarization of short documents, and (2) generation of a concept hierarchy from a set of documents. Our experiments show that the performance of the proposed algorithm is superior to that of more traditional algorithms.
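The asymmetry of an inclusion relation can be illustrated with a small sketch. This is not the measure or the algorithm from the paper; `inclusion` is a hypothetical word-overlap score, assumed here only to show that the degree to which text A is included in text B need not equal the degree to which B is included in A.

```python
def inclusion(a: str, b: str) -> float:
    """Hypothetical score: fraction of distinct words of `a` that also occur in `b`."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a:
        return 0.0
    return len(words_a & words_b) / len(words_a)


# Illustrative inputs (not taken from the paper).
short_text = "clustering text"
long_text = "clustering text documents with asymmetric relations"

# The short text is fully covered by the long one, but not the other way round.
forward = inclusion(short_text, long_text)
backward = inclusion(long_text, short_text)
print(forward, backward)
```

A symmetric similarity such as cosine similarity would assign the same value to both directions and so could not express that one block subsumes the other.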


Bibliographic record

Source
2001 ACM CIKM 10th International Conference on Information and Knowledge Management, Nov 5-10, 2001, Atlanta, Georgia, USA | 2001 | pp. 571-573 | 3 pages
Conference location: Atlanta, Georgia, USA
Authors
K. Krishna; Raghu Krishnapuram
Author affiliation

IBM India Research Lab, Block 1, IIT, Hauz Khas, New Delhi 110016, India

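Conference organizer
Original format: PDF
Language: English
Chinese Library Classification: Radio electronics, telecommunications technology
Keywords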

Similar literature


1. Massive Data Mining Algorithm for Web Text Based on Clustering Algorithm [J]. Nan-Chao Luo. Journal of Advanced Computational Intelligence and Intelligent Informatics. 2019, issue 2a136
2. Application of Data Mining Classification Algorithms for Afaan Oromo Media Text News Categorization [J]. Etana Fikadu Dinsa, Ramesh Babu P. International Journal of Computer Trends and Technology. 2019, issue 7
3. A Novel Text Data Mining and Analysis Algorithm based on Information Entropy and Fuzzy Clustering [J]. Liping Wang, Wenzhun Huang. Journal of Residuals Science & Technology. 2016, issue 5
4. A clustering algorithm for asymmetrically related data with applications to text mining [C]. K. Krishna, Raghu Krishnapuram. Proceedings of the Tenth International Conference on Information and Knowledge Management. 2001
5. Prediction of cost overruns using ensemble methods in data mining and text mining algorithms [D]. Ramesh, Prathiksha. 2014
6. Empirical study of seven data mining algorithms on different characteristics of datasets for biomedical classification applications [O]. Yiyan Zhang, Yi Xin, Qin Li. 2017
7. Learning from Multi-View Data: Clustering Algorithm and Text Mining Application [O]. Liu Xinhai. 2011
