Text Clustering and Text Summarization on the Use of Side Information

Abstract

Clustering algorithms group data points by similarity in order to extract useful information from them. The text collections to be clustered contain a huge amount of information. In many text mining applications, side information is available alongside the text documents. It may be of several kinds, for instance document provenance information, the links in the document, user access behavior from web logs, or other non-textual attributes embedded in the text document. Such attributes may contain a large amount of information for clustering purposes. However, it is difficult to estimate the relative importance of this information when its quality is unclear. In such cases, it can be risky to incorporate side information into the mining process, since it can either improve the quality of the representation used for mining or add noise to the process. The proposed system therefore combines clustering with summarization methods. Two algorithms are used for clustering. The COATES algorithm, which combines classical partitioning algorithms with probabilistic models, produces clusters on the basis of both content and auxiliary attributes. The proposed system also implements a hierarchical algorithm, which is compared with COATES. Finally, a merging and summary generation algorithm merges duplicate clusters and produces a final summary of the data for the user's convenience.
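
The idea of clustering on both content and auxiliary attributes can be illustrated with a small sketch. This is not the COATES algorithm and not the authors' implementation: it is a simplified average-link agglomerative clustering in which each pairwise similarity is a weighted mix of text similarity and side-attribute similarity. The function names, the `side_weight` parameter, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard similarity between two sets of side attributes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def combined_similarity(doc_a, doc_b, side_weight=0.5):
    """Weighted mix of content and side-information similarity.
    side_weight is a hypothetical knob, not a parameter from the paper."""
    content = cosine(doc_a["tf"], doc_b["tf"])
    side = jaccard(doc_a["side"], doc_b["side"])
    return (1 - side_weight) * content + side_weight * side

def agglomerate(docs, k, side_weight=0.5):
    """Average-link agglomerative clustering down to k clusters.
    Returns a list of clusters, each a sorted list of document indices."""
    clusters = [[i] for i in range(len(docs))]

    def link(c1, c2):
        sims = [combined_similarity(docs[i], docs[j], side_weight) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > k:
        # Merge the pair of clusters with the highest average similarity.
        pairs = ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters)))
        a, b = max(pairs, key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

def make_doc(text, side):
    """Build a toy document: bag-of-words counts plus a set of side attributes."""
    return {"tf": Counter(text.lower().split()), "side": set(side)}

# Toy data (invented): side attributes stand in for provenance and link information.
docs = [
    make_doc("clustering text documents with side information", {"author:a", "link:x"}),
    make_doc("text clustering using auxiliary attributes", {"author:a", "link:x"}),
    make_doc("football match results and scores", {"author:b", "link:y"}),
    make_doc("scores from the football league", {"author:b", "link:y"}),
]
clusters = agglomerate(docs, k=2)
print(clusters)  # [[0, 1], [2, 3]]
```

Setting `side_weight` to 0 reduces the sketch to content-only clustering, which is one way to check whether the side attributes help or merely add noise on a given collection.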

Bibliographic details

Source
International Conference on Innovations in Computer Science and Engineering | 2016 | 7 pages
Authors
Shilpa S. Raut; V.B. Maral
Original format: PDF
Chinese Library Classification: TP3-532
Keywords
Clustering; Text mining; Auxiliary attribute; Clustering methods; Summarization

Similar literature

1. Using ontology semantics to improve text clustering [J]. Luo Na, Zuo Wanli, Yuan Fuyu. Journal of Southeast University (English Edition). 2006, No. 3
2. Text Summarization Challenge 2: Text summarization evaluation at NTCIR Workshop 3 [J]. Manabu Okumura, Takahiro Fukusima, Hidetsugu Nanba. ACM SIGIR Forum. 2004, No. 1
3. Efficient text summarization method for blind people using text mining techniques [J]. Shakila Basheer, M. Anbarasi, Darpan Garg Sakshi. International Journal of Speech Technology. 2020, No. 4
4. A Simple and Efficient Text Summarization Model for Odia Text Documents [J]. Sagarika Pattnaik, Ajit Kumar Nayak. Indian Journal of Computer Science and Engineering. 2020, No. 6
5. Text Clustering and Text Summarization on the Use of Side Information [C]. Shilpa S. Raut, V.B. Maral. International Conference on Innovations in Computer Science and Engineering. 2016
6. Semantic preserving text representation and its applications in text clustering [D]. Howard, Michael. 2012
7. Towards Answering Biological Questions with Experimental Evidence: Automatically Identifying Text that Summarize Image Content in Full-Text Articles [O]. Hong Yu. 2006
8. Text Summarization Challenge: An Evaluation Program for Text Summarization [O]. Hidetsugu Nanba, Tsutomu Hirao, Takahiro Fukushima. 2020
