Journal: International Journal of Computer Trends and Technology

Mining Text Data using different Text Clustering Techniques

Abstract

Text mining is also referred to as text data mining or knowledge discovery from textual databases. Organizing text is a natural human practice and a crucial task for today’s vast databases. Clustering does this by assessing the similarity between texts and organizing them accordingly, grouping similar texts together and separating those on different topics. Clusters provide a comprehensive logical structure that supports exploration, search, and interpretation of current text documents, as well as the organization of future ones. Side information of various kinds is often available alongside text documents and may be embedded in them. However, the usefulness of this side information can be difficult to estimate. In such cases, it can be risky to include side information in the mining process, because it can either increase the quality of the representation for the mining process or add noise to it. Therefore, to maximize the benefit of this side information, minimize the time complexity of the clustering process, and reduce cluster impurity, partition-based text clustering techniques such as the k-means and k-windows algorithms are used. Experimental results show that k-windows clustering gives better results than k-means clustering, and that side information can be used effectively for mining the data.
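
The abstract gives no implementation details, so the following is only a minimal sketch of partition-based text clustering in the k-means style, not the authors' method. It assumes plain term-frequency vectors and cosine similarity; the function names, the toy documents, and the fixed seed documents are illustrative assumptions. Side information and the k-windows algorithm are not modeled here.

```python
import math
from collections import Counter


def tf_vector(text):
    """Turn a document into a sparse term-frequency vector (word -> count)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Mean of a non-empty list of sparse vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: weight / len(vectors) for term, weight in total.items()}


def kmeans(vectors, seeds, max_iter=20):
    """Partition vectors into len(seeds) clusters by cosine similarity.

    seeds: indices of the documents used as initial centroids (an assumption
    made here for reproducibility; random seeding is more common).
    """
    centroids = [dict(vectors[i]) for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its most similar centroid.
        new_labels = [
            max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
            for vec in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: recompute each centroid from its current members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = centroid(members)
    return labels


# Hypothetical toy corpus: two finance documents and two sports documents.
docs = [
    "stock market prices fall",
    "market investors sell stock",
    "football team wins match",
    "team loses football match",
]
labels = kmeans([tf_vector(d) for d in docs], seeds=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

On this toy corpus the two finance documents share a cluster and the two sports documents share the other. A realistic pipeline would typically add TF-IDF weighting, stop-word removal, and several random restarts.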
