Journal: International Journal of Information Management

Network text analysis: A two-way classification approach

Abstract

Text clustering is a well-known method for information retrieval, and numerous methods for classifying words, documents, or both together have been proposed. Frequently, textual data are encoded using vector models, so the corpus is transformed into a term-by-document matrix; using this representation, text clustering generates groups of similar objects on the basis of the presence or absence of words in the documents. An alternative way to work on texts is to represent them as a network in which nodes are entities connected by the presence and distribution of words in the documents. In this work, after summarising the state of the art of text clustering, we present a new network approach to textual data. We undertake text co-clustering using methods developed for social network analysis. Several experimental results are presented to demonstrate the validity of the approach and its advantages over existing methods.
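
The two representations mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the function names `term_document_matrix` and `bipartite_edges`, the whitespace tokenisation, and the three toy documents are all assumptions made for illustration. The first function builds a presence/absence term-by-document matrix; the second reads the same matrix as a two-mode (term, document) network.

```python
from itertools import chain


def term_document_matrix(docs):
    """Return (terms, matrix); matrix[i][j] is 1 if terms[i] occurs in docs[j].

    Assumption: naive lowercase whitespace tokenisation, presence/absence only.
    """
    tokenized = [set(doc.lower().split()) for doc in docs]
    terms = sorted(set(chain.from_iterable(tokenized)))
    matrix = [[1 if term in doc else 0 for doc in tokenized] for term in terms]
    return terms, matrix


def bipartite_edges(terms, matrix):
    """Read the matrix as a two-mode network: one edge per (term, document index)."""
    return [
        (term, j)
        for term, row in zip(terms, matrix)
        for j, present in enumerate(row)
        if present
    ]


# Toy corpus (hypothetical, not taken from the paper).
docs = [
    "network text analysis",
    "text clustering methods",
    "social network analysis",
]
terms, matrix = term_document_matrix(docs)
edges = bipartite_edges(terms, matrix)
```

Under these assumptions, grouping rows and columns of `matrix` at the same time corresponds to co-clustering terms and documents, while `edges` is the kind of input a network analysis tool would take; which specific social network analysis method the paper applies is not stated in the abstract.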
