Journal: International Journal of Data Warehousing and Mining

A Graph-Based Biomedical Literature Clustering Approach Utilizing Term's Global and Local Importance Information



Abstract

In this article, we present a graph-based knowledge representation for biomedical digital library literature clustering. An efficient clustering method is developed to identify the ontology-enriched k-highest density term subgraphs that capture the core semantic relationship information about each document cluster. The distance between each document and the k term graph clusters is calculated. A document is then assigned to the closest term cluster. The extensive experimental results on two PubMed document sets (Disease 10 and OHSUMED23) show that our approach is comparable to spherical k-means. The contributions of our approach are the following: (1) we provide two corpus-level graph representations to improve document clustering, a term co-occurrence graph and an abstract-title graph; (2) we develop an efficient and effective document clustering algorithm by identifying k distinguishable class-specific core term subgraphs using terms' global and local importance information; and (3) the identified term clusters give a meaningful explanation for the document clustering results.
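
The assignment step described in the abstract (compute the distance from each document to the k term clusters, then pick the closest one) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the distance measure, so the coverage-based distance used here is an assumption, the function names `cooccurrence_graph` and `assign_documents` are hypothetical, and the term clusters are taken as given rather than derived from the ontology-enriched k-highest density subgraphs.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(docs):
    """Build a corpus-level term co-occurrence graph.

    Each edge (a, b) is weighted by the number of documents in which
    both terms appear. The weighting scheme is an assumption made for
    illustration only.
    """
    edges = Counter()
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


def assign_documents(docs, term_clusters):
    """Assign each document to its closest term cluster.

    Assumed distance: 1 minus the fraction of the document's term
    occurrences that fall inside the cluster. Ties go to the cluster
    with the lowest index.
    """
    labels = []
    for terms in docs:
        counts = Counter(terms)
        total = sum(counts.values())
        distances = []
        for cluster in term_clusters:
            overlap = sum(c for t, c in counts.items() if t in cluster)
            distances.append(1.0 - overlap / total if total else 1.0)
        labels.append(min(range(len(term_clusters)), key=distances.__getitem__))
    return labels


# Toy data, invented for this example.
docs = [
    ["insulin", "glucose", "diabetes"],
    ["tumor", "chemotherapy", "tumor"],
    ["glucose", "insulin"],
]
term_clusters = [
    {"insulin", "glucose", "diabetes"},
    {"tumor", "chemotherapy"},
]

graph = cooccurrence_graph(docs)
labels = assign_documents(docs, term_clusters)
```

On this toy input the first and third documents land in the first term cluster and the second document lands in the other. Because each cluster is a plain set of terms, the terms themselves can be read off as an explanation of why a document was placed there, which is the interpretability benefit the abstract points to.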

