Source: AMIA Summits on Translational Science Proceedings (US National Institutes of Health literature)
PDC – a probabilistic distributional clustering algorithm: a case study on suicide articles in PubMed

Abstract

As the volume of published information keeps growing, organizing large document collections in a way that facilitates human comprehension becomes crucial. We present PDC (probabilistic distributional clustering), a novel algorithm that, given a document collection, computes disjoint term sets representing topics in the collection. The algorithm relies on probabilities of word co-occurrences to partition the set of terms appearing in the collection into disjoint groups of related terms. We also present an environment for visualizing the computed topics in the term space and retrieving the PubMed articles most related to each group of terms. We illustrate the algorithm by applying it to PubMed documents on the topic of suicide, a major public health problem identified as the tenth leading cause of death in the US. In this application, our goal is to provide a global view of the mental health literature pertaining to suicide and thereby help create a rich environment of multifaceted data that guides health care researchers in their effort to better understand the breadth, depth, and scope of the problem. We demonstrate the usefulness of the proposed algorithm by providing a web portal that allows mental health researchers to peruse the suicide-related literature in PubMed.
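
The abstract does not spell out how PDC scores or partitions terms, so the following is only a minimal Python sketch of the general idea it describes: estimating word co-occurrence probabilities from a collection and using them to split the vocabulary into disjoint groups of related terms. It is not the PDC algorithm itself. The linking rule (both conditional co-occurrence probabilities at or above a threshold), the union-find grouping, the overlap-based document ranking, and all function names and toy data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_groups(docs, threshold=0.5):
    """Partition terms into disjoint groups (illustrative, NOT the PDC method).

    Two terms a, b are linked when the estimated P(b | a) and P(a | b),
    computed from document-level co-occurrence counts, are both >= threshold.
    Groups are the connected components of the resulting link graph.
    """
    doc_sets = [set(d) for d in docs]
    # Document frequency of each term and of each unordered term pair.
    df = Counter(t for s in doc_sets for t in s)
    pair_df = Counter()
    for s in doc_sets:
        for a, b in combinations(sorted(s), 2):
            pair_df[(a, b)] += 1

    # Union-find over terms guarantees the final groups are disjoint.
    parent = {t: t for t in df}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for (a, b), n in pair_df.items():
        if n / df[a] >= threshold and n / df[b] >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for t in df:
        groups.setdefault(find(t), set()).add(t)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


def rank_documents(docs, group, top_k=3):
    """Return indices of the documents sharing the most terms with a group."""
    group = set(group)
    scored = [(len(group & set(d)), i) for i, d in enumerate(docs)]
    scored = [(s, i) for s, i in scored if s > 0]
    scored.sort(key=lambda x: (-x[0], x[1]))
    return [i for _, i in scored[:top_k]]


# Hypothetical toy collection: each document is a list of terms.
docs = [
    ["risk", "depression", "adolescent"],
    ["risk", "depression"],
    ["prevention", "screening"],
    ["prevention", "screening", "adolescent"],
]
for group in cooccurrence_groups(docs, threshold=0.6):
    print(group, "->", rank_documents(docs, group))
```

Because every term ends up in exactly one connected component, the groups are disjoint by construction, which mirrors the "disjoint term sets" property stated in the abstract. The threshold controls how aggressively terms are merged; a real system would likely need a more careful probabilistic criterion than this simple cutoff.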