Mining meaningful topics from massive biomedical literature



Abstract

A huge amount of biomedical and biological literature is available online and in digital libraries, and the number of new research papers has grown exponentially in recent years. Mining meaningful topics from this massive body of biomedical literature is therefore both pressing and challenging. The mined topics help researchers explore the literature and discover topics. However, latent topics inferred by traditional topic models are not always coherent and meaningful. In this work, we propose a new methodology for mining meaningful biomedical topics that combines several off-the-shelf text mining techniques, such as part-of-speech tagging, base noun phrase chunking, K-means clustering, and latent Dirichlet allocation, which make the methodology scalable and simple to implement. We conduct comprehensive experiments on a dataset collected from PubMed. The experimental results demonstrate that our method significantly outperforms a baseline method. We also perform a qualitative analysis and present meaningful biomedical topics and multi-word expressions.
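The abstract names base noun phrase chunking over part-of-speech tags as one building block but does not say how it is done. As a rough illustration only, and not the authors' implementation, the sketch below extracts multi-word expressions from tokens that are already tagged. The function name `chunk_base_noun_phrases`, the rule "a maximal run of adjectives and nouns that ends in a noun", and the hand-tagged toy sentence are all assumptions made for this example.

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, Penn Treebank-style POS tag)


def chunk_base_noun_phrases(tagged: List[TaggedToken]) -> List[str]:
    """Return maximal runs of adjectives/nouns that end in a noun.

    Hypothetical rule chosen for illustration; real chunkers use
    richer grammars or trained models.
    """
    phrases: List[str] = []
    run: List[TaggedToken] = []

    def flush() -> None:
        # Trim trailing adjectives so each phrase ends in a noun.
        while run and not run[-1][1].startswith("NN"):
            run.pop()
        if run:
            phrases.append(" ".join(word for word, _ in run))
        run.clear()

    for word, tag in tagged:
        if tag.startswith("NN") or tag.startswith("JJ"):
            run.append((word, tag))  # extend the current candidate phrase
        else:
            flush()  # any other tag closes the candidate phrase
    flush()  # close a phrase that reaches the end of the sentence
    return phrases


# Hand-tagged toy sentence (tags are illustrative, not tagger output).
sentence = [
    ("Latent", "JJ"), ("topics", "NNS"), ("describe", "VBP"),
    ("massive", "JJ"), ("biomedical", "JJ"), ("literature", "NN"), (".", "."),
]
print(chunk_base_noun_phrases(sentence))
```

In a fuller pipeline, phrases like these could serve as the vocabulary units passed on to clustering or topic modeling; how the paper actually connects the steps is not stated in the abstract.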


Bibliographic record

Source
IEEE International Conference on Bioinformatics and Biomedicine | 2014 | pp. 438-443 | 6 pages
Authors
Peiyan Zhu; Junhui Shen; Dezhi Sun; Ke Xu
Original format: PDF
Keywords
biomedical engineering; data mining; digital libraries; information retrieval systems; information services; medical computing; pattern clustering; text analysis; K-means clustering; PubMed dataset; base noun phrase chunking; baseline method; digital library; latent Dirichlet allocation; latent topic inference; literature exploration; massive biomedical literature; meaningful biomedical topic mining; multi-word expression; off-the-shelf text mining techniques; online biological literature; online biomedical literature; part-of-speech tagging; qualitative analysis; research paper; topic discovery; traditional topic model; Arteries; Biological system modeling; Biomedical measurement; Cancer; Diseases; Semantics; Tagging;


