
Experiments in text-based mining and analysis of biological information from MEDLINE on functionally-related genes

Abstract

Technological advancements such as microarrays have enabled biologists to generate unprecedented quantities of data about biological entities. This has led to the development of a large number of algorithms for processing and analyzing biological data. Challenges remain, however; for instance, genes that function cooperatively need not have similar expression patterns. This suggests the use of non-numerical sources of information to explore the underlying biology. We experimentally study various factors that are inherent in algorithmic methodologies for text analysis. The proposed method accesses MEDLINE dynamically to account for the latest research, using the available literature corresponding to the genes analyzed to develop lists of keywords. Natural language processing (NLP) techniques such as stop-word filtering and stemming are then applied to the lists, and keyword frequencies are weighted using the term frequency-inverse document frequency (TFIDF) scheme. The results are input to a hierarchical clustering algorithm to derive groupings of genes by functionality. The process is repeated using z-score weighting and latent semantic analysis (LSA) to determine which yields the most accurate clustering. The study examines the importance of these steps and their influence on the overall efficacy of the system. We believe that the analysis conducted as part of this research is invaluable to the development and fine-tuning of text mining methodologies for biological literature.
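
As a rough illustration of the TFIDF weighting and hierarchical clustering steps named in the abstract, the following is a minimal Python sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the toy keyword lists stand in for keywords that would be extracted from MEDLINE abstracts (already stop-word filtered and stemmed), the IDF variant is the common log(N / df) form, and the clustering uses average linkage over cosine distance. The abstract does not say which variants the study used, and the z-score and LSA alternatives are not shown.

```python
import math
from collections import Counter

# Hypothetical per-gene keyword lists (already stop-word filtered and stemmed).
# In the described method these would be built from MEDLINE literature.
GENE_KEYWORDS = {
    "geneA": ["kinas", "phosphoryl", "signal", "kinas"],
    "geneB": ["kinas", "signal", "receptor"],
    "geneC": ["ribosom", "translat", "rna"],
    "geneD": ["ribosom", "rna", "translat", "rna"],
}


def tfidf_vectors(docs):
    """Return {gene: {term: tf * idf}}, assuming idf = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of genes whose list contains the term.
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for gene, terms in docs.items():
        counts = Counter(terms)
        vectors[gene] = {
            term: (count / len(terms)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity between two sparse term-weight dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def hierarchical_clusters(vectors, n_clusters):
    """Average-linkage agglomerative clustering, stopped at n_clusters."""
    clusters = [[gene] for gene in sorted(vectors)]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(cosine_distance(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(cluster) for cluster in clusters)


if __name__ == "__main__":
    print(hierarchical_clusters(tfidf_vectors(GENE_KEYWORDS), n_clusters=2))
```

On this toy input, the two genes with kinase-related keywords and the two with ribosome-related keywords end up in separate groups. Real keyword lists would be much larger and noisier, which is where the choice of weighting scheme examined in the study would matter.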

Bibliographic details

Source
International Conference on Systems Engineering | 2005 | 6 pages
Authors
Moon N.; Singh R.
Original format: PDF
Chinese Library Classification: automation technology, computer technology
Keywords
biology computing; data analysis; data mining; genetics; medical information systems; natural languages; pattern clustering; scientific information systems; text analysis; word processing; LSA; MEDLINE; biological data analysis; biological data processing algorithm; b;

Similar literature

1. New results in biological sequence analysis, complex gene-disease association, qPCR calculation, and biological text mining [J]. Wong L. Journal of Bioinformatics and Computational Biology. 2010, No. 5
2. NEW RESULTS IN BIOLOGICAL SEQUENCE ANALYSIS, COMPLEX GENE–DISEASE ASSOCIATION, qPCR CALCULATION, AND BIOLOGICAL TEXT MINING [J]. Journal of Bioinformatics and Computational Biology. 2010, No. 5
3. System Analysis of LWDH Related Genes Based on Text Mining in Biological Networks [J]. Mingzhi Liao, Yingbo Miao, Liangcai Zhang. BioMed Research International. 2014, No. 51
4. Experiments in text-based mining and analysis of biological information from MEDLINE on functionally-related genes [C]. Moon, N., Singh, R. 2005
5. Text mining biomedical literature for improving MEDLINE retrieval [D]. Lin, Yongjing. 2008
6. System Analysis of LWDH Related Genes Based on Text Mining in Biological Networks [O]. Mingzhi Liao, Yingbo Miao, Liangcai Zhang.
7. 384 COMBINING CHONDROCYTE GENE EXPRESSION, LITERATURE MINING AND PATHWAY/NETWORK ANALYSIS TO EXTRACT BIOLOGICAL INSIGHTS FROM SMALL-SCALE MICROARRAY DATA [O]. Glaab E., Clutterbuck A.L., Bacardit J. 2010
8. Finding Functionally Related Genes by Local and Global Analysis of MEDLINE Abstracts [R]. Nakken, S., Kauffman, C., Karypis, G. 2004
