Journal: Procedia Computer Science

Exploring Database Keyword Search for Association Studies between Genetic Variants and Diseases

Abstract

Keyword search plays a critical role for bioinformatics researchers who need to retrieve structured, semi-structured, and unstructured data. In addition, data mining has drawn increasing attention from researchers as a way to fully exploit the rich repository of biological databases. An interesting issue is the possible relationship between database keyword search (DB KWS) and in-depth database exploration (or data mining) in the context of bioinformatics, and in particular the potential contribution of DB KWS to data mining. So far, however, there is no known systematic investigation of this relationship. In this paper, we provide a preliminary discussion of how DB KWS can be used for in-depth exploration of biological databases, and describe a case study on the association between genetic variants and diseases. The case study is motivated by the fact that the advent of high-throughput sequencing technologies has facilitated the generation of a huge amount of genomic data. A wealth of genomic information in publicly available databases is underutilized as a potential resource for uncovering functionally relevant markers underlying complex human traits. The discovery of genetic associations is an important factor in understanding human illness, since it helps derive disease pathways and a plethora of other information, such as disease-gene associations and the variants associated with diseases. A database of genome-wide association studies was curated, and an algorithm inspired by DBXplorer was implemented in Java to perform keyword search over the database. The case study further proposes ways to include association rule mining, a data mining technique useful for discovering interesting relationships hidden in large data sets, to further investigate the results of keyword searches run with different yet sensible combinations of diseases and genes.
We believe that such an integrated study, exploring how both techniques can be used together in a single bioinformatics application, is an interesting issue of both theoretical and practical importance.
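
To make the association rule mining step more concrete, the sketch below computes the two standard rule metrics, support and confidence, over a small set of keyword-search result rows. It is only an illustration under stated assumptions: the class name `AssociationRuleSketch`, the `disease:`/`gene:`/`variant:` item labels, and the sample rows are all hypothetical placeholders, not the paper's code or data, and the abstract does not say which mining algorithm or thresholds the authors used.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: support and confidence over item sets (not the paper's code). */
final class AssociationRuleSketch {

    /** Fraction of rows that contain every item in the given itemset. */
    static double support(List<Set<String>> rows, Set<String> itemset) {
        if (rows.isEmpty()) {
            return 0.0;
        }
        long hits = rows.stream().filter(r -> r.containsAll(itemset)).count();
        return (double) hits / rows.size();
    }

    /** confidence(A => B) = support(A union B) / support(A); 0 if A never occurs. */
    static double confidence(List<Set<String>> rows,
                             Set<String> antecedent,
                             Set<String> consequent) {
        double supportA = support(rows, antecedent);
        if (supportA == 0.0) {
            return 0.0;
        }
        Set<String> union = new HashSet<>(antecedent);
        union.addAll(consequent);
        return support(rows, union) / supportA;
    }

    public static void main(String[] args) {
        // Made-up placeholder rows standing in for keyword-search results.
        List<Set<String>> rows = List.of(
                Set.of("disease:D1", "gene:G1", "variant:V1"),
                Set.of("disease:D1", "gene:G1", "variant:V2"),
                Set.of("disease:D1", "gene:G2", "variant:V3"),
                Set.of("disease:D2", "gene:G1", "variant:V4"));

        System.out.println("support(G1)          = "
                + support(rows, Set.of("gene:G1")));
        System.out.println("confidence(G1 => D1) = "
                + confidence(rows, Set.of("gene:G1"), Set.of("disease:D1")));
    }
}
```

Treating each result row as a set of labeled items keeps the metrics independent of the database schema; a full miner such as Apriori would additionally enumerate candidate itemsets above a chosen support threshold.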