Journal: Machine Learning and Knowledge Extraction

Exploiting Genomic Relations in Big Data Repositories by Graph-Based Search Methods


Abstract

We are living at a time that allows the generation of mass data in almost any field of science. For instance, in pharmacogenomics, there exist a number of big data repositories, e.g., the Library of Integrated Network-based Cellular Signatures (LINCS), that provide millions of measurements on the genomics level. However, to translate these data into meaningful information, the data need to be analyzable. The first step for such an analysis is the deliberate selection of subsets of raw data for studying dedicated research questions. Unfortunately, this is a non-trivial problem when millions of individual data files are available with an intricate connection structure induced by experimental dependencies. In this paper, we argue for the need to introduce such search capabilities for big genomics data repositories, with a specific discussion about LINCS. Specifically, we suggest the introduction of smart interfaces that allow the connections among individual raw data files, which give rise to a network structure, to be exploited by graph-based searches.
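As a rough illustration of what a graph-based search over connected data files could look like, the following minimal sketch links files that share an experimental attribute and then collects everything reachable from a seed file within a few hops. It is not taken from the paper: the file names, the metadata fields (`cell_line`, `compound`), and the helper functions `build_graph` and `related_files` are all hypothetical and do not reflect the actual LINCS schema or any LINCS interface.

```python
from collections import defaultdict, deque

# Hypothetical metadata records; the field names are illustrative only
# and are NOT the real LINCS schema.
files = {
    "f1": {"cell_line": "A", "compound": "X"},
    "f2": {"cell_line": "A", "compound": "Y"},
    "f3": {"cell_line": "B", "compound": "Y"},
    "f4": {"cell_line": "C", "compound": "Z"},
}


def build_graph(files):
    """Connect two files whenever they share a (field, value) pair."""
    by_value = defaultdict(set)
    for name, meta in files.items():
        for field, value in meta.items():
            by_value[(field, value)].add(name)

    graph = defaultdict(set)
    for group in by_value.values():
        for a in group:
            # Every other file in the same group becomes a neighbour of a.
            graph[a] |= group - {a}
    return graph


def related_files(graph, start, max_hops):
    """Breadth-first search: map each file reachable from `start`
    within `max_hops` edges to its hop distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph[node]:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


graph = build_graph(files)
hops = related_files(graph, "f1", max_hops=2)
```

In this toy data, `f1` and `f2` are linked through a shared cell line and `f2` and `f3` through a shared compound, so a two-hop search from `f1` reaches `f3` even though the two files have no attribute in common. That kind of indirect relation is what a plain keyword filter on metadata would miss.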
