
A Novel Bioinformatics Method for Efficient Knowledge Discovery by BLSOM from Big Genomic Sequence Data



Abstract

With the remarkable increase in genomic sequence data from a wide range of species, novel tools are needed for comprehensive analysis of big sequence data. The Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data, such as oligonucleotide composition, on a single map. By modifying the conventional SOM, we previously developed the Batch-Learning SOM (BLSOM), which classifies sequence fragments according to species based solely on their oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM as used for characterizing vertebrate genome sequences. We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes, and then the compositions in the human and mouse genomes, in order to investigate an efficient method for detecting differences between closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, which is called a “genome signature,” as well as regions specifically enriched in transcription-factor-binding sequences. Because its classification and visualization power is very high, BLSOM is an efficient and powerful tool for extracting a wide range of information from massive amounts of genomic sequence (i.e., big sequence data).
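
The input to an oligonucleotide SOM is one composition vector per sequence fragment. The following is a minimal sketch of how such vectors might be computed for pentanucleotides in 100 kb windows; it is not the authors' implementation. The function names `kmer_composition` and `window_compositions` are hypothetical, and the choices to use non-overlapping windows, to skip k-mers containing ambiguous bases, and to count a single strand without merging reverse complements are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_composition(seq, k=5):
    """Return normalized k-mer frequencies in a fixed order (4**k values).

    Hypothetical helper: counts overlapping k-mers on one strand only.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: skip k-mers that contain ambiguous bases such as N.
        if all(base in ALPHABET for base in kmer):
            counts[kmer] += 1
    total = sum(counts.values())
    # Fixed lexicographic ordering so every vector has comparable dimensions.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def window_compositions(genome, window=100_000, k=5):
    """Split a sequence into non-overlapping full windows, one vector each."""
    return [
        kmer_composition(genome[start:start + window], k)
        for start in range(0, len(genome) - window + 1, window)
    ]
```

With the defaults, each 100 kb fragment becomes a 1024-dimensional vector (4^5 pentanucleotides), and a collection of such vectors is the kind of high-dimensional data that a SOM can cluster and visualize on one map.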
