Journal: Gene: An International Journal Focusing on Gene Cloning and Gene Structure and Function

Self-Organizing Map (SOM) unveils and visualizes hidden sequence characteristics of a wide range of eukaryote genomes


Abstract

Novel tools are needed for comprehensive comparisons of interspecies characteristics across the massive amounts of genomic sequence data currently available. An unsupervised neural network algorithm, the Self-Organizing Map (SOM), is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We modified the conventional SOM, on the basis of batch-learning SOM, for genome informatics, making the learning process and the resulting map independent of the order of data input. We generated SOMs for tri- and tetranucleotide frequencies in 10- and 100-kb sequence fragments from 38 eukaryotes for which almost complete genome sequences are available. The SOM recognized species-specific characteristics (key combinations of oligonucleotide frequencies) in the genomic sequences, permitting species-specific classification of the sequences without any information regarding the species. We also generated a SOM for tetranucleotide frequencies in 1-kb sequence fragments from the human genome and found that sequences from four functional categories (5' and 3' UTRs, CDSs and introns) were classified primarily according to category. Because its classification and visualization power is very high, SOM is an efficient and powerful tool for extracting a wide range of genome information. (C) 2005 Elsevier B.V. All rights reserved.
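
The pipeline summarized in the abstract (cut sequences into fixed-length fragments, count oligonucleotide frequencies, train a batch-learning SOM) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the Gaussian neighborhood, the linear shrinking schedule, and the sorted-sample initialization are all assumptions chosen for brevity, since the abstract does not specify them. The point it shows is that a batch update accumulates contributions from every fragment before any weight changes, so the update does not depend on the order in which fragments are presented (apart from floating-point rounding and tie-breaking).

```python
import itertools
import math


def kmer_frequencies(seq, k=4):
    """Normalized k-mer frequencies of seq, in lexicographic A, C, G, T order."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows containing N etc. are skipped
        if j is not None:
            counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(kmers)


def fragment(seq, size):
    """Non-overlapping fragments of length size; a trailing remainder is dropped."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def best_matching_unit(weights, x):
    """Grid coordinate whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda n: sum((w - v) ** 2 for w, v in zip(weights[n], x)),
    )


def batch_som(data, rows, cols, epochs=20):
    """Train a rows x cols SOM with batch updates; returns {(row, col): weights}."""
    dim = len(data[0])
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumed initialization: evenly spaced samples from the sorted data, so the
    # starting map does not depend on input order. Not taken from the paper.
    ordered = sorted(data)
    weights = {}
    for t, node in enumerate(nodes):
        idx = t * (len(ordered) - 1) // max(len(nodes) - 1, 1)
        weights[node] = list(ordered[idx])
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Assumed schedule: neighborhood radius shrinks linearly to a floor of 0.5.
        sigma = max(sigma0 * (1 - epoch / epochs), 0.5)
        # Step 1: assign every vector to its BMU using the previous epoch's weights.
        bmus = [best_matching_unit(weights, x) for x in data]
        # Step 2: each node becomes a neighborhood-weighted mean over ALL vectors.
        new_weights = {}
        for n in nodes:
            num = [0.0] * dim
            den = 0.0
            for x, b in zip(data, bmus):
                d2 = (n[0] - b[0]) ** 2 + (n[1] - b[1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                den += h
                for i in range(dim):
                    num[i] += h * x[i]
            # Keep the old weights if every contribution underflowed to zero.
            new_weights[n] = [v / den for v in num] if den > 0 else weights[n]
        weights = new_weights
    return weights
```

For example, one might call `fragment(genome, 10_000)`, map each fragment through `kmer_frequencies(f, k=4)` to get 256-dimensional vectors, train with `batch_som(vectors, rows, cols)`, and then look up each fragment's map position with `best_matching_unit`. A pure-Python loop like this is only practical for toy inputs; genome-scale data would need a vectorized implementation.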
