Conference: International Conference on Intelligent Systems for Molecular Biology

Building Dictionaries Of 1D and 3D Motifs By Mining The Unaligned 1D Sequences Of 17 Archaeal and Bacterial Genomes


Abstract

We used the TEIRESIAS algorithm to carry out unsupervised pattern discovery in a database containing the unaligned ORFs from the 17 publicly available complete archaeal and bacterial genomes, and built a 1D dictionary of motifs. These motifs, which we refer to as seqlets, cover 97.88% of this genomic input at the level of amino acid positions. Each seqlet in this 1D dictionary was located among the sequences in Release 38.0 of the Protein Data Bank, and the structural fragments corresponding to each seqlet's instances were identified and aligned in three dimensions. Seqlets whose aligned fragments gave RMSD errors below a pre-selected threshold of 2.5 Angstroms were entered in a 3D dictionary of structurally conserved seqlets. These two dictionaries can be thought of as cross-indices that facilitate tasks such as automated functional annotation of genomic sequences, local homology identification, local structure characterization, and comparative genomics.
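
The two quantities quoted in the abstract, position-level coverage and the RMSD cutoff, can be illustrated with a minimal sketch. This is not the TEIRESIAS algorithm and not the authors' code. It assumes that seqlets are written as strings in which `.` stands for any residue (and are therefore usable as regular expressions), and that the 3D fragments being compared have already been superposed. The function names `coverage`, `rmsd` and `structurally_conserved` are hypothetical.

```python
import math
import re

RMSD_THRESHOLD = 2.5  # Angstroms, the cutoff quoted in the abstract


def coverage(sequences, seqlets):
    """Fraction of amino acid positions covered by at least one seqlet instance.

    Assumption: each seqlet is a pattern string that is valid as a regular
    expression, with '.' matching any single residue.
    """
    covered = 0
    total = 0
    for seq in sequences:
        hit = [False] * len(seq)
        for pattern in seqlets:
            # A lookahead lets finditer report overlapping instances too.
            for match in re.finditer("(?=(" + pattern + "))", seq):
                start = match.start(1)
                for i in range(start, start + len(match.group(1))):
                    hit[i] = True
        covered += sum(hit)
        total += len(seq)
    return covered / total if total else 0.0


def rmsd(frag_a, frag_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumption: the fragments are already superposed; no fitting is done here.
    """
    if len(frag_a) != len(frag_b) or not frag_a:
        raise ValueError("fragments must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(frag_a, frag_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(frag_a))


def structurally_conserved(frag_a, frag_b, threshold=RMSD_THRESHOLD):
    """True when the fragment pair falls below the RMSD cutoff."""
    return rmsd(frag_a, frag_b) < threshold
```

In this sketch, `coverage(["MKVLAAG", "AKVLA"], ["K.L"])` counts three covered positions in each sequence out of twelve in total. A full pipeline would also need an optimal superposition step (for example, a Kabsch fit) before computing the RMSD, which is omitted here.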
