Published in: IEEE Transactions on Neural Networks

Motif discoveries in unaligned molecular sequences using self-organizing neural networks



Abstract

In this paper, we study the problem of motif discovery in unaligned DNA and protein sequences. Motif identification in such sequences has been studied for many years, and the major remaining hurdles are the computational complexity and the reliability of the search algorithms. We propose a self-organizing neural network structure for motif identification in DNA and protein sequences. The network contains several layers, each performing classification at a different level. The top layer divides the input space into a small number of regions, and the bottom layer classifies all input patterns into motif and nonmotif patterns. Depending on the number of input patterns to be classified, several layers between the top layer and the bottom layer are needed to perform intermediate classifications. The layered structure keeps the computational complexity low because each pattern is classified with respect to a small subspace of the whole input space. The network grows as needed (e.g., when more motif patterns are classified). It gives the same amount of attention to each input pattern and does not omit any potential motif patterns. Simulation results show that our algorithm outperforms existing algorithms in certain aspects; in particular, it can identify motifs with more mutations than existing algorithms can. The algorithm also works well for long DNA sequences.
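
The layered idea in the abstract can be illustrated with a toy sketch. This is not the authors' network: the abstract gives no implementation details, so everything here is an assumption made for illustration, including the function names, the Hamming-distance similarity, the fixed homopolymer anchors used as the top layer, the `max_dist` threshold, and the rule that a motif candidate must contain a window from every sequence. It shows only the general shape: a coarse top layer picks a region, and a bottom layer that grows on demand compares each window against the nodes of that one region.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(seqs, w):
    """Yield (sequence index, offset, window) for every length-w window.

    Every window is visited once, so each input pattern gets equal attention.
    """
    for si, s in enumerate(seqs):
        for i in range(len(s) - w + 1):
            yield si, i, s[i:i + w]


def top_region(pattern, anchors):
    """Hypothetical top layer: index of the nearest anchor (ties go to the first)."""
    return min(range(len(anchors)), key=lambda k: hamming(pattern, anchors[k]))


def grow_bottom_layer(seqs, w, anchors, max_dist):
    """Hypothetical bottom layer: per-region nodes that are created as needed.

    A window is compared only with nodes in its own region (a small subspace).
    It joins the nearest node if within max_dist, otherwise it starts a new node.
    """
    regions = defaultdict(list)  # region index -> list of node dicts
    for si, i, p in windows(seqs, w):
        nodes = regions[top_region(p, anchors)]
        best = min(nodes, key=lambda n: hamming(p, n["proto"]), default=None)
        if best is not None and hamming(p, best["proto"]) <= max_dist:
            best["members"].append((si, i, p))
        else:
            nodes.append({"proto": p, "members": [(si, i, p)]})
    return regions


def motif_candidates(regions, n_seqs):
    """Assumed criterion: a node is a candidate if its members span every sequence."""
    return [
        n
        for nodes in regions.values()
        for n in nodes
        if len({si for si, _, _ in n["members"]}) == n_seqs
    ]


if __name__ == "__main__":
    # Toy input: GATTACA planted at offset 2, with one substitution in two sequences.
    seqs = ["CCGATTACAGG", "TGGATAACACT", "GTGCTTACATC"]
    anchors = [base * 7 for base in "ACGT"]  # coarse partition by dominant base
    regions = grow_bottom_layer(seqs, w=7, anchors=anchors, max_dist=1)
    for node in motif_candidates(regions, len(seqs)):
        print(node["proto"], node["members"])
```

On this toy input the three planted variants of GATTACA land in the same region and are grouped into one bottom-layer node. A fixed top-layer partition like this one can split mutated variants of a motif across regions, which is one reason a real system would need a learned, self-organizing partition rather than hard-coded anchors.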
