Journal: Current Bioinformatics

Classification of Chromosomal DNA Sequences Using Hybrid Deep Learning Architectures | Bentham Science


Abstract

Background: Chromosomal DNA contains most of the genetic information of eukaryotes and plays an important role in the growth, development and reproduction of living organisms. Most chromosomal DNA sequences are known to wrap around histones, and distinguishing these DNA sequences from ordinary DNA sequences is important for understanding the genetic code of life. The main difficulty behind this problem is the feature selection process. DNA sequences have no explicit features, and common representation methods, such as one-hot coding, introduce the major drawback of high dimensionality. Recently, deep learning models have been shown to automatically extract useful features from input patterns.

Objective: We aim to investigate which deep learning networks could achieve notable improvements in the field of DNA sequence classification using only sequence information.

Methods: In this paper, we present four different deep learning architectures using convolutional neural networks and long short-term memory networks for the purpose of chromosomal DNA sequence classification. A natural language model (Word2vec) was used to generate word embeddings of the sequences, and features were learned from these embeddings by deep learning.

Results: The comparison of these four architectures is carried out on 10 chromosomal DNA datasets. The results show that the architecture of convolutional neural networks combined with long short-term memory networks is superior to other methods with regard to the accuracy of chromosomal DNA prediction.

Conclusion: In this study, four deep learning models were compared for automatic classification of chromosomal DNA sequences with no sequence preprocessing steps. In particular, we have regarded DNA sequences as natural language and extracted word embeddings with Word2vec to represent DNA sequences. Results show the superiority of the CNN+LSTM model in the ten classification tasks. The reason for this success is that the CNN module captures the regulatory motifs, while the following LSTM layer captures the long-term dependencies between them.
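The abstract describes treating DNA sequences as natural language before learning word embeddings, but it does not say how a sequence is split into "words". A common choice, assumed here purely for illustration, is overlapping k-mers. The sketch below uses only the Python standard library; the function names `kmer_tokens`, `build_vocab` and `encode`, and the settings `k=3` and `stride=1`, are hypothetical and not taken from the paper.

```python
from itertools import product


def kmer_tokens(seq, k=3, stride=1):
    """Split a DNA sequence into overlapping k-mer 'words' (assumed tokenization)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(k=3, alphabet="ACGT"):
    """Map every possible k-mer over the alphabet to an integer index (4**k entries)."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}


def encode(seq, vocab, k=3):
    """Convert a sequence to vocabulary indices, skipping k-mers that contain
    characters outside the alphabet (for example the ambiguous base N)."""
    return [vocab[t] for t in kmer_tokens(seq, k) if t in vocab]


vocab = build_vocab(k=3)
tokens = kmer_tokens("ACGTAC", k=3)   # ['ACG', 'CGT', 'GTA', 'TAC']
ids = encode("ACGTAC", vocab, k=3)    # integer ids for an embedding lookup
```

Token lists of this kind are the sort of "sentences" a Word2vec trainer consumes. The resulting dense vectors, one per k-mer, could then be passed to a convolutional layer followed by an LSTM layer, in the spirit of the CNN+LSTM model the abstract reports as most accurate.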