Source: Molecular Therapy. Nucleic Acids

Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture



Abstract

The study of transcriptional regulation remains difficult yet fundamental to molecular biology research. Recent research has shown that the double-helix structure of DNA plays an important role in improving the accuracy and interpretability of transcription factor binding site (TFBS) prediction. Although several computational methods have been designed to consider DNA sequence and DNA shape features simultaneously, designing an efficient model remains an open problem. In this paper, we propose a hybrid convolutional recurrent neural network (CNN/RNN) architecture, CRPTS, that predicts TFBSs by combining DNA sequence and DNA shape features. The novelty of our method rests on three aspects: (1) a shared hybrid CNN and RNN efficiently extracts features from large-scale genomic sequences obtained by high-throughput technology; (2) common patterns are found in DNA sequences and their corresponding DNA shape features; (3) CRPTS can capture local structural information of DNA sequences without relying completely on DNA shape data. A series of comprehensive experiments on 66 in vitro datasets derived from universal protein binding microarrays (uPBMs) shows that CRPTS clearly outperforms state-of-the-art methods.
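To make the idea of a "shared" feature extractor more concrete, the following is a minimal sketch in plain Python, not the CRPTS implementation. It one-hot encodes a DNA sequence and applies the same convolution kernels to both the sequence encoding and a per-position shape matrix, then max-pools each activation map. The function names, the kernel sizes, the choice of four shape features per position, and the random placeholder shape values are all assumptions made for illustration; the recurrent layer and training are omitted entirely.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_relu(rows, kernel):
    """Slide one kernel (width x channels) along the rows and apply ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(rows) - width + 1):
        total = 0.0
        for k in range(width):
            for c in range(len(kernel[k])):
                total += rows[start + k][c] * kernel[k][c]
        out.append(max(0.0, total))
    return out

def shared_features(seq_rows, shape_rows, kernels):
    """Apply the SAME kernels to both inputs, then max-pool each activation map."""
    feats = []
    for rows in (seq_rows, shape_rows):
        for kernel in kernels:
            feats.append(max(conv1d_relu(rows, kernel)))
    return feats

rng = random.Random(0)
seq = "ACGTTGCAAC"
seq_rows = one_hot(seq)
# Placeholder values standing in for 4 hypothetical shape features per position.
shape_rows = [[rng.random() for _ in range(4)] for _ in seq]
# Two hypothetical kernels of width 3 over 4 channels, with random (untrained) weights.
kernels = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)] for _ in range(2)]
features = shared_features(seq_rows, shape_rows, kernels)
print(len(features))  # 2 inputs x 2 kernels = 4 pooled features
```

Sharing kernels this way requires both inputs to have the same number of channels, which is why the sketch assumes four shape features to match the four nucleotide channels. In a full model the pooled features would presumably feed a recurrent layer and an output layer, which this sketch does not attempt to reproduce.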
