Nucleic Acids Research

PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION


Abstract

Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
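The phrase "adaptively combining weak classifiers" suggests a boosting scheme. The sketch below is only an illustration of that general idea, not the PreCisIon implementation: it assumes AdaBoost with one-feature decision stumps, and the two features (a hypothetical motif-match score and a hypothetical gene-position score), the function names and the toy data are all invented for the example.

```python
import math

def train_stump(X, y, w):
    """Find the one-feature threshold rule with the lowest weighted error.

    Returns (feature_index, threshold, polarity, weighted_error).
    """
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[f] >= thr else -pol) != yi)
                if best is None or err < best[3]:
                    best = (f, thr, pol, err)
    return best

def stump_predict(stump, row):
    f, thr, pol, _ = stump
    return pol if row[f] >= thr else -pol

def adaboost(X, y, rounds=5):
    """Combine weak stumps into a weighted ensemble (assumed AdaBoost scheme)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[3], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified genes, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Return +1 (predicted target) or -1 (predicted non-target)."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

if __name__ == "__main__":
    # Hypothetical toy data: [motif_score, position_score] per gene.
    X = [[0.9, 0.2], [0.8, 0.9], [0.2, 0.9], [0.3, 0.1], [0.1, 0.2], [0.4, 0.3]]
    y = [1, 1, 1, -1, -1, -1]  # +1 = known target, -1 = non-target
    model = adaboost(X, y, rounds=5)
    for alpha, (f, thr, pol, _) in model:
        print(f"feature={f} threshold={thr} polarity={pol} weight={alpha:.2f}")
```

Each stump looks at a single feature, so the ensemble weights indicate how much the sequence-based and position-based rules each contribute to the final decision under these assumptions.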
