Nucleic Acids Research

A new systematic computational approach to predicting target genes of transcription factors


Abstract

Identifying transcription factor target genes (TFTGs) is a vital step towards understanding the regulatory mechanisms of gene expression. Methods for the de novo identification of TFTGs are generally based on screening for novel DNA binding sites. However, experimental screening for new binding sites is technically challenging, laborious and time-consuming, while computational methods still lack accuracy. We propose a novel systematic computational approach for predicting TFTGs directly on a genome scale. Utilizing gene co-expression data, we modeled the prediction problem as a 'yes' or 'no' classification task by converting biological sequences into novel reverse-complementary position-sensitive n-gram profiles, and implemented the classifiers with support vector machines. Our approach does not need to predict new DNA binding sites, a task that other studies have shown to be difficult and inaccurate. We applied the proposed approach to predict auxin-response factor target genes from published Arabidopsis thaliana co-expression data and obtained satisfactory results. Under ten-fold cross-validation, the area under the receiver operating characteristic curve (AUC) reaches approximately 0.73.
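The abstract does not spell out how a reverse-complementary position-sensitive n-gram profile is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each n-gram is merged with its reverse complement under one key, and that position sensitivity is obtained by counting n-grams separately in equal-width bins along the sequence. The function names, the default `n=3`, and the default `n_bins=4` are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def canonical(ngram):
    """Map an n-gram and its reverse complement to one shared key
    (assumption: the lexicographically smaller of the two)."""
    return min(ngram, reverse_complement(ngram))


def rc_position_ngram_profile(seq, n=3, n_bins=4):
    """Count canonical n-grams per positional bin.

    Assumption: 'position-sensitive' is modelled by splitting the
    sequence into n_bins equal-width regions; the paper may differ.
    """
    seq = seq.upper()
    n_windows = len(seq) - n + 1
    counts = Counter()
    for i in range(max(n_windows, 0)):
        ngram = seq[i:i + n]
        if set(ngram) <= set("ACGT"):  # skip windows with N or other symbols
            bin_index = i * n_bins // n_windows
            counts[(bin_index, canonical(ngram))] += 1
    return counts


def feature_keys(n=3, n_bins=4):
    """Fixed, ordered list of (bin, canonical n-gram) feature names."""
    grams = sorted({canonical("".join(p)) for p in product("ACGT", repeat=n)})
    return [(b, g) for b in range(n_bins) for g in grams]


def to_vector(seq, n=3, n_bins=4):
    """Turn a sequence into a fixed-length count vector."""
    profile = rc_position_ngram_profile(seq, n, n_bins)
    return [profile.get(key, 0) for key in feature_keys(n, n_bins)]
```

Vectors produced this way have a fixed length for a given `n` and `n_bins`, so they could be passed to any support vector machine implementation as the 'yes' or 'no' classifier input, with labels derived from the co-expression data.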
