Integrating sequence and gene expression information predicts genome-wide DNA-binding proteins and suggests a cooperative mechanism

Abstract

DNA-binding proteins (DBPs) perform diverse biological functions ranging from transcription to pathogen sensing. Machine learning methods can not only identify DBPs de novo but also provide insights into their DNA-recognition dynamics. However, it remains unclear whether available methods that can accurately predict DNA-binding sites in known DBPs can also identify novel DBPs. Moreover, sequence information is blind to the cellular- and disease-specific contexts of DBP activities, whereas the under-utilized knowledge from public gene expression data offers great promise. To address these issues, we have developed novel methods for predicting DBPs by integrating sequence and gene expression-derived features and applied them to explore human, mouse and Arabidopsis proteomes. While our sequence-based models outperformed the gene expression-based ones, some proteins with weaker DBP-like sequence features were correctly predicted by gene expression-based features, suggesting that these proteins acquire a tangible DBP functionality in a conducive gene expression environment. Analysis of motif enrichment among the co-expressed genes of the top 100 candidate DBPs from hitherto unannotated genes provides further avenues to explore their functional associations.

Bibliographic details

  • Source
    Nucleic Acids Research | 2018, Issue 1 | 17 pages
  • Author affiliations

    Jawaharlal Nehru Univ Sch Computat & Integrat Sci New Delhi 110067 India;

    Natl Inst Biomed Innovat Hlth & Nutr Lab Bioinformat 7-6-8 Saito Asagi Osaka 5670085 Japan

  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Biochemistry
