Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Transfer String Kernel for Cross-Context DNA-Protein Binding Prediction

Abstract

Through sequence-based classification, this paper aims to accurately predict the DNA binding sites of transcription factors (TFs) in an unannotated cellular context. Related methods in the literature fail to make such predictions accurately because they do not account for the shift in the sample distribution of sequence segments between an annotated (source) context and an unannotated (target) context. We therefore propose a method called "Transfer String Kernel" (TSK) that improves transcription factor binding site (TFBS) prediction by transferring knowledge through cross-context sample adaptation. TSK maps sequence segments to a high-dimensional feature space using a discriminative mismatch string kernel framework. In this high-dimensional space, labeled examples from the source context are re-weighted so that the revised sample distribution matches the target context more closely. We have experimentally verified TSK for TFBS identification on 14 different TFs under a cross-organism setting. We find that TSK consistently outperforms state-of-the-art TFBS tools, especially for TFs whose binding sequences are not conserved across contexts. We also demonstrate the generalizability of TSK by showing its state-of-the-art performance on a different set of cross-context tasks for MHC peptide binding prediction.
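The two ingredients named in the abstract, a mismatch string kernel and the re-weighting of source examples toward the target distribution, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the weights are computed, so the code below assumes a simplified, ridge-regularised form of kernel mean matching, followed by clipping and rescaling. The function names (`mismatch_features`, `kernel_matrix`, `source_weights`), the parameters `k`, `m`, and `ridge`, and the toy sequences are all hypothetical.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGT"


def mismatch_features(seq, k=3, m=1):
    """(k, m)-mismatch feature vector of a DNA sequence.

    Entry j counts the length-k windows of `seq` that lie within
    Hamming distance m of the j-th possible k-mer.
    Brute force over all 4**k k-mers, so only suitable for small k.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    phi = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for j, kmer in enumerate(kmers):
            if sum(a != b for a, b in zip(window, kmer)) <= m:
                phi[j] += 1
    return phi


def kernel_matrix(seqs_a, seqs_b, k=3, m=1):
    """Cosine-normalised mismatch kernel between two lists of sequences.

    Assumes every sequence has at least k characters from ALPHABET,
    so that no feature vector is all zeros.
    """
    A = np.array([mismatch_features(s, k, m) for s in seqs_a])
    B = np.array([mismatch_features(s, k, m) for s in seqs_b])
    A /= np.linalg.norm(A, axis=1, keepdims=True)
    B /= np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T


def source_weights(source, target, ridge=0.1, k=3, m=1):
    """Weights for source sequences that pull their kernel mean toward the target's.

    ASSUMPTION: solves the unconstrained, ridge-regularised kernel mean
    matching system (K_ss + ridge * I) beta = kappa, then clips negative
    weights to zero and rescales to mean 1. A full treatment would solve
    a constrained quadratic program instead.
    """
    n_s, n_t = len(source), len(target)
    K_ss = kernel_matrix(source, source, k, m)
    K_st = kernel_matrix(source, target, k, m)
    kappa = (n_s / n_t) * K_st.sum(axis=1)
    beta = np.linalg.solve(K_ss + ridge * np.eye(n_s), kappa)
    beta = np.clip(beta, 0.0, None)
    return beta * n_s / beta.sum()


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    source = ["ACGTACGT", "TTTTTTTT", "ACGTACGA"]
    target = ["ACGTACGT", "ACGTACGG"]
    print(source_weights(source, target))
```

In a pipeline of this kind, the resulting weights would typically be passed as per-sample weights to a kernel classifier trained on the labeled source sequences, so that source examples resembling the target context count for more.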
