Conference: IEEE International Conference on Bioinformatics and Bioengineering

ENSPART: An Ensemble Framework Based on Data Partitioning for DNA Motif Analysis



Abstract

This paper proposes an ensemble approach based on data partitioning for large-scale DNA motif analysis. Motif prediction on genome-scale datasets is challenging because of its high time and space complexity. Existing ensemble approaches, while shown to improve performance, are applicable only to small datasets. Our approach, called ENSPART, first partitions the input dataset into non-overlapping subsets, which serve as input to multiple distinct motif prediction tools. It is assumed that the core motifs of a transcription factor protein exist in all data subsets. We employ seven motif prediction tools to obtain initial candidate motifs, which are then merged according to their sequence content similarity. An alignment-free method is used to establish motif similarity. A novel motif merging method is proposed to merge similar motifs obtained by the tools in different data partitions. Ten genome-wide ChIP datasets are collected for evaluation. We compare our approach with MEME-ChIP and obtain improved results on nine of the ten datasets in terms of Area Under Curve (AUC). Most datasets show an AUC improvement of between 5 and 10%. Our approach shows the promise of data-partitioning-based ensemble approaches for large-scale motif prediction.
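
The pipeline described in the abstract (partition the input, collect candidate motifs, then group similar candidates without aligning them) can be illustrated with a minimal Python sketch. It is not the ENSPART implementation. The abstract does not name the similarity measure, the merging rule, or the motif representation, so everything below is an assumption made for illustration: motifs are treated as consensus strings, similarity is taken to be the cosine of k-mer count profiles, and grouping is a simple greedy pass. The function names `partition`, `kmer_profile`, `similarity`, and `group_similar`, and the parameters `n_parts`, `k`, and `threshold`, are hypothetical.

```python
import math
import random
from collections import Counter


def partition(sequences, n_parts, seed=0):
    """Shuffle the sequences and split them into n_parts non-overlapping subsets."""
    shuffled = list(sequences)
    random.Random(seed).shuffle(shuffled)
    # Striding assigns every sequence to exactly one subset.
    return [shuffled[i::n_parts] for i in range(n_parts)]


def kmer_profile(motif, k=3):
    """Count the overlapping k-mers of a motif consensus string."""
    return Counter(motif[i:i + k] for i in range(len(motif) - k + 1))


def similarity(a, b, k=3):
    """Alignment-free similarity (assumed here): cosine of the k-mer profiles."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    dot = sum(pa[x] * pb[x] for x in pa)
    norm_a = math.sqrt(sum(v * v for v in pa.values()))
    norm_b = math.sqrt(sum(v * v for v in pb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a motif shorter than k has no k-mers
    return dot / (norm_a * norm_b)


def group_similar(motifs, threshold=0.5, k=3):
    """Greedy grouping: a motif joins the first group whose first member is similar enough."""
    groups = []
    for motif in motifs:
        for group in groups:
            if similarity(group[0], motif, k) >= threshold:
                group.append(motif)
                break
        else:
            groups.append([motif])
    return groups


if __name__ == "__main__":
    subsets = partition([f"seq{i}" for i in range(10)], n_parts=3)
    print([len(s) for s in subsets])
    # Candidate motifs as they might come back from tools run on different subsets.
    print(group_similar(["TGACTCA", "TGACTCAT", "GGGCGG"]))
```

In this sketch each group would then be collapsed into one merged motif. How ENSPART performs that merge is not stated in the abstract, so that step is left out.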
