IEEE International Conference on Bioinformatics and Bioengineering

ENSPART: An Ensemble Framework Based on Data Partitioning for DNA Motif Analysis

Abstract

This paper proposes an ensemble approach based on data partitioning for large-scale DNA motif analysis. Motif prediction on genome-scale datasets is challenging because of its high time and space complexity. Existing ensemble approaches, although shown to improve performance, are applicable only to small datasets. Our approach, called ENSPART, first partitions the input dataset into non-overlapping subsets, which serve as input to multiple distinct motif prediction tools. It is assumed that the core motifs of a transcription factor protein exist in all data subsets. We employ seven motif prediction tools to obtain initial candidate motifs, which are then merged according to their sequence content similarity. An alignment-free method is used to establish motif similarity. A novel motif-merging method is proposed to merge similar motifs obtained by tools in different data partitions. Ten genome-wide ChIP datasets are collected for evaluation. We compare our approach with MEME-ChIP and obtain improved results on nine of the ten datasets in terms of Area Under Curve (AUC). For most datasets, the AUC improves by between 5 and 10%. Our approach shows the promise of data-partitioning-based ensemble approaches for large-scale motif prediction.
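
The partitioning step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how sequences are assigned to subsets or how many subsets are used, so the random shuffle, the round-robin assignment, and the names `partition_sequences`, `n_parts`, and `seed` are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import List, Sequence


def partition_sequences(
    sequences: Sequence[str], n_parts: int, seed: int = 0
) -> List[List[str]]:
    """Split sequences into n_parts non-overlapping subsets.

    Assumption: random assignment with near-equal subset sizes.
    The abstract only states that the subsets do not overlap.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    shuffled = list(sequences)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    # Round-robin slicing places each sequence in exactly one subset.
    return [shuffled[i::n_parts] for i in range(n_parts)]


if __name__ == "__main__":
    # Hypothetical toy input; real input would be ChIP peak sequences.
    toy = ["ACGTAC", "TTGACA", "GGCATG", "CATGCA", "TATAAT", "ACGTTT", "GGGCGG"]
    for index, subset in enumerate(partition_sequences(toy, n_parts=3)):
        print(index, subset)
```

Each subset would then be passed to every motif prediction tool, so that candidate motifs recovered in several partitions can be compared and merged afterwards.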
