Proximity based association rules for spatial data mining in genomes.

Abstract

Our knowledge discovery algorithm employs a combination of association rule mining and graph mining to identify frequent spatial proximity relationships in genomic data where the data is viewed as a one-dimensional space. We apply mining techniques and metrics from association rule mining to identify frequently co-occurring features in genomes followed by graph mining to extract sets of co-occurring features.

Using a case study of ab initio repeat finding, we have shown that our algorithm, ProxMiner, can be successfully applied to identify weakly conserved patterns among features in genomic data. The application of pairwise spatial relationships increases the sensitivity of our algorithm while the use of a confidence threshold based on false discovery rate reduces the noise in our results. Unlike available defragmentation algorithms, ProxMiner discovers associations among ab initio repeat families to identify larger, more complete repeat families. ProxMiner will increase the effectiveness of repeat discovery techniques for newly sequenced genomes where ab initio repeat finders are only able to identify partial repeat families.

In this dissertation, we provide two detailed examples of ProxMiner-discovered novel repeat families and one example of a known rice repeat family that has been extended by ProxMiner. These examples encompass some of the different types of repeat families that can be discovered by our algorithm. We have also discovered many other potentially interesting novel repeat families that can be further studied by biologists.

Keywords: association rule mining, spatial rules, repeat, defragmentation, graph mining, novel repeat regions, DNA
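The two stages named in the abstract (pairwise proximity rules scored with association-rule metrics, then graph mining to group co-occurring features) can be illustrated with a minimal Python sketch. This is not ProxMiner's implementation, which the abstract does not detail. The function names, the `(family, start, end)` feature tuples, the `max_gap` window, the fixed `min_confidence` cutoff (standing in for the false-discovery-rate-based threshold), and the use of connected components as the graph-mining step are all assumptions made for illustration.

```python
from collections import defaultdict


def proximity_rules(features, max_gap, min_support, min_confidence):
    """Hypothetical helper: mine rules a -> b meaning "family b starts
    within max_gap bases downstream of an occurrence of family a".

    features: list of (family, start, end) on one sequence (1-D space).
    Returns {(a, b): confidence}.
    """
    feats = sorted(features, key=lambda f: f[1])  # order by start coordinate

    family_count = defaultdict(int)
    for fam, _, _ in feats:
        family_count[fam] += 1

    pair_count = defaultdict(int)
    for i, (fam_a, _, end_a) in enumerate(feats):
        seen = set()  # count each neighbouring family once per occurrence of a
        for fam_b, start_b, _ in feats[i + 1:]:
            if start_b - end_a > max_gap:
                break  # starts are sorted, so later features are farther away
            if fam_b != fam_a and fam_b not in seen:
                seen.add(fam_b)
                pair_count[(fam_a, fam_b)] += 1

    rules = {}
    for (a, b), n in pair_count.items():
        confidence = n / family_count[a]
        # Assumption: a fixed cutoff replaces the FDR-derived threshold.
        if n >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def group_families(rules):
    """Assumed graph-mining step: connected components of the rule graph."""
    adjacency = defaultdict(set)
    for a, b in rules:
        adjacency[a].add(b)
        adjacency[b].add(a)

    groups, visited = [], set()
    for node in adjacency:
        if node in visited:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        visited |= component
        groups.append(component)
    return groups


# Made-up coordinates: R1 is followed closely by R2 in two of its three copies.
features = [
    ("R1", 0, 100), ("R2", 120, 200),
    ("R1", 1000, 1100), ("R2", 1110, 1200),
    ("R3", 5000, 5100), ("R1", 9000, 9100),
]
rules = proximity_rules(features, max_gap=50, min_support=2, min_confidence=0.5)
groups = group_families(rules)
```

In this toy input, the only rule that passes both cutoffs is R1 -> R2, so the two families end up in one group, which mirrors the idea of merging partial ab initio repeat families into a larger one.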

Bibliographic record

  • Author: Saha, Surya
  • Author affiliation: Mississippi State University
  • Degree-granting institution: Mississippi State University
  • Subject: Biology, Bioinformatics; Artificial Intelligence
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 153 p.
  • Total pages: 153
  • Original format: PDF
  • Language of text: eng
  • Date added to database: 2022-08-17 11:38:19