Proceedings of the Second conference on Asia-Pacific bioinformatics

Identifying character non-independence in phylogenetic data using data mining techniques

Abstract

Undiscovered relationships in a data set may confound analyses, particularly those that assume data independence. Such problems occur when characters used for phylogenetic analyses are not independent of one another. A main assumption of phylogenetic inference methods such as maximum likelihood and parsimony is that each character serves as an independent hypothesis of evolution. When this assumption is violated, the resulting phylogeny may not reflect true evolutionary history. Therefore, it is imperative that character non-independence be identified prior to phylogenetic analyses. To identify dependencies between phylogenetic characters, we applied three data mining techniques: 1) Bayesian networks, 2) decision tree induction, and 3) rule induction from coverings. We briefly discuss the main ideas behind each strategy, show how each technique performs on a small sample data set, and apply each method to an existing phylogenetic data set. We discuss the interestingness of the results of each method, and show that, although each method has its own strengths and weaknesses, rule induction from coverings presents the most useful solution for determining dependencies among phylogenetic data at this time.
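As a rough illustration of the covering idea mentioned in the abstract, the sketch below searches a small character matrix for minimal sets of characters whose states fully determine another character. It is a brute-force toy under stated assumptions, not the authors' implementation: the function names `determines` and `minimal_coverings`, the character labels A to D, and the matrix `ROWS` are all hypothetical and invented for this example.

```python
from itertools import combinations


def determines(rows, attrs, target):
    """Return True if taxa that agree on every character in `attrs`
    also agree on `target` (i.e. `target` depends on `attrs`)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        # Remember the first target state seen for this combination;
        # any later disagreement breaks the dependency.
        if seen.setdefault(key, row[target]) != row[target]:
            return False
    return True


def minimal_coverings(rows, target):
    """Enumerate minimal sets of other characters that determine `target`.

    Brute force over subsets in increasing size, so it is only
    suitable for very small matrices.
    """
    others = [c for c in rows[0] if c != target]
    found = []
    for size in range(1, len(others) + 1):
        for attrs in combinations(others, size):
            # Skip supersets of a set already found: they are not minimal.
            if any(set(f) <= set(attrs) for f in found):
                continue
            if determines(rows, attrs, target):
                found.append(attrs)
    return found


# Hypothetical toy matrix: one dict per taxon, binary character states.
# Character C was constructed to mirror A; D has no such dependency.
ROWS = [
    {"A": 0, "B": 0, "C": 0, "D": 1},
    {"A": 0, "B": 1, "C": 0, "D": 0},
    {"A": 1, "B": 0, "C": 1, "D": 0},
    {"A": 1, "B": 1, "C": 1, "D": 1},
    {"A": 1, "B": 1, "C": 1, "D": 0},
]

if __name__ == "__main__":
    for character in ("C", "D"):
        print(character, minimal_coverings(ROWS, character))
```

On a matrix this small, an apparent dependency can easily arise by chance, so a result such as "A determines C" is best read as a candidate for closer inspection rather than as evidence of non-independence.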
