Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

RNA Search with Decision Trees and Partial Covariance Models


Abstract

The use of partial covariance models to search for RNA family members in genomic sequence databases is explored. The partial models are formed from contiguous subranges of the columns of the overall RNA family multiple alignment. A binary decision-tree framework is presented for choosing the order in which to apply the partial models and the score thresholds on which to base the decisions. The decision trees are chosen to minimize computation time subject to the constraint that all of the training sequences are passed to the full covariance model for final evaluation. Computational intelligence methods are suggested for selecting the decision tree, since the tree can be quite complex and there is no obvious method for building it in these cases. Experimental results from seven RNA families show execution times of 0.066 to 0.268 relative to using the full covariance model alone. Tests on the full sets of known sequences for each family show that at least 95 percent of these sequences are found for two families and 100 percent for the other five. Since the full covariance model is run on all sequences accepted by the partial-model decision tree, the false alarm rate is at least as low as that of the full model alone.
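
The filtering scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the tree layout, the scoring code, or the search method used to select the tree, so every name here (`Node`, `Leaf`, `passes_tree`, `search`, `keeps_all_training`) is hypothetical, and the two toy scorers merely stand in for real partial covariance models. The sketch shows only the control flow: cheap partial-model scores are compared against thresholds at each tree node, and only the windows that reach an accepting leaf are scored by the expensive full model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union

Scorer = Callable[[str], float]


@dataclass
class Leaf:
    accept: bool  # True: pass the window on to the full covariance model


@dataclass
class Node:
    partial_score: Scorer  # hypothetical stand-in for one partial covariance model
    threshold: float       # score threshold used to pick a branch
    below: Union["Node", Leaf]        # branch taken when score < threshold
    at_or_above: Union["Node", Leaf]  # branch taken when score >= threshold


def passes_tree(tree: Union[Node, Leaf], window: str) -> bool:
    """Walk the binary tree; return True if the window reaches an accepting leaf."""
    node = tree
    while isinstance(node, Node):
        if node.partial_score(window) >= node.threshold:
            node = node.at_or_above
        else:
            node = node.below
    return node.accept


def search(windows: List[str], tree: Union[Node, Leaf],
           full_score: Scorer, full_threshold: float) -> Tuple[List[Tuple[str, float]], int]:
    """Run the full model only on windows the tree accepts; also count those calls."""
    hits, full_calls = [], 0
    for w in windows:
        if passes_tree(tree, w):
            full_calls += 1
            s = full_score(w)
            if s >= full_threshold:
                hits.append((w, s))
    return hits, full_calls


def keeps_all_training(tree: Union[Node, Leaf], training: List[str]) -> bool:
    """Constraint from the abstract: every training sequence must reach the full model."""
    return all(passes_tree(tree, s) for s in training)


# Toy scorers (assumptions for illustration only, not covariance models).
def starts_with_g(w: str) -> float:
    return 1.0 if w.startswith("G") else 0.0


def gc_fraction(w: str) -> float:
    return (w.count("G") + w.count("C")) / len(w)


def gc_count(w: str) -> float:
    return float(w.count("G") + w.count("C"))


tree = Node(starts_with_g, 0.5,
            below=Leaf(False),
            at_or_above=Node(gc_fraction, 0.5, below=Leaf(False), at_or_above=Leaf(True)))

windows = ["GGCC", "GAUA", "AGCC", "GCAU"]
hits, full_calls = search(windows, tree, full_score=gc_count, full_threshold=3.0)
print(hits, full_calls)
```

In this toy run the full scorer is called on two of the four windows. A tree-selection procedure of the kind the abstract mentions would compare candidate trees by estimated run time and discard any tree for which `keeps_all_training` returns False.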
