Marine Mammal Science

Diagnosability of mtDNA with Random Forests: Using sequence data to delimit subspecies


Abstract

We examine the use of an ensemble method, Random Forests, to delimit subspecies using mitochondrial DNA (mtDNA) sequences. Diagnosability, a measure of the ability to correctly determine the taxon of a specimen of unknown origin, has historically been used to delimit subspecies, but few studies have explored how to estimate it from DNA sequences. Using simulated and empirical data sets, we demonstrate that Random Forests produces classification models that perform well for diagnosing subspecies and species. Populations with strong social structure and relatively low abundances (e.g., killer whales, Orcinus orca) were found to be as diagnosable as species. Conversely, comparisons involving abundant subspecies (e.g., spinner and spotted dolphins, Stenella longirostris and S. attenuata) are only as diagnosable as many population comparisons. Estimates of diagnosability reported in subspecies and species descriptions should include confidence intervals, which are influenced by the sample sizes of the training data. We also stress the importance of reporting the certainty with which individuals in the training data are classified in order to communicate the strength of the classification model and diagnosability estimate. Guidance as to ideal minimum diagnosability thresholds for subspecies will improve with more comprehensive analyses; however, values in the range of 80%-90% are considered appropriate.
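The point about confidence intervals and training sample size can be illustrated with a small sketch. The abstract does not say how the authors compute their intervals, so the Wilson score interval used here is an assumption chosen for illustration, and the specimen counts are hypothetical. Diagnosability is treated simply as the proportion of specimens assigned to the correct taxon.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: diagnosability is modeled as a binomial proportion,
    i.e. correctly classified specimens out of all classified specimens.
    This is an illustrative choice, not the method of the paper.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical counts: both cases have 90% of specimens classified
# correctly, but the sample sizes differ by a factor of ten.
for correct, total in [(18, 20), (180, 200)]:
    low, high = wilson_interval(correct, total)
    print(f"n={total}: diagnosability={correct / total:.2f}, "
          f"interval=({low:.3f}, {high:.3f})")
```

With the same point estimate, the smaller sample yields a noticeably wider interval, which is why a diagnosability value near a threshold such as 80%-90% is hard to interpret without its interval.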