Source: OMICS: A Journal of Integrative Biology (U.S. National Institutes of Health literature)

Towards Development of Clustering Applications for Large-Scale Comparative Genotyping and Kinship Analysis Using Y-Short Tandem Repeats



Abstract

Y-chromosome short tandem repeats (Y-STRs) are genetic markers with practical applications in human identification. However, where mass identification is required (e.g., in the aftermath of disasters with significant fatalities), the efficiency of the process could be improved with new statistical approaches. Clustering applications are relatively new tools for large-scale comparative genotyping, and the k-Approximate Modal Haplotype (k-AMH), an efficient algorithm for clustering large-scale Y-STR data, represents a promising method for developing these tools. In this study we improved the k-AMH and produced three new algorithms: the Nk-AMH I (including a new initial cluster center selection), the Nk-AMH II (including a new dominant weighting value), and the Nk-AMH III (combining I and II). The Nk-AMH III was the superior algorithm, with mean clustering accuracy that increased in four out of six datasets and remained at 100% in the other two. Additionally, the Nk-AMH III achieved a 2% higher overall mean clustering accuracy score than the k-AMH, as well as optimal accuracy for all datasets (0.84–1.00). With inclusion of the two new methods, the Nk-AMH III produced an optimal solution for clustering Y-STR data; thus, the algorithm has potential for further development towards fully automatic clustering of any large-scale genotypic data.
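The abstract does not spell out the k-AMH or Nk-AMH procedures, so the following is only a minimal illustrative sketch of the general idea behind clustering Y-STR haplotypes and scoring the result. It is not the authors' algorithm. It assumes haplotypes are tuples of repeat counts, uses a simple mismatch count as the dissimilarity, assigns each haplotype to its nearest fixed center, and measures accuracy as the share of objects that fall in their cluster's majority class. The marker values, labels, centers, and function names are all hypothetical.

```python
from collections import Counter


def mismatch_distance(a, b):
    """Count the markers at which two haplotypes differ (simple matching)."""
    return sum(x != y for x, y in zip(a, b))


def assign_to_nearest(haplotypes, centers):
    """Assign each haplotype to the index of its closest center."""
    return [
        min(range(len(centers)), key=lambda k: mismatch_distance(h, centers[k]))
        for h in haplotypes
    ]


def clustering_accuracy(assignments, true_labels):
    """Fraction of objects that fall in their cluster's majority class.

    This purity-style score is an assumption made for illustration; the
    source does not define how its clustering accuracy is computed.
    """
    by_cluster = {}
    for cluster, label in zip(assignments, true_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(true_labels)


# Toy haplotypes: repeat counts at four hypothetical Y-STR markers.
haplotypes = [
    (14, 13, 29, 24),
    (14, 13, 29, 25),
    (14, 13, 30, 24),
    (15, 12, 28, 22),
    (15, 12, 28, 23),
    (14, 12, 28, 22),
]
true_labels = ["A", "A", "A", "B", "B", "B"]  # hypothetical group labels
centers = [(14, 13, 29, 24), (15, 12, 28, 22)]  # hypothetical fixed centers

assignments = assign_to_nearest(haplotypes, centers)
accuracy = clustering_accuracy(assignments, true_labels)
print(assignments, accuracy)
```

A full method in the spirit of the abstract would also need a rule for choosing the initial centers and a weighting scheme, which are the two components the Nk-AMH I and II variants are said to revise; neither is modeled here.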
