Frontiers in Genetics
Semi-supervised spectral clustering with application to detect population stratification

Abstract

In genetic association studies, unaccounted-for population stratification can cause spurious associations when identifying disease-associated genetic markers. In such situations, prior information is often available on the population identities of some subjects. To leverage this additional information, we propose a semi-supervised clustering approach for detecting population stratification. The approach retains the advantages of spectral clustering while integrating the additional identity information, leading to sharper clustering performance. To demonstrate the utility of our approach, we analyze a whole-genome sequencing dataset from the 1000 Genomes Project, consisting of the genotypes of 607 individuals sampled from three continental groups involving 10 subpopulations. The proposed method is compared against another semi-supervised spectral clustering method and a spectral clustering method, with agreement with the known subpopulation information measured by the Rand index and an adjusted Rand (ARand) index. The numerical results suggest that the proposed method outperforms its competitors in detecting population stratification.
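
The abstract evaluates clusterings with the Rand index and an adjusted Rand index but does not spell out how they are computed. The following is a minimal sketch, assuming the standard pair-counting definitions (with the Hubert-Arabie chance correction for the adjusted index); the function name `rand_indices` and the toy labels are hypothetical and are not taken from the article, which may define its ARand index differently.

```python
from collections import Counter
from math import comb


def rand_indices(labels_true, labels_pred):
    """Return (Rand index, adjusted Rand index) for two labelings of the same items.

    Hypothetical helper: assumes the standard pair-counting definitions,
    not necessarily the exact variant used in the article.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        raise ValueError("need at least two items")

    # Contingency counts: size of each (true, predicted) cell and of each marginal.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    same_both = sum(comb(c, 2) for c in cells.values())  # pairs together in both
    same_true = sum(comb(c, 2) for c in rows.values())   # pairs together in truth
    same_pred = sum(comb(c, 2) for c in cols.values())   # pairs together in prediction

    # Agreements = pairs together in both + pairs apart in both.
    agreements = total_pairs + 2 * same_both - same_true - same_pred
    rand = agreements / total_pairs

    # Chance correction: compare same_both with its expected value
    # under random labelings that keep the same cluster sizes.
    expected = same_true * same_pred / total_pairs
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        adjusted = 1.0  # degenerate case: both partitions are trivial and identical
    else:
        adjusted = (same_both - expected) / (max_index - expected)
    return rand, adjusted


# Toy example: known subpopulation labels versus a clustering result.
known = [0, 0, 0, 1, 1, 1]
found = [0, 0, 1, 1, 2, 2]
ri, ari = rand_indices(known, found)
print(f"Rand = {ri:.4f}, adjusted Rand = {ari:.4f}")
```

Both indices depend only on which pairs of individuals are grouped together, so relabeling the clusters leaves them unchanged. The adjusted index is close to 0 for a chance-level grouping and equals 1 for perfect agreement, which makes it easier to compare across different numbers of clusters than the raw Rand index.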