Population clustering based on copy number variations detected from next generation sequencing data

Junbo Duan; Ji-Gang Zhang; Mingxi Wan; Hong-Wen Deng; Yu-Ping Wang

Abstract

Copy number variations (CNVs) can serve as significant biomarkers, and next generation sequencing (NGS) provides high-resolution detection of these CNVs. However, how to extract features from CNVs and apply them to genomic studies such as population clustering has become a major challenge. In this paper, we propose a novel method for population clustering based on CNVs detected from NGS data. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into a source matrix and a weight matrix using non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs in each sample. Therefore, applying NMF to CNVs makes it possible to differentiate samples from different ethnic groups, i.e., to perform population clustering. To validate the approach, we applied it to both simulated data and two real data sets from the 1000 Genomes Project. The results on simulated data demonstrate that the proposed method can recover the true common CNVs with high quality. The results of the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results of the second show that the method can be applied to the whole genome with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering.
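
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which NMF algorithm, feature encoding, or clustering rule the paper uses. The sketch assumes Lee-Seung multiplicative updates for the Frobenius loss, a toy feature matrix with genomic bins as rows and samples as columns, and assignment of each sample to the component with the largest weight. The names `nmf`, `cluster_samples`, and `make_toy_data` are hypothetical.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (bins x samples) as W @ H.

    W (bins x k) plays the role of the source matrix and H (k x samples)
    the weight matrix. Uses multiplicative updates (an assumption, not
    necessarily the algorithm used in the paper).
    """
    rng = np.random.default_rng(seed)
    n_bins, n_samples = V.shape
    W = rng.random((n_bins, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative because V, W, H >= 0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_samples(H):
    """Assign each sample (column of H) to its highest-weight component."""
    return np.argmax(H, axis=0)

def make_toy_data(seed=1):
    """Build a synthetic feature matrix: two groups of three samples,
    each group sharing one CNV region, plus small non-negative noise."""
    rng = np.random.default_rng(seed)
    n_bins = 40
    pattern_a = np.zeros(n_bins)
    pattern_a[5:12] = 1.0    # common CNV region of the first group
    pattern_b = np.zeros(n_bins)
    pattern_b[25:33] = 1.0   # common CNV region of the second group
    columns = []
    for pattern in (pattern_a, pattern_b):
        for _ in range(3):
            noise = 0.05 * rng.random(n_bins)
            columns.append(pattern * rng.uniform(0.8, 1.2) + noise)
    return np.column_stack(columns)

V = make_toy_data()
W, H = nmf(V, k=2)
labels = cluster_samples(H)
print(labels)
```

In this toy setting the columns of `W` approximate the two shared CNV patterns, and `labels` groups the first three samples apart from the last three. Real data would need a CNV detection step to build `V` and a principled choice of `k`, neither of which is covered here.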

Bibliographic details

Source
Journal of Bioinformatics and Computational Biology, 2014, issue 4, 18 pages
Authors
Junbo Duan; Ji-Gang Zhang; Mingxi Wan; Hong-Wen Deng; Yu-Ping Wang
Original format: PDF
Language: English
Chinese Library Classification: Cell biology
Keywords
Next generation sequencing; copy number variations; non-negative matrix factorization; 1000 Genomes Project

Similar documents

1. Population clustering based on copy number variations detected from next generation sequencing data [J]. Junbo Duan, Ji-Gang Zhang, Mingxi Wan. Journal of Bioinformatics and Computational Biology. 2014, issue 4
2. Data-driven approach to detect common copy-number variations and frequency profiles in a population-based Korean cohort [J]. Moon S, Kim YJ, Hong CB. European journal of human genetics: EJHG. 2011, issue 11
3. Next-generation sequencing verified by multiplex ligation-dependent probe amplification to detect a new copy number variations in a child with heterozygous familial hypercholesterolemia [J]. Hui Yan, Jian-Hui Qiu, Yi-Nan Ma. Chinese Medical Journal (English edition). 2021, issue 7
4. Detection of common copy number variation with application to population clustering from next generation sequencing data [C]. Junbo Duan, Ji-Gang Zhang, Hong-Wen Deng. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2012
5. Detect copy number variations from read-depth of high-throughput sequencing data [D]. Wang, Weibo. 2015
6. Population clustering based on copy number variations detected from next generation sequencing data [O]. Junbo Duan, Ji-Gang Zhang, Mingxi Wan
7. CNV-CH: A Convex Hull Based Segmentation Approach to Detect Copy Number Variations (CNV) Using Next-Generation Sequencing Data [O]. Rituparna Sinha, Sandip Samaddar, Rajat K De. 2015
