A statistical framework for detecting mislabeled and contaminated samples using shallow-depth sequence data

Abstract

Background: Researchers typically sequence a given individual multiple times, either re-sequencing the same DNA sample (technical replication) or sequencing different DNA samples collected from the same individual (biological replication), or both. Before merging the data from these replicate sequence runs, it is important to verify that no errors, such as DNA contamination or mix-ups, occurred during the data collection pipeline. Methods to detect such errors exist, but they are often ad hoc and cannot handle missing data, and several require phased data. Because they require some combination of genotype calling, imputation, and haplotype phasing, these methods are unsuitable for error detection in low- to moderate-depth sequence data, where such tasks are difficult to perform accurately. Additionally, because most existing methods employ a pairwise-comparison approach for error detection rather than joint analysis of the putative replicates, results may be difficult to interpret.

Bibliographic details

Journal: BMC Bioinformatics
Authors: Ariel W. Chan; Amy L. Williams; Jean-Luc Jannink
Year (volume), issue: 2018 (19), -1
Year: 2018
Page: 478
Total pages: 14
Original format: PDF
Chinese Library Classification: Applied microbiology; Biochemical genetics; Biochemical pharmacology
Keywords: Error detection; Biological replication; Technical replication; Shallow-depth sequence data; Mislabeled samples

Similar literature

1. A statistical framework for detecting mislabeled and contaminated samples using shallow-depth sequence data [J] . Ariel W. Chan, Amy L. Williams, Jean-Luc Jannink. BMC Bioinformatics . 2018, No. 1
2. MBV: a method to solve sample mislabeling and detect technical bias in large combined genotype and sequencing assay datasets [J] . Bioinformatics . 2017, No. 12
3. A flexible likelihood framework for detecting associations with secondary phenotypes in genetic studies using selected samples: Application to sequence data [J] . Liu D.J., Leal S.M. European Journal of Human Genetics: EJHG . 2012, No. 4
4. Statistical Methods for Detecting Latent Periodicity in Biological Sequences: Solving a Problem of Small-Size Samples [C] . Chaley M., Nazipova N., Teplukhina E., Bioinformatics and Biomedicine, 2009. BIBM '09 . 2009
5. MISSING VALUES IN STATISTICAL ANALYSIS. (MODIFIED SAMPLING DISTRIBUTIONS, APPROXIMATE STATISTICAL ANALYSIS OF EXPERIMENTAL DATA AND ESTIMATION OF POPULATION PARAMETERS FROM FRAGMENTARY SAMPLES) [D] . MATHAI, MATHAI ARAKAPARAMPIL. 1964
6. MBV: a method to solve sample mislabeling and detect technical bias in large combined genotype and sequencing assay datasets [O] . Alexandre Fort, Nikolaos I Panousis, Marco Garieri, -1
7. reGenotyper: Detecting mislabeled samples in genetic data [O] . Zych, Konrad, Snoek, Basten L., Elvin, Mark, 2017
