Journal: Genome Biology and Evolution

Experimental Analysis of Sources of Error in Evolutionary Studies Based on Roche/454 Pyrosequencing of Viral Genomes

Abstract

Factors affecting the reliability of Roche/454 pyrosequencing for analyzing sequence polymorphism in within-host viral populations were assessed by two experiments: 1) sequencing four clonal simian immunodeficiency virus (SIV) stocks and 2) sequencing mixtures, in different proportions, of two SIV strains with known fixed nucleotide differences. Observed nucleotide diversity and the frequency of undetermined nucleotides were increased at sites in homopolymer runs of four or more identical nucleotides, particularly at AT sites. However, in the mixed-strain experiments, the effects of such errors on estimated nucleotide diversity were small in comparison to known strain differences. The results suggest that biologically meaningful variants present at a frequency of around 10%, and possibly much lower, are easily distinguished from artifacts of the sequencing process. Analysis of the clonal stocks revealed numerous rare variants that showed the signature of purifying selection; eliminating variants at frequencies of less than 1% reduced estimates of nucleotide diversity by about an order of magnitude. Thus, using a 1% frequency cutoff for accepting a variant as real represents a conservative standard, which may be useful in studies focused on the discovery of specific mutations (such as those conferring immune escape or drug resistance). On the other hand, if the goal is to estimate nucleotide diversity, an optimal strategy might be to include all observed variants (even those at less than 1% frequency) while masking out homopolymer runs of four or more nucleotides.
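The two filtering strategies contrasted in the abstract (a 1% frequency cutoff versus masking homopolymer runs of four or more nucleotides) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the per-site dictionaries of base counts, and the use of the standard pairwise-difference estimator for per-site nucleotide diversity are all assumptions made for illustration, and the reference is assumed to be a plain uppercase nucleotide string.

```python
import re


def homopolymer_mask(reference, min_run=4):
    """Return one boolean per site: True if the site lies inside a run of
    min_run or more identical nucleotides (hypothetical helper)."""
    mask = [False] * len(reference)
    # (.)\1{3,} matches a base followed by 3+ copies of itself, i.e. a run of 4+.
    pattern = r"(.)\1{%d,}" % (min_run - 1)
    for match in re.finditer(pattern, reference):
        for i in range(match.start(), match.end()):
            mask[i] = True
    return mask


def site_diversity(counts, min_freq=0.0):
    """Per-site nucleotide diversity from base counts, computed as the
    fraction of read pairs that differ. Bases whose frequency is below
    min_freq are discarded first (assumed way of applying the cutoff)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    kept = [c for c in counts.values() if c / total >= min_freq]
    n = sum(kept)
    if n < 2:
        return 0.0
    identical = sum(c * c for c in kept)
    return (n * n - identical) / (n * (n - 1))


def mean_diversity(reference, site_counts, min_freq=0.0,
                   mask_homopolymers=True, min_run=4):
    """Average per-site diversity over all sites that are not masked."""
    if mask_homopolymers:
        mask = homopolymer_mask(reference, min_run)
    else:
        mask = [False] * len(reference)
    values = [site_diversity(c, min_freq)
              for c, masked in zip(site_counts, mask) if not masked]
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    # Toy data: nine sites, with an AAAA run at positions 4 to 7.
    reference = "ACGTAAAAC"
    site_counts = [{"A": 100} for _ in reference]
    site_counts[5] = {"A": 50, "T": 50}   # noisy call inside the homopolymer
    site_counts[1] = {"C": 999, "T": 1}   # rare variant at 0.1% frequency

    # Strategy for mutation discovery: 1% cutoff, no masking.
    print(mean_diversity(reference, site_counts, min_freq=0.01,
                         mask_homopolymers=False))
    # Strategy for diversity estimation: keep rare variants, mask homopolymers.
    print(mean_diversity(reference, site_counts, min_freq=0.0,
                         mask_homopolymers=True))
```

In this toy setup the cutoff removes the 0.1% variant but leaves the homopolymer noise in place, while masking does the reverse, which mirrors the trade-off the abstract describes.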
