Journal: Bioinformatics

Estimating abundances of retroviral insertion sites from DNA fragment length data


Abstract

Motivation: The relative abundance of retroviral insertions in a host genome is important in understanding the persistence and pathogenesis of both natural retroviral infections and retroviral gene therapy vectors. It could be estimated from a sample of cells if the host genomic sites of retroviral insertions could be counted directly. When host genomic DNA is randomly broken via sonication and then amplified, amplicons of varying lengths are produced. The number of unique amplicon lengths observed for an insertion site tends to increase with its abundance, providing a basis for estimating relative abundance. However, as abundance increases, amplicons of the same length arise by chance, leading to a nonlinear relation between the number of unique lengths and relative abundance. The difficulty in calibrating this relation is compounded by sample-specific variations in the relative frequencies of clones of each length.

Results: A likelihood function is proposed for the discrete lengths observed in each of a collection of insertion sites and is maximized with a hybrid expectation-maximization algorithm. Patient data illustrate the method, and simulations show that relative abundance can be estimated with little bias, but that variation in highly abundant sites can be large. In replicated patient samples, variation exceeds what the model implies, requiring adjustment as in Efron (2004) or the use of jackknife standard errors. Consequently, it is advantageous to collect replicate samples to strengthen inferences about relative abundance.
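The nonlinear relation described in the Motivation can be illustrated with a small simulation. This is only a sketch under a simplifying assumption that is not taken from the paper: every fragment is equally likely to have any of `n_lengths` possible lengths. The abstract notes that the real length frequencies vary from sample to sample, and the paper's likelihood and expectation-maximization procedure is not reproduced here. The function names and the value `n_lengths = 200` are hypothetical.

```python
import random


def expected_unique_lengths(n_cells, n_lengths):
    """Expected number of distinct fragment lengths for one insertion site.

    Assumes (for illustration only) that each of the n_cells fragments takes
    one of n_lengths possible lengths with equal probability.
    """
    return n_lengths * (1.0 - (1.0 - 1.0 / n_lengths) ** n_cells)


def simulate_unique_lengths(n_cells, n_lengths, rng):
    """Draw n_cells fragment lengths uniformly and count the distinct ones."""
    return len({rng.randrange(n_lengths) for _ in range(n_cells)})


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    n_lengths = 200         # hypothetical number of resolvable lengths
    for n_cells in (1, 10, 100, 1000):
        expected = expected_unique_lengths(n_cells, n_lengths)
        observed = simulate_unique_lengths(n_cells, n_lengths, rng)
        print(f"cells={n_cells:5d}  expected unique={expected:7.2f}  simulated={observed}")
```

At low abundance the count of unique lengths tracks the number of cells closely, while at high abundance it levels off toward `n_lengths`, so a raw count of unique lengths understates the abundance of large clones unless the relation is modeled.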
