
POLYMORPHIC GENE TYPING AND SOMATIC CHANGE DETECTION USING SEQUENCING DATA


Abstract

A system and method for determining the exact pair of alleles corresponding to polymorphic genes from sequencing data, and for using the polymorphic gene information in formulating an immunogenic composition. Reads that map to the target polymorphic genes in a canonical reference genome sequence, together with reads that map within a defined threshold of the target gene sequence locations, are extracted from the sequencing data set. Additionally, all reads from the sequencing data set are matched against a probe reference set, and those reads that match with a high degree of similarity are extracted. Either one of these two sets of extracted reads, or the union of both, is included in a final extracted set for further analysis. The ethnicity of the individual may be inferred from the available sequencing data and may then serve as a basis for assigning prior probabilities to the allele variants. The extracted reads are aligned to a gene reference set of all known allele variants. The allele variant that maximizes a first posterior probability, or a score derived from that posterior probability, is selected as the first allele variant. A second posterior probability, or a score derived from it, is then calculated, using a weighting factor, for reads that map to one or more other allele variants and to the first allele variant. The allele variant that maximizes the second posterior probability or score is selected as the second allele variant.

A system and method for identifying somatic changes in polymorphic loci using whole-exome sequencing (WES) data. The exact pair of alleles corresponding to the polymorphic gene is determined as described above, using a normal or germline sample from an individual. A tumor or otherwise diseased sample is also obtained from the same individual, and the corresponding WES data is generated. Reads corresponding to the polymorphic gene are extracted as described in the paragraph above and are then aligned to the inferred pair of allele sequences. The alignment of the germline or normal reads and the alignment of the tumor or diseased reads, both against the inferred pair of alleles, are used together as inputs to somatic change detection algorithms to identify somatic changes with greater precision and sensitivity.
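The two-pass allele selection described in the abstract can be illustrated with a minimal Python sketch. This is not the scoring formula of the patent, which the abstract does not give. It assumes that each read carries a log-likelihood for every allele it aligns to, that a read not aligning to an allele receives a fixed floor value (`MISS_LOGLIK`), and that the weighting factor (`shared_weight`) simply scales down reads already aligned to the first allele. All names, values, and the toy data are hypothetical.

```python
import math

# Assumed floor log-likelihood for a read that does not align to an allele.
MISS_LOGLIK = math.log(1e-6)


def log_posterior(allele, reads, log_prior, weights):
    """Unnormalized log posterior: log prior plus weighted read log-likelihoods."""
    total = log_prior[allele]
    for read, weight in zip(reads, weights):
        total += weight * read.get(allele, MISS_LOGLIK)
    return total


def call_allele_pair(reads, prior, shared_weight=0.5):
    """Pick a first and a second allele from per-read alignment log-likelihoods.

    reads: list of dicts mapping allele name -> log-likelihood of the read
           given that allele (alleles the read does not align to are absent).
    prior: dict mapping allele name -> prior probability (> 0), for example
           taken from population allele frequencies.
    shared_weight: hypothetical weighting factor applied in the second pass
           to reads that already align to the first allele.
    """
    log_prior = {a: math.log(p) for a, p in prior.items()}
    alleles = sorted(prior)

    # Pass 1: every read counts fully; the best-scoring allele is the first call.
    uniform = [1.0] * len(reads)
    first = max(alleles, key=lambda a: log_posterior(a, reads, log_prior, uniform))

    # Pass 2: down-weight reads already explained by the first allele, then
    # rescore all alleles (the first allele stays a candidate, so a
    # homozygous call is possible).
    weights = [shared_weight if first in read else 1.0 for read in reads]
    second = max(alleles, key=lambda a: log_posterior(a, reads, log_prior, weights))
    return first, second


# Toy data: three reads unique to A, one read shared by A and B, two unique to B.
prior = {"A": 0.5, "B": 0.3, "C": 0.2}
reads = [{"A": -0.1}] * 3 + [{"A": -0.1, "B": -0.2}] + [{"B": -0.1}] * 2
print(call_allele_pair(reads, prior))  # ('A', 'B')
```

Keeping the first allele among the second-pass candidates is a design choice of this sketch: when every read aligns only to one allele, both passes return that allele, which corresponds to a homozygous call.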
