Journal: 计算机科学 (Computer Science)

Integrated Detection Method for Genomic Deletion Variants Based on Feature Mining

     

Abstract

With the application and development of high-throughput (next-generation) sequencing, sequencing-based methods for calling genomic deletions have proliferated. However, any single method remains limited in applicability and falls short in both precision and sensitivity. To address these problems, an integrated approach for calling deletions is proposed that combines feature mining grounded in multiple detection theories with a machine learning algorithm. First, several callers are applied to an individual's sequencing data, and their results are merged into an initial call set of deletions. Then, guided by the different detection strategies, sequence features of each deletion in the initial set are mined and extracted from the sequencing data. Finally, the callers are integrated with a machine learning model trained to distinguish false-positive deletions in the initial set; removing them yields the final call set. Experiments on 1000 Genomes Project data show that, compared with a single caller such as Pindel or SVseq2, the proposed approach achieves higher precision and higher sensitivity. Compared with directly merging the call sets of multiple callers, it significantly improves precision at the cost of a slight loss in sensitivity.
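The first step described above (merging several callers' results into an initial call set) and the precision/sensitivity comparison can be illustrated with a minimal Python sketch. The abstract does not state the merging criterion, the features, or the learning algorithm, so everything here is an assumption made for illustration: the 50% reciprocal-overlap rule, the names `Deletion`, `merge_call_sets`, and `precision_sensitivity`, the caller labels, and the toy coordinates. The set of supporting callers recorded per deletion is only one example of a feature that a downstream classifier might use.

```python
from dataclasses import dataclass, field


@dataclass
class Deletion:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    callers: set = field(default_factory=set)  # which callers reported it


def reciprocal_overlap(a, b):
    """Shared length divided by the longer interval's length.

    A value >= t means the overlap covers at least a fraction t of BOTH calls.
    """
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return shared / max(a.end - a.start, b.end - b.start)


def merge_call_sets(call_sets, min_overlap=0.5):
    """Merge per-caller lists of (chrom, start, end) into one initial call set.

    Assumption: two calls describe the same deletion when their reciprocal
    overlap is at least `min_overlap`; the first call seen keeps its coordinates.
    """
    merged = []
    for caller, calls in call_sets.items():
        for chrom, start, end in calls:
            candidate = Deletion(chrom, start, end, {caller})
            for existing in merged:
                if reciprocal_overlap(existing, candidate) >= min_overlap:
                    existing.callers.add(caller)
                    break
            else:
                merged.append(candidate)
    return merged


def precision_sensitivity(calls, truth, min_overlap=0.5):
    """Precision = matched calls / all calls; sensitivity = found truth / all truth."""
    matched = sum(
        any(reciprocal_overlap(c, t) >= min_overlap for t in truth) for c in calls
    )
    found = sum(
        any(reciprocal_overlap(c, t) >= min_overlap for c in calls) for t in truth
    )
    precision = matched / len(calls) if calls else 0.0
    sensitivity = found / len(truth) if truth else 0.0
    return precision, sensitivity


# Toy data (hypothetical coordinates and caller names).
call_sets = {
    "caller_a": [("chr1", 1000, 2000), ("chr1", 5000, 5600)],
    "caller_b": [("chr1", 1050, 2010), ("chr1", 9000, 9400)],
}
truth = [Deletion("chr1", 1000, 2000), Deletion("chr1", 9000, 9400)]

merged = merge_call_sets(call_sets)
for d in merged:
    print(d.chrom, d.start, d.end, sorted(d.callers))
print(precision_sensitivity(merged, truth))
```

On this toy input the merged set recovers both true deletions but also carries one unsupported call. That is the situation the abstract describes for direct merging: sensitivity rises, and a later filtering step is needed to restore precision.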
