Whole-Genome Data Analysis of Sanjiang Cattle

Abstract

[Objective] To assess the genetic diversity of the Sanjiang cattle population and characterize its genetic variation at the genome level.

[Method] Genomic DNA was extracted from 50 individuals and pooled at equal concentration and equal volume to build a mixed-sample DNA pool. The pooled DNA was randomly sheared with a Covaris S2, 500 bp fragments were recovered by electrophoresis, and a DNA library was constructed and sequenced on the Illumina HiSeq 2000. Short reads were aligned to the bovine reference genome (UMD 3.1) with BWA to detect variants in the Sanjiang cattle genome. The resequencing data were analyzed with SAMtools, Picard-tools, GATK and Reseqtools, and SNPs and indels were annotated against the Ensembl, DAVID and dbSNP databases.

[Result] Whole-genome resequencing produced 77.8 Gb of sequence data, with a sequencing depth of 25.32× and 99.31% coverage of the reference genome. Sequencing yielded 778,403,444 reads and 77,840,344,400 bases, of which 673,670,505 reads (86.55%) and 67,341,451,555 bases (86.51%) mapped to the reference genome (UMD 3.1); 635,242,898 reads (81.61%) and 63,512,636,924 bases (81.59%) mapped as pairs. In total, 20,477,130 SNPs and 1,355,308 small indels were identified, of which 2,147,988 SNPs (2.4%) and 90,180 indels (6.7%) were novel. Of the SNPs, 989,686 (4.83%) were homozygous and 19,487,444 (95.17%) were heterozygous, a homozygous/heterozygous ratio of 1:19.7. There were 14,800,438 transitions and 6,680,058 transversions, giving a transition/transversion (Ts/Tv) ratio of 2.215. Among the SNPs, 727 were splice-site mutations, 117 changed a start codon to a non-start codon, 530 introduced a premature stop codon, and 88 changed a stop codon to a non-stop codon. A total of 57,621 non-synonymous and 83,797 synonymous SNPs were detected, a non-synonymous/synonymous ratio of 0.69. The non-synonymous SNPs were distributed across 9,017 genes, 567 of which match genes previously reported for economically important traits: 471, 77, 21, 10 and 8 genes related to meat quality, disease resistance, milk production, growth and reproduction, respectively, with some genes counted under more than one trait because their functions overlap. Among the indels, 693,180 (51.15%) were deletions and 662,148 (48.85%) were insertions; 161,198 (11.89%) were homozygous and 1,194,110 (88.11%) were heterozygous. Most variants were located in intergenic regions and introns. Genome-wide heterozygosity (H), nucleotide diversity (Pi) and theta W were 7.6×10⁻³, 0.0039 and 0.0040, respectively, indicating relatively rich genetic diversity in Sanjiang cattle. Tajima's D for the population was -0.06832, which may reflect unbalanced selection within the population.

[Conclusion] These results provide genomic data to support further analysis of the genetic mechanisms underlying economically important traits and the conservation of genetic diversity in the Sanjiang cattle breed.
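The percentages and ratios quoted in the abstract can be recomputed from the raw counts it reports. The following is a minimal sketch of that arithmetic, not the authors' pipeline; the variable names, the `pct` helper and the `summary` dictionary are illustrative assumptions, and only the counts themselves come from the abstract.

```python
# Counts copied from the abstract; all names below are illustrative.
total_reads, mapped_reads, paired_reads = 778_403_444, 673_670_505, 635_242_898
total_bases, mapped_bases, paired_bases = 77_840_344_400, 67_341_451_555, 63_512_636_924
total_snps, hom_snps, het_snps = 20_477_130, 989_686, 19_487_444
transitions, transversions = 14_800_438, 6_680_058
nonsyn_snps, syn_snps = 57_621, 83_797
total_indels, hom_indels, het_indels = 1_355_308, 161_198, 1_194_110


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


summary = {
    # Mapping statistics relative to all sequenced reads/bases
    "mapped_reads_pct": pct(mapped_reads, total_reads),
    "mapped_bases_pct": pct(mapped_bases, total_bases),
    "paired_reads_pct": pct(paired_reads, total_reads),
    "paired_bases_pct": pct(paired_bases, total_bases),
    # Zygosity of SNPs and indels relative to their totals
    "hom_snps_pct": pct(hom_snps, total_snps),
    "het_snps_pct": pct(het_snps, total_snps),
    "hom_indels_pct": pct(hom_indels, total_indels),
    "het_indels_pct": pct(het_indels, total_indels),
    # Plain ratios
    "het_per_hom_snp": round(het_snps / hom_snps, 1),
    "ts_tv": round(transitions / transversions, 3),
    "nonsyn_per_syn": round(nonsyn_snps / syn_snps, 2),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

With these inputs the mapping rates, zygosity percentages and the non-synonymous/synonymous ratio agree with the abstract to the reported precision. The transition/transversion ratio rounds to 2.216, where the abstract gives 2.215, which is consistent with truncation rather than rounding.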

Bibliographic details

  • Source
    《中国农业科学》 (Scientia Agricultura Sinica) | 2017, Issue 1 | pp. 183-194 | 12 pages
  • Author affiliations

    Key Laboratory of Animal Genetics and Breeding of the State Ethnic Affairs Commission and Ministry of Education, Southwest Minzu University, Chengdu 610041;

    Institute of Qinghai-Tibetan Plateau, Southwest Minzu University, Chengdu 610041;

    Aba Prefecture Institute of Animal Husbandry Science, Wenchuan, Sichuan 623000;

    Aba Prefecture Animal Husbandry Station, Wenchuan, Sichuan 623000;

    Wenchuan County Animal Husbandry Station, Wenchuan, Sichuan 623000;

  • Original format: PDF
  • Language of text: Chinese
  • Keywords

    Sanjiang cattle; genome; second-generation sequencing technology; SNP; indel;
