Online Journal of Veterinary Research (OJVR)

Clustering dairy cattle genes by Kullback-Leibler divergence

Abstract

Bio-computational grouping of genes facilitates genetic analysis, sequencing and structure-based analyses. DNA sequences of 30 genes involved in milk protein production were extracted ad hoc from the NCBI genome database and stored in FASTA format. A C algorithm computing base-2 Shannon entropy of the gene DNA sequences was used to cluster genes governing milk production in dairy cows by Kullback-Leibler (KL) divergence. KL was based on nucleotide similarity (KLA), nucleotide difference (KLB) and different orders of relative entropy (KLH). The AdaBoost algorithm was used to interpret the clustering results. Examples of results: STX3 (n_nucleotide = 79347) and CD14 (n_nucleotide = 1417) were the longest and shortest genes, respectively. A total of 258 exons were identified, wherein exon 1 of HSPA1A (n_nucleotide = 2101) and HSPA5 (n_nucleotide = 20) were the longest and shortest, respectively. The LCP1 and ABCG2 genes had the highest number of exons (n_exon = 16), and HSPA1A and YWHAG had the fewest (n_exon = 1) in this set of genes. The findings suggest that exons with maximum entropy values are likely to be suitable for genotype analysis using molecular markers, and that both coding and non-coding sequences could show either low or high complexity. KL divergence can be used alongside other methods to cluster large sets of dairy cattle genes into biologically relevant groups.
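To illustrate the two quantities named in the abstract, the following is a minimal Python sketch of base-2 Shannon entropy of a DNA sequence and of KL divergence between the nucleotide distributions of two sequences. It is not the authors' C implementation and does not reproduce the KLA, KLB or KLH variants, which the abstract does not define. The function names, the pseudocount smoothing and the toy sequences are assumptions made for illustration only.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def shannon_entropy(seq):
    """Base-2 Shannon entropy (bits per nucleotide) of the A/C/G/T frequencies."""
    counts = Counter(base for base in seq.upper() if base in ALPHABET)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def nucleotide_distribution(seq, pseudocount=1.0):
    """Smoothed A/C/G/T frequencies.

    The pseudocount is an assumption of this sketch: it keeps every
    probability positive so the KL divergence below is always finite.
    """
    counts = Counter(base for base in seq.upper() if base in ALPHABET)
    total = sum(counts[b] for b in ALPHABET) + pseudocount * len(ALPHABET)
    return {b: (counts[b] + pseudocount) / total for b in ALPHABET}


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits between two nucleotide distributions."""
    return sum(p[b] * math.log2(p[b] / q[b]) for b in ALPHABET if p[b] > 0)


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    gene_a = "ATGCGCGCTAGCGCGGCC"
    gene_b = "ATATTTAAATATTAATGA"
    p = nucleotide_distribution(gene_a)
    q = nucleotide_distribution(gene_b)
    print("entropy A:", shannon_entropy(gene_a))
    print("entropy B:", shannon_entropy(gene_b))
    print("D(A || B):", kl_divergence(p, q))
    print("D(B || A):", kl_divergence(q, p))
```

KL divergence is asymmetric, so D(A || B) and D(B || A) generally differ. A clustering procedure would typically need a rule for turning the pairwise values into a distance, for example by symmetrizing them; the abstract does not state which rule the authors used.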
