Statistical features were extracted from the complete genomes of four potato virus Y (PVY) isolates from tobacco, obtained from different sources, and used for cluster analysis. For each complete genome, every base was combined with the two bases that follow it to form a three-base group (triplet), and these overlapping triplets were arranged into a new sequence S. The probability of occurrence of each of the 64 possible triplets in S was then calculated, giving a 64-dimensional vector L. Comparing the L vectors of the genomes identified four triplets (CAA, GAT, GTA, GAC) whose probabilities differed markedly between genomes. The occurrence probabilities of these four triplets are closely associated with genetic variation in the PVY genome. According to their genetic variation, the four complete PVY genomes fall into two major groups.
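
The construction of the vector L can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `triplet_frequencies` and `most_variable_triplets`, the alphabetical ordering of the 64 triplets, the skipping of windows containing ambiguous bases, and the use of the max-minus-min spread to rank triplets are all assumptions made for illustration. The abstract does not state how "marked" differences were measured.

```python
from itertools import product

BASES = "ACGT"
# All 64 triplets in alphabetical order (AAA, AAC, ..., TTT); the ordering is an assumption.
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]


def triplet_frequencies(genome: str) -> list[float]:
    """Return the 64-dimensional vector L of overlapping-triplet probabilities.

    Each base is grouped with the two bases that follow it (sliding window,
    step 1). Windows containing anything other than A/C/G/T are skipped,
    which is an assumption not stated in the abstract.
    """
    seq = genome.upper().replace("U", "T")  # accept RNA-style input as well
    counts = dict.fromkeys(TRIPLETS, 0)
    total = 0
    for i in range(len(seq) - 2):
        window = seq[i:i + 3]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]


def most_variable_triplets(vectors: list[list[float]], k: int = 4) -> list[str]:
    """Rank triplets by the spread (max - min) of their probability across genomes.

    The spread criterion is a stand-in chosen for this sketch only.
    """
    spreads = [max(col) - min(col) for col in zip(*vectors)]
    order = sorted(range(len(TRIPLETS)), key=lambda j: -spreads[j])
    return [TRIPLETS[j] for j in order[:k]]
```

Given the four genome sequences as strings, one would compute one L vector per genome with `triplet_frequencies` and pass the list of vectors to `most_variable_triplets` to see which triplets differ most. Grouping the genomes into two classes would then require a clustering step on these vectors, whose details the abstract does not give.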