Markovian domain fingerprinting in statistical segmentation of protein sequences

Abstract

An apparatus for automatically segmenting non-aligned data sequences that comprise structural domains, in order to identify the structural domains and construct models of them. The apparatus comprises a soft clustering unit, a refinement unit, and an annealing unit. The soft clustering unit iteratively partitions the data sequences and trains variable memory Markov sources, built using a prediction suffix tree data structure, on the data until convergence is reached. The soft clustering unit also eliminates sources that show a weak relationship with the data. The refinement unit is connected to the soft clustering unit; following convergence, it splits and perturbs the sources so that the iterative partitioning is repeated at the soft clustering unit, thereby refining the model. The annealing unit increases the resolution with which the relationships between data and sources are shown, thereby governing how less competitive sources are rejected. The apparatus outputs the surviving variable memory Markov sources as models for subsequent identification of the structural domains.
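The interplay between soft assignment, annealing, and source elimination described above can be illustrated with a minimal Python sketch. This is not the patented implementation: training of the prediction suffix trees is omitted entirely, the per-segment log-likelihoods are hypothetical numbers supplied by hand, and the names `soft_assign`, `anneal`, `beta`, and `min_prior` are assumptions introduced only for illustration. The sketch assumes the "resolution" is an inverse-temperature parameter that sharpens the assignment of segments to sources as it grows.

```python
import math

def soft_assign(log_liks, priors, beta):
    """Soft membership of one segment in each source.

    Weights are proportional to prior * exp(beta * log_likelihood);
    a larger beta (assumed resolution parameter) sharpens the assignment.
    """
    scores = [math.log(p) + beta * ll for p, ll in zip(priors, log_liks)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def anneal(log_lik_rows, betas, min_prior=0.05):
    """Raise beta step by step and drop sources whose share becomes small.

    log_lik_rows[i][k] is a hypothetical, fixed log-likelihood of segment i
    under source k; a real system would retrain the sources at every step.
    """
    n_sources = len(log_lik_rows[0])
    alive = list(range(n_sources))
    priors = [1.0 / n_sources] * n_sources
    for beta in betas:
        rows = [soft_assign([row[k] for k in alive], priors, beta)
                for row in log_lik_rows]
        # New prior of a source = its average soft membership over segments.
        priors = [sum(r[j] for r in rows) / len(rows)
                  for j in range(len(alive))]
        keep = [j for j, p in enumerate(priors) if p >= min_prior]
        if not keep:  # always retain at least the strongest source
            keep = [max(range(len(priors)), key=priors.__getitem__)]
        alive = [alive[j] for j in keep]
        total = sum(priors[j] for j in keep)
        priors = [priors[j] / total for j in keep]
    return alive, priors

# Hypothetical data: four segments, three sources; source 2 is never the best fit.
rows = [
    [-10.0, -20.0, -12.0],
    [-11.0, -19.0, -13.0],
    [-20.0, -10.0, -13.0],
    [-21.0, -11.0, -14.0],
]
alive, priors = anneal(rows, betas=[0.1, 0.5, 1.0, 2.0])
print(alive, priors)
```

At low `beta` all three sources share the segments almost evenly, so none is rejected. As `beta` rises, the third source, which is a reasonable but never the best fit, loses its share and falls below `min_prior`, which mirrors how the annealing schedule governs the rejection of less competitive sources.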
