
The tetratricopeptide repeats (TPR)-like superfamily of proteins in Leishmania spp., as revealed by multi-relational data mining


Abstract

Protein sequence analysis tasks are multi-relational problems and are therefore suited to multi-relational data mining (MRDM). Proteins containing tetratricopeptide (TPR), pentatricopeptide (PPR) and half-a-TPR (HAT) repeats comprise the TPR-like superfamily. To this superfamily we applied MRDM methods (relational association rule discovery and probabilistic relational models) together with hidden Markov models (HMMs) and the Viterbi algorithm (VA), using genome databases of the pathogenic protozoan Leishmania. This integrated MRDM/HMM/VA approach seeks to capture as much model information as possible in the pattern-matching heuristic without resorting to the more standard motif discovery methods (Pfam, SMART, SUPERFAMILY). Like the more recently reported tool TPRpred, it has the advantage of incorporating optimized profiles, score offsets and a distribution to compute probability, in order to take into account the tendency of repeats to occur in tandem and to be widely distributed along the sequences. Here we compare these currently available resources with our approach (MRDM/HMM/VA) and highlight that the latter performs best in TPR-like superfamily assignment. It might also be applied to other sequence analysis problems, where it could contribute to tight-fit motif discovery and to a higher probability that a given target sequence is, indeed, a protein containing the target motif.
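
For readers unfamiliar with the Viterbi algorithm named in the abstract, the following is a minimal, generic sketch of Viterbi decoding for a discrete HMM. It is not the authors' MRDM/HMM/VA pipeline: the two states ("R" for repeat, "B" for background), the two-letter residue alphabet ("h" for hydrophobic, "p" for polar) and every probability are hypothetical values chosen only for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_path) for a discrete HMM."""

    # Work in log space to avoid numerical underflow on long sequences.
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # scores[t][s]: best log-probability of any state path ending in s at time t.
    scores = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        scores.append({})
        back.append({})
        for s in states:
            prev, best = max(
                ((p, scores[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            scores[t][s] = best + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: scores[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return scores[-1][last], path


# Hypothetical toy model (all numbers are assumptions, not taken from the paper).
STATES = ("B", "R")
START = {"B": 0.8, "R": 0.2}
TRANS = {"B": {"B": 0.9, "R": 0.1}, "R": {"B": 0.1, "R": 0.9}}
EMIT = {"B": {"h": 0.3, "p": 0.7}, "R": {"h": 0.8, "p": 0.2}}

if __name__ == "__main__":
    log_prob, path = viterbi("pphhhhhhpp", STATES, START, TRANS, EMIT)
    print("".join(path), log_prob)
```

In a real repeat-detection setting the states and emission probabilities would come from a trained profile HMM rather than from hand-set values like these.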

Bibliographic record

  • Source
    Pattern Recognition Letters | 2010, Issue 14 | pp. 2178-2189 | 12 pages
  • Author affiliations

    Nucleo Tarcisio Pimenta de Pesquisa Genomica e Bioinformatica, NUGEN, Faculdade de Veterinaria, Universidade Estadual do Ceara - UECE, Av. Paranjana, 1700, Campus do Itaperi, Fortaleza, CE 60740-000, Brazil


  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    multi-relational data mining; hidden Markov models; Viterbi algorithm; tetratricopeptide repeat motif; Leishmania proteins
