Journal: Bioinformatics

Cascade detection for the extraction of localized sequence features; specificity results for HIV-1 protease and structure-function results for the Schellman loop


Abstract

Motivation: The extraction of the set of features most relevant to function from classified biological sequence sets is still a challenging problem. A central issue is the determination of expected counts for higher order features so that artifact features may be screened.

Results: Cascade detection (CD), a new algorithm for the extraction of localized features from sequence sets, is introduced. CD is a natural extension of the proportional modeling techniques used in contingency table analysis into the domain of feature detection. The algorithm is successfully tested on synthetic data and then applied to feature detection problems from two different domains to demonstrate its broad utility. An analysis of HIV-1 protease specificity reveals patterns of strong first-order features that group hydrophobic residues by side chain geometry and exhibit substantial symmetry about the cleavage site. Higher order results suggest that favorable cooperativity is weak by comparison and broadly distributed, but indicate possible synergies between negative charge and hydrophobicity in the substrate. Structure-function results for the Schellman loop, a helix-capping motif in proteins, contain strong first-order features and also show statistically significant cooperativities that provide new insights into the design of the motif. These include a new 'hydrophobic staple' and multiple amphipathic and electrostatic pair features. CD should prove useful not only for sequence analysis, but also for the detection of multifactor synergies in cross-classified data from clinical studies or other sources.
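The expected-count idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the cascade detection algorithm itself, whose details the abstract does not give; it only applies the standard independence model from two-way contingency tables, where the expected count of a residue pair is the product of its two first-order counts divided by the number of sequences. The function name `pair_feature_scores` and the toy sequences are hypothetical, chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_feature_scores(seqs):
    """Compare observed pair counts with independence-model expected counts.

    Assumption: all sequences are aligned and have the same length.
    Returns {(i, a, j, b): (observed, expected, observed / expected)}.
    """
    n = len(seqs)
    length = len(seqs[0])

    # First-order counts: how often residue a occurs at position i.
    first = Counter((i, s[i]) for s in seqs for i in range(length))

    # Second-order counts: residue a at position i together with b at position j.
    second = Counter(
        (i, s[i], j, s[j])
        for s in seqs
        for i, j in combinations(range(length), 2)
    )

    scores = {}
    for (i, a, j, b), observed in second.items():
        # Expected count if positions i and j were statistically independent.
        expected = first[(i, a)] * first[(j, b)] / n
        scores[(i, a, j, b)] = (observed, expected, observed / expected)
    return scores


# Toy data (invented): A at position 0 always co-occurs with D at position 1.
toy = ["AD", "AD", "GK", "GK"]
print(pair_feature_scores(toy)[(0, "A", 1, "D")])  # → (2, 1.0, 2.0)
```

A ratio well above 1 flags a pair that occurs more often than its first-order frequencies would predict. A ratio near 1 suggests the pair is an artifact of strong first-order features rather than a genuine cooperative feature. A real analysis would also need a significance test and a treatment of higher-order terms, which this sketch omits.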
