Conference: International Symposium on Foundations of Intelligent Systems

A Machine Text-Inspired Machine Learning Approach for Identification of Transmembrane Helix Boundaries



Abstract

In this paper, we adapt a statistical learning approach, inspired by automated topic segmentation techniques for speech-recognized documents, to the challenging problem of protein segmentation in the context of G-protein coupled receptors (GPCRs). Each GPCR consists of 7 transmembrane helices separated by alternating extracellular and intracellular loops. If the helices, extracellular loops, and intracellular loops are viewed as 3 different topics, the problem of segmenting a protein's amino acid sequence according to its secondary structure is analogous to the problem of topic segmentation. The method builds an n-gram language model for each 'topic' and compares how well the models predict the current amino acid in order to determine whether a boundary occurs at the current position. This approach to protein segmentation differs distinctly from the Markov models used previously, and its commendable results are evidence of the benefit of applying machine learning and language technologies to bioinformatics.
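The idea of one language model per 'topic' can be illustrated with a minimal sketch. The abstract does not give the n-gram order, the smoothing scheme, or the exact decision rule, so everything below is an assumption made for illustration: bigrams (n = 2), add-one smoothing, a short trailing window of residues, and a boundary wherever the best-scoring model changes. The names `BigramModel`, `label_positions`, and `find_boundaries` are hypothetical, and the training strings are toy data, not real GPCR segments.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


class BigramModel:
    """Add-one smoothed bigram model over amino acid letters (assumed setup)."""

    def __init__(self, sequences):
        self.pair_counts = Counter()
        self.context_counts = Counter()
        for seq in sequences:
            for prev, cur in zip(seq, seq[1:]):
                self.pair_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1

    def log_prob(self, prev, cur):
        # log P(cur | prev) with add-one smoothing over the 20-letter alphabet
        num = self.pair_counts[(prev, cur)] + 1
        den = self.context_counts[prev] + len(AMINO_ACIDS)
        return math.log(num / den)


def label_positions(sequence, models, window=2):
    """Label each position i >= 1 with the model that best predicts the
    residues in a trailing window ending at i."""
    labels = []
    for i in range(1, len(sequence)):
        start = max(1, i - window + 1)
        scores = {
            name: sum(model.log_prob(sequence[j - 1], sequence[j])
                      for j in range(start, i + 1))
            for name, model in models.items()
        }
        labels.append(max(scores, key=scores.get))
    return labels


def find_boundaries(labels):
    """Return sequence indices where the winning model changes.
    labels[k] describes sequence index k + 1."""
    return [k + 1 for k in range(1, len(labels)) if labels[k] != labels[k - 1]]


# Toy training data (illustrative only, not real GPCR segments).
models = {
    "helix": BigramModel(["LIVLIVLIVLIV"]),
    "intracellular": BigramModel(["KRKRKRKRKR"]),
    "extracellular": BigramModel(["DSDSDSDSDS"]),
}
labels = label_positions("KRKRLIVLIVDSDS", models, window=2)
print(find_boundaries(labels))
```

Because the window trails the current position, a detected boundary in this sketch can lag the true segment change by a residue or so; a larger window trades boundary sharpness for more stable labels.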
