Home > NIH Literature > PLoS Clinical Trials > Incorporating Evolutionary Information and Functional Domains for Identifying RNA Splicing Factors in Humans

Incorporating Evolutionary Information and Functional Domains for Identifying RNA Splicing Factors in Humans


Abstract

Regulation of pre-mRNA splicing is achieved through the interaction of RNA sequence elements with a variety of RNA-splicing related proteins (splicing factors). The splicing machinery in humans has not yet been fully elucidated, partly because human splicing factors have not been exhaustively identified. Furthermore, experimental methods for identifying splicing factors are time-consuming and labor-intensive. Although many computational methods have been proposed for identifying RNA-binding proteins, none so far focuses specifically on RNA-splicing related proteins. We were therefore motivated to design a method for identifying human splicing factors that is trained on experimentally verified splicing factors. An investigation of amino acid composition reveals remarkable differences between splicing factors and non-splicing proteins. A support vector machine (SVM) is used to construct a predictive model, and five-fold cross-validation indicates that the SVM model trained with amino acid composition achieves a promising accuracy (80.22%). Another basic feature, amino acid dipeptide composition, is also examined and yields predictive performance similar to that of amino acid composition. In addition, this work shows that incorporating evolutionary information and domain information could improve predictive performance. The constructed models effectively classify an independent data set of human splicing factors (73.65% accuracy). The independent test results indicate that in silico identification could be a feasible means of conducting preliminary analyses of splicing factors and of significantly reducing the number of potential targets that require further in vivo or in vitro confirmation.
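The two basic features named in the abstract, amino acid composition and dipeptide composition, can be illustrated with a minimal sketch. The abstract does not specify the exact encoding, so this assumes the common convention of expressing each feature as a fraction of the counted residues (20 values) or of the counted adjacent residue pairs (400 values). The function names, the alphabet ordering, and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes); this ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of standard residues, in the same ordering.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard residue, in AMINO_ACIDS order.

    Non-standard characters (e.g. 'X') are ignored when counting.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered residue pair, in DIPEPTIDES order.

    Adjacent pairs containing a non-standard character are skipped.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not pairs:
        return [0.0] * len(DIPEPTIDES)
    counts = Counter(pairs)
    return [counts[d] / len(pairs) for d in DIPEPTIDES]


if __name__ == "__main__":
    toy_sequence = "MSRSRSPPRRSRS"  # hypothetical fragment, for illustration only
    aac = amino_acid_composition(toy_sequence)
    dpc = dipeptide_composition(toy_sequence)
    print(len(aac), len(dpc))
```

Vectors of this kind could then be passed to any SVM implementation and scored with five-fold cross-validation. The kernel, the hyperparameters, and the encoding of evolutionary and domain information are not given in the abstract and are left out here.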
