
A machine learning approach to prediction of RNA editing events.



Abstract

Adenosine-to-inosine (A-to-I) RNA editing is a post-transcriptional process that alters the RNA molecule. It is important to study this process because deficiency or misregulation of A-to-I RNA editing may be a cause of neurological diseases. However, the RNA editing machinery is still poorly understood, and the number of known recoding editing substrates remains limited. The goal of this thesis is to develop a machine learning approach to predicting novel editing sites based on a variety of features. The thesis details and implements the Support Vector Machine (SVM) classification algorithm with support for graph and string kernels. The graph kernels enable machine learning from RNA foldback structures, the secondary structures computed by the RNA Editing Dataflow System (REDS). String kernels allow for learning based solely on nucleotide sequence features. Multiple classifiers are designed and evaluated with training data from experimental lab work done at Lehigh University. In addition, because it is difficult to determine a truly negative class (sites that never undergo editing), we also ran experiments with a single-class SVM on some of the classifiers. Our results indicate that the mismatch kernel [Leslie et al., 2004] classifier generalizes best of all the classifiers we tested. The mismatch kernel classifier achieved a precision of 0.88 and a sensitivity of 0.82 in leave-one-out cross-validation tests. Using this classifier, we suggest new high-confidence RNA editing candidate sites that could later be verified experimentally in the lab.
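To make the string-kernel idea concrete, the following is a minimal sketch of a (k, m)-mismatch kernel in the sense of Leslie et al., in which each length-k window of a sequence contributes to every k-mer within m mismatches of it. It is not the thesis implementation: the RNA alphabet, the defaults k=3 and m=1, the cosine normalization, and all function names are illustrative assumptions, and the brute-force feature map is only practical for small k.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGU"  # assumed RNA alphabet; the thesis may encode sequences differently


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_features(seq, k=3, m=1):
    """Brute-force (k, m)-mismatch feature vector.

    Entry j counts the length-k windows of `seq` that are within
    m mismatches of the j-th k-mer over ALPHABET.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return [sum(1 for w in windows if hamming(w, kmer) <= m) for kmer in kmers]


def mismatch_kernel(s, t, k=3, m=1):
    """Normalized inner product of the two mismatch feature vectors."""
    fs = mismatch_features(s, k, m)
    ft = mismatch_features(t, k, m)
    dot = sum(a * b for a, b in zip(fs, ft))
    norm = sqrt(sum(a * a for a in fs)) * sqrt(sum(b * b for b in ft))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical sequences, used only to exercise the function.
    print(mismatch_kernel("ACGUACGGA", "ACGAACGGU"))
```

A kernel matrix built from such pairwise values could then be passed to any SVM implementation that accepts precomputed kernels; that step is omitted here because the abstract does not describe the thesis's own SVM interface.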

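Bibliographic record

  • Author

    Stoev, Ivan.;

  • Author affiliation

    Lehigh University.;

  • Degree-granting institution Lehigh University.;
  • Discipline Computer Science.
  • Degree M.S.
  • Year 2010
  • Pages 81 p.
  • Total pages 81
  • Original format PDF
  • Language of text eng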
