Semi-supervised Learning of Alternatively Spliced Exons Using Co-training

Abstract

Alternative splicing is a phenomenon that gives rise to multiple mRNA transcripts from a single gene. It is believed that a large number of genes undergo alternative splicing. Predicting alternative splicing events is a problem of great interest, as it can help in understanding transcript diversity. Supervised machine learning approaches can be used to predict alternative splicing events at the genome level. However, supervised approaches require large amounts of labeled data to learn accurate classifiers. While new sequencing technologies produce large amounts of genomic data, labeling these data can be costly and time-consuming. Therefore, semi-supervised learning approaches that can make use of large amounts of unlabeled data, in addition to small amounts of labeled data, are highly desirable. In this work, we study the usefulness of a semi-supervised learning approach, co-training, for classifying exons as alternatively spliced or constitutive. The co-training algorithm makes use of two views of the data to iteratively learn two classifiers that can inform each other, at each step, with their best predictions on the unlabeled data. We consider two sets of features for constructing views for the problem of predicting alternatively spliced exons: exonic splicing enhancers and intronic regulatory sequences. We use the Naive Bayes Multinomial algorithm as a base classifier in our study. Experimental results show that using the unlabeled data can yield better classifiers than those obtained from the small amount of labeled data alone.
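The co-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the small `MultinomialNB` class, the `co_train` and `predict` helpers, the motif tokens (`eseA`, `intX`, ...), the `AS`/`CONST` labels, and the toy data are all hypothetical. It also assumes that each classifier promotes its single most confident prediction per round, whereas the paper may select examples differently (for instance, a fixed number per class).

```python
import math
from collections import Counter


class MultinomialNB:
    """Tiny multinomial Naive Bayes over token lists, with Laplace smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict_proba(self, doc):
        size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.prior[c]
            for tok in doc:
                if tok in self.vocab:  # tokens unseen in training are ignored
                    score += math.log((self.counts[c][tok] + 1) / (self.totals[c] + size))
            scores[c] = score
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        norm = sum(exp.values())
        return {c: e / norm for c, e in exp.items()}


def fit_views(labeled):
    """Train one classifier per view (index 0 and index 1) on the labeled set."""
    y = [label for _, label in labeled]
    return [MultinomialNB().fit([x[v] for x, _ in labeled], y) for v in (0, 1)]


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: [((view0, view1), label)]; unlabeled: [(view0, view1)]."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        for v, clf in enumerate(fit_views(labeled)):
            scored = []
            for x in pool:
                proba = clf.predict_proba(x[v])
                best = max(proba, key=proba.get)
                scored.append((proba[best], best, x))
            scored.sort(key=lambda item: item[0], reverse=True)
            # Each view hands its most confident predictions to the shared labeled set.
            for _, label, x in scored[:per_round]:
                labeled.append((x, label))
                pool.remove(x)
    return fit_views(labeled)


def predict(clfs, x):
    """Combine the two views by multiplying their class probabilities."""
    p0, p1 = (clf.predict_proba(x[v]) for v, clf in enumerate(clfs))
    combined = {c: p0[c] * p1[c] for c in p0}
    return max(combined, key=combined.get)


# Hypothetical toy data: view 0 = exonic enhancer motifs, view 1 = intronic motifs.
labeled = [((["eseA", "eseA"], ["intX"]), "AS"),
           ((["eseB"], ["intY", "intY"]), "CONST")]
unlabeled = [(["eseA"], ["intX", "intX"]), (["eseB", "eseB"], ["intY"]),
             (["eseA", "eseA"], ["intX"]), (["eseB"], ["intY"])]
classifiers = co_train(labeled, unlabeled)
```

Calling `predict(classifiers, (["eseA"], ["intX"]))` classifies a new exon from both views. Multiplying the two per-view probabilities is one common way to combine the classifiers; it is a design choice of this sketch rather than something stated in the abstract.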

Bibliographic record

Source: 2011 IEEE International Conference on Bioinformatics and Biomedicine, 2011, pp. 243-246 (4 pages)
Authors: Tangirala Karthik; Caragea Doina
Original format: PDF
Chinese Library Classification: Information processing technology
Keywords: alternative splicing; alternatively spliced and constitutive exons; co-training; semi-supervised learning

Similar literature

1. Predicting alternatively spliced exons using semi-supervised learning [J]. Stanescu Ana, Tangirala Karthik, Caragea Doina. International Journal of Data Mining and Bioinformatics. 2016, No. 1.
2. Semi-supervised learning combining co-training with active learning [J]. Yihao Zhang, Junhao Wen, Xibin Wang, et al. Expert Systems with Applications. 2014, No. 5.
3. Semi-supervised Learning Predicts Approximately One Third of the Alternative Splicing Isoforms as Functional Proteins [J]. Yanqi Hao, Recep Colak, Joan Teyra, et al. Cell Reports. 2015, No. 2.
4. Semi-supervised Learning of Alternatively Spliced Exons Using Co-training [C]. Tangirala Karthik, Caragea Doina. IEEE International Conference on Bioinformatics and Biomedicine. 2011.
5. Exosomal proteome changes induced by human tau expression are modulated by the presence of the microtubule-binding domain and alternatively spliced exons 2-3 in SH-SY5Y neuroblastoma cells [D]. Aladwan, Mohammad Mahmoud. 2017.
6. Alternative splicing in the human cytochrome P450IIB6 gene: use of a cryptic exon within intron 3 and splice acceptor site within exon 4 [O]. J S Miles, A W McLaren, F J Gonzalez, et al. 1990.
7. An exonic splicing enhancer in human IGF-I pre-mRNA mediates recognition of alternative exon 5 by the serine-arginine protein splicing factor-2/alternative splicing factor [O]. Smith P. J., Spurrell E. L., Coakley J., et al. 2002.
