Home > Foreign-language journals > Nature reviews neuroscience > Pattern recognition analysis on long noncoding RNAs: a tool for prediction in plants

Pattern recognition analysis on long noncoding RNAs: a tool for prediction in plants

Abstract

Motivation: Long noncoding RNAs (lncRNAs) are a class of eukaryotic noncoding RNAs that has gained great attention in recent years as a higher layer of regulation of gene expression in cells. There is, however, a lack of specific computational approaches to reliably predict lncRNAs in plants, which contrasts with the variety of prediction tools available for mammalian lncRNAs. This distinction is not that obvious, given that the biological features and mechanisms generating lncRNAs in the cell are likely different between animals and plants. Considering this, we present a machine learning analysis and a classifier approach called RNAplonc (https://github.com/TatianneNegri/RNAplonc/) to identify lncRNAs in plants.

Results: Our feature selection analysis considered 5468 features, of which only 16 were needed to robustly identify lncRNAs with the REPTree algorithm. These features were the basis for building the model, which was trained on lncRNA and mRNA data from five plant species (thale cress, cucumber, soybean, poplar and Asian rice). After an extensive comparison with other tools widely used in plants (CPC, CPC2, CPAT and PLncPRO), we found that RNAplonc produced more reliable lncRNA predictions from plant transcripts, achieving the best result in 87.5% of eight tests on eight species from the GreeNC database and in four independent studies on monocotyledonous (Brachypodium) and eudicotyledonous (Populus and Gossypium) species.
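To make the idea of sequence-derived features concrete, here is a minimal Python sketch of how a transcript might be turned into a small numeric feature vector before being passed to a tree classifier such as REPTree. The abstract does not list the 16 selected features, so the four features below (length, GC content, longest forward-strand ORF and ORF coverage) and all function names are illustrative assumptions, not the RNAplonc feature set or implementation.

```python
# Illustrative sketch only: these features are assumptions chosen for
# demonstration, NOT the 16 features selected by the RNAplonc authors.

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf_length(seq: str) -> int:
    """Length in nucleotides of the longest ATG...stop ORF on the forward
    strand, stop codon included. RNA input (U) is converted to DNA (T)."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                best = max(best, i + 3 - start)
                start = None
    return best


def transcript_features(seq: str) -> dict:
    """Hypothetical feature vector for one transcript."""
    n = len(seq)
    orf = longest_orf_length(seq)
    return {
        "length": n,
        "gc_content": gc_content(seq),
        "longest_orf": orf,
        "orf_coverage": orf / n if n else 0.0,
    }


if __name__ == "__main__":
    # Toy sequence, not real data.
    print(transcript_features("CCATGAAATAGGG"))
```

In a pipeline of the kind the abstract describes, vectors like these would be computed for labelled lncRNA and mRNA transcripts and used to train the classifier; the feature selection step would then decide which columns to keep.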