
rSeqTU—A Machine-Learning Based R Package for Prediction of Bacterial Transcription Units


Abstract

A transcription unit (TU) consists of one or more adjacent genes on the same strand that are co-transcribed, mostly in prokaryotes. Accurate identification of TUs is a crucial first step toward delineating transcriptional regulatory networks and elucidating the dynamic regulatory mechanisms encoded in prokaryotic genomes. Many genomic features, such as intergenic distance, and transcriptomic features, such as continuous and stable RNA-seq read-count signals, have been collected from large amounts of experimental data and integrated into classification techniques to computationally predict TUs genome-wide. Although some tools and web servers can predict TUs from bacterial RNA-seq data and genome sequences, there is still a need for an improved machine-learning prediction approach and a more comprehensive pipeline that handles quality control (QC), TU prediction, and TU visualization. To let users perform TU identification efficiently on their local computers or high-performance clusters, and to provide more accurate predictions, we developed an R package named rSeqTU. rSeqTU uses a random forest algorithm to select essential features describing TUs and then uses a support vector machine (SVM) to build TU prediction models. rSeqTU (available at ) has six computational functionalities: read quality control, read mapping, training set generation, random forest-based feature selection, TU prediction, and TU visualization.
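To make the TU definition concrete, the following Python sketch chains adjacent same-strand genes into candidate TUs. It is only an illustration under stated assumptions, not rSeqTU's implementation: the `Gene` record, the `group_into_tus` helper, and the 100 bp `max_gap` cutoff are all hypothetical, and the simple distance rule stands in for the trained SVM classifier that the abstract describes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def intergenic_distance(a: Gene, b: Gene) -> int:
    """Gap in bp between gene a and the downstream gene b (negative if they overlap)."""
    return b.start - a.end - 1


def distance_rule(a: Gene, b: Gene, max_gap: int = 100) -> bool:
    """Stand-in pair classifier (an assumption): treat a small gap as co-transcription.

    In the pipeline described by the abstract, this decision would come from
    an SVM trained on selected genomic and transcriptomic features.
    """
    return intergenic_distance(a, b) <= max_gap


def group_into_tus(
    genes: List[Gene],
    same_tu: Callable[[Gene, Gene], bool] = distance_rule,
) -> List[List[Gene]]:
    """Chain adjacent, same-strand genes into TUs using a pairwise decision."""
    ordered = sorted(genes, key=lambda g: g.start)
    tus: List[List[Gene]] = []
    for gene in ordered:
        previous = tus[-1][-1] if tus else None
        if previous and previous.strand == gene.strand and same_tu(previous, gene):
            tus[-1].append(gene)  # extend the current TU
        else:
            tus.append([gene])    # start a new TU
    return tus
```

Any other pairwise predicate, for example one backed by a trained classifier, can be passed as `same_tu` without changing the chaining logic.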
