Journal: BMC Genomics

PhenoLink - a web-tool for linking phenotype to ~omics data for bacteria: application to gene-trait matching for Lactobacillus plantarum strains


Abstract

Background: Linking phenotypes to the high-throughput molecular biology data generated by ~omics technologies can reveal the cellular mechanisms underlying an organism's phenotype. ~Omics datasets are often very large and noisy and contain many features (e.g., genes, metabolite abundances). Associating phenotypes with ~omics data therefore requires an approach that is robust to noise and can handle large and diverse datasets.

Results: We developed a web-tool, PhenoLink (http://bamics2.cmbi.ru.nl/websoftware/phenolink/), that links phenotype to ~omics datasets using well-established as well as new techniques. PhenoLink imputes missing values and preprocesses the input data (i) to decrease the inherent noise in the data and (ii) to counterbalance pitfalls of the Random Forest algorithm, on which feature (e.g., gene) selection is based. The preprocessed data are then used in feature selection to identify relations to phenotypes. We applied PhenoLink to identify gene-phenotype relations based on the presence/absence of 2847 genes in 42 Lactobacillus plantarum strains and on phenotypic measurements of these strains under several experimental conditions, including growth on sugars and nitrogen-dioxide production. Genes were ranked by their importance (predictive value) for correctly predicting the phenotype of a given strain. In addition to known gene-phenotype relations, we also found novel relations.

Conclusions: PhenoLink is an easily accessible web-tool that facilitates identifying relations in large and often noisy phenotype and ~omics datasets. The visualization of links to phenotypes offered in PhenoLink allows prioritizing links, finding relations between features, finding relations between phenotypes, and identifying outliers in phenotype data. PhenoLink can be used to uncover phenotype links to a multitude of ~omics data, e.g., gene presence/absence (determined by, e.g., CGH or next-generation sequencing), gene expression (determined by, e.g., microarrays or RNA-seq), or metabolite abundance (determined by, e.g., GC-MS).
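The workflow described in the abstract (impute missing values, then rank genes by how well their presence/absence predicts a phenotype) can be illustrated with a small sketch. This is not PhenoLink's implementation: the abstract does not specify the imputation method or the Random Forest settings, so the sketch assumes mode imputation and replaces the full Random Forest with a much simplified stand-in, a bagged ensemble of one-split trees (decision stumps) with random gene subsets. The gene names, the data, and the functions `impute_mode` and `rank_genes` are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of binary (0/1) phenotype labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def impute_mode(matrix):
    """Fill missing entries (None) with the most common value of that gene column.
    Mode imputation is an assumption; the abstract does not name the method."""
    filled = [row[:] for row in matrix]
    for g in range(len(matrix[0])):
        observed = [row[g] for row in matrix if row[g] is not None]
        mode = 1 if observed and sum(observed) * 2 >= len(observed) else 0
        for row in filled:
            if row[g] is None:
                row[g] = mode
    return filled

def rank_genes(matrix, phenotype, n_stumps=500, seed=0):
    """Rank genes by the total Gini gain they achieve as the best split in
    bootstrapped decision stumps (a simplified stand-in for Random Forest)."""
    rng = random.Random(seed)
    n_strains, n_genes = len(matrix), len(matrix[0])
    n_try = max(1, int(n_genes ** 0.5))  # genes considered per stump
    importance = [0.0] * n_genes
    for _ in range(n_stumps):
        # Bootstrap: sample strains with replacement.
        sample = [rng.randrange(n_strains) for _ in range(n_strains)]
        parent = gini([phenotype[i] for i in sample])
        best_gene, best_gain = None, 0.0
        for g in rng.sample(range(n_genes), n_try):
            present = [phenotype[i] for i in sample if matrix[i][g] == 1]
            absent = [phenotype[i] for i in sample if matrix[i][g] == 0]
            weighted = (len(present) * gini(present)
                        + len(absent) * gini(absent)) / n_strains
            gain = parent - weighted
            if gain > best_gain:
                best_gene, best_gain = g, gain
        if best_gene is not None:
            importance[best_gene] += best_gain
    order = sorted(range(n_genes), key=lambda g: importance[g], reverse=True)
    return order, importance

# Hypothetical toy data: 12 strains x 4 genes (1 = present, 0 = absent, None = missing).
genes = ["gene_A", "gene_B", "gene_C", "gene_D"]
phenotype = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # e.g., growth on a sugar
matrix = [
    [1, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 0, 0, 0],
    [1, None, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 0, 1],
    [0, 1, 1, 1], [0, 0, 1, None], [0, 1, 0, 0], [0, 0, 0, 0],
]

filled = impute_mode(matrix)
order, importance = rank_genes(filled, phenotype)
for g in order:
    print(f"{genes[g]}: {importance[g]:.2f}")
```

In this toy data `gene_A` is present in exactly the strains that show the phenotype, so it should come out at the top of the ranking, while the other genes receive only the small importance that chance correlations in the bootstrap samples produce.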
