Journal: Plant and Cell Physiology

Plant Omics Data Center: An Integrated Web Repository for Interspecies Gene Expression Networks with NLP-Based Curation


Abstract

Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources.
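The abstract states that the GENs are drawn with correspondence analysis (CA) to exploit gene expression similarities, but gives no implementation details. The following is a minimal sketch of that idea, not the PODC pipeline itself: it assumes a non-negative genes-by-samples expression matrix with no all-zero rows or columns, computes CA gene coordinates through a singular value decomposition, and connects genes whose coordinates lie within a distance cutoff. The function names, the toy values and the cutoff are all hypothetical.

```python
import numpy as np


def correspondence_analysis(counts, n_dims=2):
    """Return principal coordinates of the rows (genes) of a non-negative matrix.

    Assumes no row or column sums to zero (illustrative sketch only).
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row (gene) masses
    c = P.sum(axis=0)                    # column (sample) masses
    # Standardized residuals from the independence model r * c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    # Row principal coordinates: D_r^{-1/2} U Sigma
    F = (U * sigma) / np.sqrt(r)[:, None]
    return F[:, :n_dims]


def network_edges(coords, labels, max_dist):
    """Connect every pair of genes whose CA coordinates are within max_dist."""
    edges = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if np.linalg.norm(coords[i] - coords[j]) <= max_dist:
                edges.append((labels[i], labels[j]))
    return edges


# Hypothetical toy data: geneA and geneB share the same expression profile
# (geneB is geneA scaled by 2); geneC has a different profile.
genes = ["geneA", "geneB", "geneC"]
toy = np.array([
    [10, 20, 30],
    [20, 40, 60],
    [30, 10, 5],
])

coords = correspondence_analysis(toy, n_dims=2)
edges = network_edges(coords, genes, max_dist=0.1)  # cutoff is an arbitrary choice
print(edges)
```

Because CA compares expression profiles rather than absolute levels, two genes whose expression differs only by a constant factor receive the same coordinates, so they are joined in the toy network while the third gene remains unconnected.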
