Journal: Methods: A Companion to Methods in Enzymology

Textmining in support of knowledge discovery for vaccine development.


Abstract

Complete genome data of infectious microorganisms permit systematic computational sequence-based predictions and experimental testing of candidate vaccine epitopes. Both predictions and the interpretation of experiments rely on existing information in the literature, which is mostly manually extracted and curated. The growing amount of data and literature information has created a major bottleneck for the interpretation of results and maintenance of curated databases. The lack of suitable free-text information extraction, processing, and reporting tools prompted us to develop a knowledge discovery support system that enhances the understanding of immune response and vaccine development. The current prototype system, Gene expression/epitopes/protein interaction (GEpi), focuses on molecular functions of HIV-infected T-cells and HIV epitope information, using textmining and interrelation of biomolecular data from domain-specific databases with MEDLINE abstract-inferred information. Results showed that extraction and processing of molecular interaction, disease associations, and gene ontology-derived functional information generate intuitive knowledge reports that aid the interpretation of host-pathogen interaction. In contrast, epitope (word and sequence) information in MEDLINE abstracts is surprisingly sparse and often lacks necessary context information, such as HLA-restriction. Since the majority of epitope information is found in tables, figures, and legends of full-text articles, its extraction may not require sophisticated natural language processing techniques. Support of vaccine development through textmining therefore requires the timely development of domain-specific extraction rules for full-text articles, and a knowledge model for epitope-related information.
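To illustrate what a simple, non-NLP extraction rule for table or legend text might look like, here is a minimal Python sketch. It is not the GEpi implementation, which the abstract does not describe. The two regular expressions, the function name `extract_epitope_mentions`, and the 8-15 residue length window are all assumptions chosen for illustration. A rule this naive will also match ordinary capitalized words made only of amino-acid letters, so real use would need additional filtering.

```python
import re

# The 20 standard amino-acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed rule: a candidate epitope is a standalone run of 8-15 amino-acid letters.
PEPTIDE_RE = re.compile(rf"\b[{AMINO_ACIDS}]{{8,15}}\b")

# Assumed rule: HLA mentions such as "HLA-A2", "HLA-B27", or "HLA-A*02:01".
HLA_RE = re.compile(r"\bHLA-[A-Z]+\d*\*?\d+(?::\d+)*")


def extract_epitope_mentions(text):
    """Return (peptides, hla_alleles) found in text.

    Each list keeps the order of first appearance and drops duplicates.
    """
    peptides = list(dict.fromkeys(PEPTIDE_RE.findall(text)))
    alleles = list(dict.fromkeys(HLA_RE.findall(text)))
    return peptides, alleles


if __name__ == "__main__":
    # Invented sample legend, used only to exercise the rules.
    legend = (
        "Table 2. CTL response to Gag peptide SLYNTVATL (HLA-A*02:01) "
        "and to KRWIILGLNK (HLA-B27)."
    )
    print(extract_epitope_mentions(legend))
```

Pairing each peptide with its HLA restriction, rather than listing both separately, is the kind of context the abstract says is often missing. It would need a knowledge model beyond this sketch.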
