Journal: Bioinformatics

The Merck Gene Index browser: an extensible data integration system for gene finding, gene characterization and EST data mining.

Abstract

MOTIVATION: To make effective use of the vast amounts of expressed sequence tag (EST) sequence data generated by the Merck-sponsored EST project and other similar efforts, sequences must be organized into gene classes, and scientists must be able to 'mine' the gene class data in the context of related genomic data.

RESULTS: This paper presents the Merck Gene Index browser, an easily extensible, World Wide Web-based system for mining the Merck Gene Index (MGI) and related genomic data. The MGI is a non-redundant set of clones and sequences, each representing a distinct gene, constructed from all high-quality 3' EST sequences generated by the Merck-sponsored EST project. The MGI browser integrates data from a variety of sources and storage formats, both local and remote, using an eclectic integration strategy that includes a federation of relational databases, a local data warehouse and simple hypertext links. Data currently integrated include: LENS cDNA clone and EST data, dbEST protein and non-EST nucleic acid similarity data, WashU sequence chromatograms, Entrez sequence and Medline entries, and UniGene gene clusters. Flatfile sequence data are accessed using the Bioapps server, an internally developed client-server system that supports generic sequence analysis applications. Browser data are retrieved and formatted by means of the Bioinformatics Data Integration Toolkit (B-DIT), a new suite of Perl routines.
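The three integration styles named in the abstract (a federation of relational databases, a local data warehouse, and simple hypertext links) can be illustrated with a minimal sketch. This is not the browser's actual implementation, which the abstract says was built on Perl routines (B-DIT). The sketch uses Python with in-memory SQLite databases as stand-ins, and every table name, identifier, sample value, and the `example.org` link template is a hypothetical placeholder chosen for illustration.

```python
import sqlite3
from urllib.parse import quote

# Federation (assumed layout): two independent relational databases,
# each queried in place at report time.
clone_db = sqlite3.connect(":memory:")
clone_db.execute("CREATE TABLE clone (gene_id TEXT, clone_id TEXT)")
clone_db.execute("INSERT INTO clone VALUES ('G1', 'CLONE-0001')")

cluster_db = sqlite3.connect(":memory:")
cluster_db.execute("CREATE TABLE cluster (gene_id TEXT, cluster_id TEXT)")
cluster_db.execute("INSERT INTO cluster VALUES ('G1', 'CL-42')")

# Local warehouse (assumed layout): similarity results copied ahead of
# time into one local table, so no remote call is needed to read them.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE similarity (gene_id TEXT, hit TEXT, score REAL)")
warehouse.executemany(
    "INSERT INTO similarity VALUES (?, ?, ?)",
    [("G1", "HIT-A", 98.5), ("G1", "HIT-B", 71.0)],
)

# Hypertext links: remote records are not fetched, only pointed to.
# The URL below is a placeholder, not a real service endpoint.
LINK_TEMPLATE = "https://example.org/entry?id={id}"


def gene_report(gene_id):
    """Assemble one gene's view from all three kinds of source."""
    clones = [
        row[0]
        for row in clone_db.execute(
            "SELECT clone_id FROM clone WHERE gene_id = ?", (gene_id,)
        )
    ]
    clusters = [
        row[0]
        for row in cluster_db.execute(
            "SELECT cluster_id FROM cluster WHERE gene_id = ?", (gene_id,)
        )
    ]
    hits = warehouse.execute(
        "SELECT hit, score FROM similarity WHERE gene_id = ? ORDER BY score DESC",
        (gene_id,),
    ).fetchall()
    links = [LINK_TEMPLATE.format(id=quote(hit)) for hit, _ in hits]
    return {
        "gene_id": gene_id,
        "clones": clones,
        "clusters": clusters,
        "similarity": hits,
        "links": links,
    }


if __name__ == "__main__":
    print(gene_report("G1"))
```

The point of the sketch is the trade-off among the three styles: federated sources stay current but must be reachable at query time, warehoused data is fast to read but needs periodic refreshing, and hypertext links cost almost nothing to add but leave the user to follow them.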