Journal of Intelligent Information Systems

Use and Maintenance of Histograms for Large Scientific Database Access Planning: A Case Study of a Pharmaceutical Data Repository

Abstract

Scientific databases, in particular chemical and biological databases, have reached massive sizes in recent years owing to improvements in the bench-side high-throughput screening tools used by scientists. This rapid growth has shifted the bottleneck in discovery and product development from the bench side to the computational side, creating a need for new computational tools that facilitate access to and interpretation of such massive data. This paper discusses the design and implementation of a computation histogram to speed up access to large pharmaceutical databases. Unlike traditional histograms, in which approximate value distributions are obtained by grouping attribute values into buckets, the computation histogram proposed in this paper records the retrieval time and the calculation time of descriptors in a pharmaceutical drug candidate database. Both on-line and off-line update techniques are proposed to keep the computation histogram current so that an efficient query plan can be generated. The efficiency of the proposed computation histogram is demonstrated on a drug candidate database used in the pharmaceutical drug discovery process. The histogram allows the result of a query to be either computed using a computational algorithm or retrieved from the database. Beyond the pharmaceutical drug candidate database, the proposed approach is applicable to other scientific databases, such as biological and agroscience databases.
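The idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, the method names, the exponential smoothing used for the on-line update, the averaging used for the off-line rebuild, and the descriptor name `mol_weight` are all assumptions made for illustration. The sketch keeps one estimated retrieval time and one estimated calculation time per descriptor and picks whichever strategy is currently expected to be cheaper.

```python
RETRIEVE = "retrieve"
COMPUTE = "compute"


class ComputationHistogram:
    """Hypothetical per-descriptor cost table (not the paper's data structure)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # assumed smoothing weight for on-line updates
        self.costs = {}     # (descriptor, strategy) -> estimated seconds

    def update_online(self, descriptor, strategy, elapsed):
        """Blend one freshly observed timing into the stored estimate."""
        key = (descriptor, strategy)
        old = self.costs.get(key)
        if old is None:
            self.costs[key] = elapsed
        else:
            self.costs[key] = (1 - self.alpha) * old + self.alpha * elapsed

    def rebuild_offline(self, timing_log):
        """Replace all estimates with per-key means taken from a timing log."""
        sums = {}
        for descriptor, strategy, elapsed in timing_log:
            total, count = sums.get((descriptor, strategy), (0.0, 0))
            sums[(descriptor, strategy)] = (total + elapsed, count + 1)
        self.costs = {key: total / count for key, (total, count) in sums.items()}

    def plan(self, descriptor):
        """Pick the cheaper strategy; fall back to retrieval when costs are unknown."""
        retrieve = self.costs.get((descriptor, RETRIEVE), float("inf"))
        compute = self.costs.get((descriptor, COMPUTE), float("inf"))
        return COMPUTE if compute < retrieve else RETRIEVE


hist = ComputationHistogram(alpha=0.5)
hist.update_online("mol_weight", RETRIEVE, 0.5)   # observed database fetch time
hist.update_online("mol_weight", COMPUTE, 0.125)  # observed calculation time
print(hist.plan("mol_weight"))  # compute
```

In this sketch the on-line path adjusts an estimate after every query, while the off-line path recomputes all estimates in one batch from logged timings; how the paper actually performs either update is not stated in the abstract.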