Analyzing Quantitative Databases: Image is Everything

Abstract

Traditional statistical methods deal with corroborating given hypotheses on a given body of data. However, generating the hypothesis itself is a matter of intuition and ingenuity. It is clearly impossible to test all hypotheses on a database with millions of records and hundreds of fields. There have been attempts to bridge this gap through data mining. Association generation is a method of creating such statistical hypotheses for binary data. For quantitative databases the situation is less satisfactory. A number of methods are known. One reduces the problem to binary data by creating intervals and then generating associations; this method is computationally expensive. Another suggested method generates associations that are statistically interesting. This method, too, has been tried only on small databases and is applicable only to binary relations, e.g., in certain ranges of field X, field Y lies significantly outside its average. We suggest a method that addresses some of the problems with the current techniques. Our idea is to use visualization techniques and image processing ideas to rank subsets of fields according to the relation between them in the database. This ranking suggests the hypotheses to be statistically investigated. Our method has the following advantages: 1. It is scalable. Our algorithm is mainly based on analyzing histograms of the data set and is therefore more efficient. It is also naturally suitable for sampling. 2. It is generalizable in the size of the set of fields. No current method handles more than a binary relation. 3. It affords comparability between fields over different base sets. This allows a uniform scale for different sets of fields in different databases. In this paper we present an algorithmic methodology and the results of its application to the Census Bureau databases cpsm93p and nhis93ac.
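
The histogram-based ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which image processing measure the authors apply, so the score below (mutual information of a joint histogram, normalized to the range 0 to 1) is a stand-in chosen for illustration and not the paper's algorithm. The function names, the bin count, and the demo fields `x`, `y`, `z` are all hypothetical, and the sketch covers only pairs of fields.

```python
import math
import random
from itertools import combinations


def bin_index(value, lo, hi, bins):
    """Map a value to an equal-width bin index in [0, bins - 1]."""
    if hi == lo:  # constant field: every record lands in one bin
        return 0
    idx = int((value - lo) / (hi - lo) * bins)
    return min(idx, bins - 1)  # the maximum value goes into the last bin


def joint_histogram(xs, ys, bins=8):
    """Count records per (x-bin, y-bin) cell; the grid can be viewed as an image."""
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        grid[bin_index(x, x_lo, x_hi, bins)][bin_index(y, y_lo, y_hi, bins)] += 1
    return grid


def entropy(counts, total):
    """Shannon entropy (in nats) of a list of bin counts."""
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)


def dependence_score(grid):
    """Assumed stand-in measure: mutual information of the joint histogram,
    divided by the smaller marginal entropy so scores share a 0..1 scale."""
    total = sum(sum(row) for row in grid)
    row_sums = [sum(row) for row in grid]
    col_sums = [sum(col) for col in zip(*grid)]
    mi = 0.0
    for i, row in enumerate(grid):
        for j, c in enumerate(row):
            if c > 0:
                mi += c / total * math.log(c * total / (row_sums[i] * col_sums[j]))
    denom = min(entropy(row_sums, total), entropy(col_sums, total))
    return mi / denom if denom > 0 else 0.0


def rank_field_pairs(table, bins=8):
    """Score every pair of fields and return (score, field_a, field_b), best first."""
    scored = []
    for a, b in combinations(sorted(table), 2):
        grid = joint_histogram(table[a], table[b], bins)
        scored.append((dependence_score(grid), a, b))
    return sorted(scored, reverse=True)


def demo_table():
    """Hypothetical data: y depends on x exactly, z is independent noise."""
    rng = random.Random(0)
    x = list(range(200))
    return {"x": x, "y": [2 * v + 1 for v in x], "z": [rng.random() for _ in x]}


if __name__ == "__main__":
    for score, a, b in rank_field_pairs(demo_table()):
        print(f"{a}-{b}: {score:.3f}")
```

In this sketch the dependent pair (`x`, `y`) ranks ahead of the pairs involving the independent field `z`, so it would be the first candidate hypothesis to investigate statistically. Because the score is computed from bin counts only, the same code can be run on a random sample of the records, and the normalization keeps scores from different field pairs on one scale.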


Bibliographic record

Source
Twenty-Seventh International Conference on Very Large Data Bases, Sep 11-14, 2001, Roma, Italy | 2001 | pp. 89-98 | 10 pages
Conference location: Roma (IT)
Authors
Amihood Amir; Reuven Kashi; Nathan S. Netanyahu
Author affiliation

Department of Mathematics and Computer Science, Bar-Ilan University, 52900 Ramat-Gan, Israel

Original format: PDF
Language: eng
Chinese Library Classification: automation technology, computer technology

Similar literature

1. Analyzing Users' Retrieval Behaviours and Image Queries of a Photojournalism Image Database [J]. Hsin-liang Chen, Thomas Kochtanek, Christopher Sean Burns. The Canadian Journal of Information and Library Science. 2010, Issue 3.
2. qPCR-DAMS: a database tool to analyze, manage, and store both relative and absolute quantitative real-time PCR data [J]. Jin N, He KY, Liu L. Physiological genomics. 2006, Issue 0.
3. DryMass: handling and analyzing quantitative phase microscopy images of spherical, cell-sized objects [J]. Paul Müller, Gheorghe Cojoc, Jochen Guck. BMC Bioinformatics. 2020, Issue 1.
4. Analyzing Quantitative Databases: Image is Everything [C]. Nathan S. Netanyahu, Reuven Kashi, Amihood Amir. International conference on very large data bases. 2001.
5. The database implementation and algorithm design of qPCR-DAMS: A database tool to analyze, manage, and store quantitative real-time PCR data [D]. He, Keyu. 2007.
6. DryMass: handling and analyzing quantitative phase microscopy images of spherical cell-sized objects [O]. Paul Müller, Gheorghe Cojoc, Jochen Guck. 2020.
7. Applications of satellite images and field databases to analyze agroforestry systems in Brazil [O]. Édson Luis Bolfe. 2020.
8. A new generation of intelligent trainable tools for analyzing large scientific image databases [R]. Fayyad, Usama M., Smyth, Padhraic, Atkinson, David J. 1994.
