The tuples in a generalized relation (i.e., a summary generated from a database) are unique and can therefore be considered a population with a structure that can be described by some probability distribution. In this paper, we present and empirically compare sixteen heuristic measures that evaluate the structure of a summary to assign a single real-valued index that represents its interestingness relative to other summaries generated from the same database. The heuristics are based upon well-known measures of diversity, dispersion, dominance, and inequality used in several areas of the physical, social, ecological, management, information, and computer sciences. Their use for ranking summaries generated from databases is a new application area. All sixteen heuristics rank less complex summaries (i.e., those with few tuples and/or few non-ANY attributes) as most interesting. We demonstrate that, for sample data sets, the orders in which some of the measures rank summaries are highly correlated.
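As a rough illustration of the idea, the sketch below treats the tuple counts of a summary as a probability distribution and computes two well-known diversity measures, Shannon entropy and Simpson's concentration index. The abstract does not name the sixteen heuristics, so the choice of these two measures, the function names, and the example counts are all assumptions made for illustration, not the authors' definitions.

```python
import math

def tuple_probabilities(counts):
    """Convert per-tuple counts of a summary into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def simpson_concentration(probs):
    """Simpson's concentration index: the sum of squared probabilities."""
    return sum(p * p for p in probs)

# Hypothetical summaries, each given as the counts attached to its tuples.
summary_a = [50, 30, 20]        # three tuples, skewed counts
summary_b = [25, 25, 25, 25]    # four tuples, uniform counts

for name, counts in [("A", summary_a), ("B", summary_b)]:
    probs = tuple_probabilities(counts)
    print(name, shannon_entropy(probs), simpson_concentration(probs))
```

Each function maps a summary to a single real-valued index, so a set of summaries generated from the same database could be sorted by either one. Which direction counts as more interesting depends on the measure and is not specified by this sketch.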