We describe the problem of mining possibilistic set-valued rules in large relational tables containing categorical attributes taking a finite number of values. An example of such a rule might be "IF HOUSEHOLDSIZE = (Two OR Tree) AND OCCUPATION = (Professional OR Clerical) THEN PAYMENT_METHOD = (CashCheck (Max = 249) OR DebitCard (Max = 175)). The table semantics is supposed to be represented by a frequency distribution, which is interpreted with the help of minimum and maximum operations as a possibility distribution over the corresponding finite multidimensional space. This distribution is approximated by a number of possibilistic prime disjunctions, which represent the strongest patterns. We present an original formal framework generalising the conventional boolean approach on the case of (i) finite-valued variables and (ii) continuos-valued semantics, and propose a new algorithm, called Optimist, for the computationally difficult dual transformation which generates all the strongest prime disjunctions (possibilistic patterns) given a table of data. The algorithm consists of generation, absorption and filtration parts. The generated prime disjunctions can then be used to build rules or for prediction purposes.
展开▼