METHOD AND APPARATUS FOR SCALABLE PROBABILISTIC CLUSTERING USING DECISION TREES
Abstract
Some embodiments of the invention include methods for identifying clusters in a database, data warehouse or data mart. The identified clusters can be meaningfully understood from a list of the attributes and corresponding values for each of the clusters. Some embodiments of the invention include a method for scalable probabilistic clustering using a decision tree. Some embodiments of the invention run in time linear in the size of the set of data and require only a single access to the set of data. Some embodiments of the invention produce interpretable clusters that can be described in terms of a set of attributes and attribute values for that set of attributes. In some embodiments, the cluster can be interpreted by reading the attributes and attribute values on the path from the root node of the decision tree to the node of the decision tree corresponding to the cluster. In some embodiments, a domain-specific distance function for the attributes is not necessary. In some embodiments, a cluster is determined by identifying the attribute with the highest influence on the distribution of the other attributes. Each of the values assumed by the identified attribute corresponds to a cluster and to a node in the decision tree. In some embodiments, the CUBE operation is used to access the set of data a single time, and the result is used to compute the influence and to perform other calculations.
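The split step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how influence is measured, so this sketch assumes it is the sum of the mutual information between an attribute and each other attribute; that measure, the function names, and the toy data are all hypothetical and not taken from the patent. A single pass that gathers per-attribute and per-pair counts stands in for the CUBE operation, and only one level of the decision tree is built.

```python
from collections import Counter
from itertools import combinations
from math import log


def pairwise_counts(rows, attrs):
    """Read the data once, counting values per attribute and per attribute pair.

    This stands in for the CUBE operation (an assumption of this sketch).
    """
    single = {a: Counter() for a in attrs}
    pair = {p: Counter() for p in combinations(attrs, 2)}
    n = 0
    for row in rows:
        n += 1
        for a in attrs:
            single[a][row[a]] += 1
        for a, b in pair:
            pair[(a, b)][(row[a], row[b])] += 1
    return n, single, pair


def mutual_information(n, count_a, count_b, count_ab):
    """Mutual information (in nats) of two attributes from their counts."""
    mi = 0.0
    for (va, vb), c in count_ab.items():
        mi += (c / n) * log(c * n / (count_a[va] * count_b[vb]))
    return mi


def influence(attrs, n, single, pair):
    """Assumed influence: sum of mutual information with every other attribute."""
    score = {a: 0.0 for a in attrs}
    for (a, b), count_ab in pair.items():
        mi = mutual_information(n, single[a], single[b], count_ab)
        score[a] += mi
        score[b] += mi
    return score


def split_on_most_influential(rows, attrs):
    """Pick the highest-influence attribute; each of its values is one cluster."""
    n, single, pair = pairwise_counts(rows, attrs)
    score = influence(attrs, n, single, pair)
    best = max(attrs, key=lambda a: score[a])
    # Cluster sizes come from the counts, so the data is not read again.
    clusters = dict(single[best])
    return best, clusters


# Hypothetical toy data: "color" determines both "size" and "shape".
rows = [
    {"color": "red", "size": "big", "shape": "round"},
    {"color": "blue", "size": "small", "shape": "round"},
    {"color": "green", "size": "small", "shape": "square"},
]
best, clusters = split_on_most_influential(rows, ["color", "size", "shape"])
print(best, clusters)
```

In this toy data the split falls on "color", so each cluster is described by a single attribute and value (for example, color = red). Repeating the split inside each cluster would extend the path from the root node and add further attribute and value pairs to the description, with no distance function involved at any point.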