A SCALABLE SYSTEM FOR CLUSTERING OF LARGE DATABASES HAVING MIXED DATA ATTRIBUTES
Abstract
In one exemplary embodiment, the invention provides a data mining system for finding clusters of data items in a database or any other data storage medium. Before data evaluation begins, a choice is made of the number M of models to be explored and the number K of clusters within each of the M models. The clusters are used to categorize the data in the database into K different clusters within each model. An initial estimate of the data distribution is provided for each model to be explored. A portion of the data in the database is then read from the storage medium into a rapid-access memory buffer, whose size is determined by the user or the operating system according to the available memory resources. The data contained in the buffer is used to update the original model data distributions for each of the K clusters across all M models. Some of the data belonging to a cluster is summarized, or compressed, and stored in a reduced form that represents the sufficient statistics of that data. More data is then accessed from the database and the models are updated again. An updated set of cluster parameters is determined from the summarized data (the sufficient statistics) together with the newly acquired data. Stopping criteria are evaluated to determine whether more data needs to be read from the database.
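The buffered update loop described in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: it assumes purely numeric attributes (the mixed-attribute handling named in the title is not covered), a K-means-style nearest-center assignment, and a simplification in which every buffered point is folded into the per-cluster summaries (a count and a per-dimension sum), whereas the abstract says only some of the data is compressed. The names `cluster_in_buffers`, `nearest`, and `tol` are hypothetical and chosen only for illustration.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )


def cluster_in_buffers(chunks, initial_models, tol=1e-6):
    """Update M models, each with K clusters, one buffer load at a time.

    chunks         -- iterable of lists of numeric tuples; each list stands in
                      for one buffer read from the database (assumption)
    initial_models -- M lists of K initial centers (the initial estimates)
    tol            -- hypothetical stopping threshold on center movement
    """
    # One state record per model: centers plus per-cluster sufficient statistics.
    states = []
    for init in initial_models:
        k, dim = len(init), len(init[0])
        states.append({
            "centers": [list(c) for c in init],
            "counts": [0] * k,
            "sums": [[0.0] * dim for _ in range(k)],
        })

    for chunk in chunks:
        largest_shift = 0.0
        for st in states:
            # Fold each buffered point into the summary of its nearest cluster.
            for x in chunk:
                j = nearest(x, st["centers"])
                st["counts"][j] += 1
                for d, v in enumerate(x):
                    st["sums"][j][d] += v
            # Re-estimate each center from the summaries alone.
            for j, n in enumerate(st["counts"]):
                if n == 0:
                    continue  # empty cluster keeps its previous center
                new_center = [s / n for s in st["sums"][j]]
                shift = max(abs(a - b) for a, b in zip(new_center, st["centers"][j]))
                largest_shift = max(largest_shift, shift)
                st["centers"][j] = new_center
        # Stopping criterion (assumed): no center in any model moved by tol or more.
        if largest_shift < tol:
            break

    return [(st["centers"], st["counts"]) for st in states]


if __name__ == "__main__":
    buffers = [[(0.0,), (10.0,)], [(1.0,), (11.0,)], [(-1.0,), (9.0,)]]
    starts = [[(2.0,), (8.0,)], [(5.0,), (6.0,)]]  # M = 2 models, K = 2 clusters
    for centers, counts in cluster_in_buffers(buffers, starts, tol=1e-3):
        print(centers, counts)
```

Because each center is recomputed from the stored counts and sums, the raw points in a buffer can be dropped once they have been summarized, which is what keeps memory use bounded by the buffer size rather than by the size of the database.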