Method for estimating similarity function coefficients from object classification data

Abstract

Given a set of objects {A, B, C, ...}, each described by a set of attribute values, and given a classification of these objects into categories, a similarity function accounts well for this classification if only a small number of objects are incorrectly classified. This is achieved when coefficients are found for the similarity function that yield an error rate considered to be at an acceptable level. A method for estimating a coefficient for a similarity function comprises the steps of:

(a) selecting an initial value w = w_0 as the initial coefficient of a similarity function SIM_w(a, b);
(b) selecting an initial value k = k_0 defining the number of most-similar objects to use when testing the classification of an object;
(c) computing similarity measures, using the current estimate of SIM_w, from an object S to neighboring objects G_i in the given category and from the object S to neighboring objects B_j outside the given category, and forming a training set of respective data 〈S, SIM_w(S, G_i), SIM_w(S, B_j)〉 for members in the category and those outside the category;
(d) reducing the training set by eliminating all data except for the k nearest (most similar) objects among the G_i and among the B_j;
(e) estimating new coefficients for the similarity function by adjusting w to minimize an error rate measured in terms of the extent to which neighboring objects B_j outside the category have higher similarity measures than neighboring objects G_i within the category; and
(f) repeating steps (c), (d), and (e) until the error rate is reduced below a predetermined low level.
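Steps (a) through (f) can be illustrated with a minimal Python sketch. The abstract does not specify the form of SIM_w, the exact error measure, or the optimizer, so everything concrete here is an assumption: `sim` is a hypothetical negative weighted squared distance, `error` is the mean amount by which an outside neighbor's similarity exceeds an inside neighbor's, the weights are normalized to sum to 1 (otherwise shrinking all weights would trivially shrink this error), and `adjust` is a naive coordinate search standing in for whatever minimization the patent actually uses. All function and parameter names are illustrative.

```python
from itertools import product

def sim(w, a, b):
    # Hypothetical SIM_w: negative weighted squared distance (higher = more similar).
    return -sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b))

def normalize(w):
    # Assumption: keep the weights summing to 1 so the error cannot be
    # reduced merely by scaling every weight toward zero.
    total = sum(w)
    return [wi / total for wi in w]

def k_most_similar(w, objects, s, candidates, k):
    # Step (d): keep only the k candidates most similar to object s.
    return sorted(candidates, key=lambda t: sim(w, objects[s], objects[t]),
                  reverse=True)[:k]

def build_training_set(w, objects, labels, k):
    # Steps (c)-(d): for each object S, record its k nearest neighbors inside
    # its category (G_i) and its k nearest neighbors outside it (B_j).
    n = len(objects)
    training = []
    for s in range(n):
        inside = [t for t in range(n) if t != s and labels[t] == labels[s]]
        outside = [t for t in range(n) if labels[t] != labels[s]]
        training.append((s,
                         k_most_similar(w, objects, s, inside, k),
                         k_most_similar(w, objects, s, outside, k)))
    return training

def error(w, objects, training):
    # Assumed error measure: mean excess of SIM_w(S, B_j) over SIM_w(S, G_i).
    excess, pairs = 0.0, 0
    for s, good, bad in training:
        for g, b in product(good, bad):
            excess += max(0.0, sim(w, objects[s], objects[b])
                               - sim(w, objects[s], objects[g]))
            pairs += 1
    return excess / pairs if pairs else 0.0

def adjust(w, objects, training, factors=(0.5, 2.0)):
    # Step (e), as a naive coordinate search: rescale one weight at a time
    # and keep the change only if the error on the training set drops.
    best, best_err = list(w), error(w, objects, training)
    for i in range(len(w)):
        for f in factors:
            trial = list(best)
            trial[i] *= f
            trial = normalize(trial)
            err = error(trial, objects, training)
            if err < best_err:
                best, best_err = trial, err
    return best

def estimate(objects, labels, w0, k0, target=0.05, max_rounds=20):
    # Steps (a), (b), (f): start from w0 and k0, then repeat (c)-(e)
    # until the error falls below the target or max_rounds is reached.
    w = normalize(w0)
    for _ in range(max_rounds):
        training = build_training_set(w, objects, labels, k0)
        if error(w, objects, training) < target:
            break
        w = adjust(w, objects, training)
    return w, error(w, objects, build_training_set(w, objects, labels, k0))

# Toy data: the first attribute separates the two categories,
# the second attribute is irrelevant to the classification.
objects = [(0.0, 0.0), (0.1, 5.0), (0.2, 10.0),
           (1.0, 0.0), (1.1, 5.0), (1.2, 10.0)]
labels = [0, 0, 0, 1, 1, 1]
w, err = estimate(objects, labels, w0=[1.0, 1.0], k0=1)
print(w, err)
```

On this toy data the search shifts weight toward the first attribute, which is the one that actually distinguishes the categories. Rebuilding the training set in every round matters because the k nearest neighbors of an object can change whenever w changes.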

Bibliographic data

Publication number: EP0513653A2

Patent type:
Publication date: 1992-11-19

Original format: PDF
Applicant/patentee: SIEMENS AKTIENGESELLSCHAFT
Application number: EP19920107661
Inventors: SCHWANKE ROBERT W.; HANSON STEPHEN J.
Filing date: 1992-05-06
Classification: G06F9/44; G06F15/353; G06F15/80
Country: EP
Added to database: 2022-08-22 05:06:57
