Preparation and evaluation of the utility of multiple features based classification system using genetic algorithms

Abstract

The features that are presented to an evolutionary algorithm are preprocessed to generate combination features that may distinguish among classifications more efficiently than the individual features that make up each combination. An initial set of features is defined that includes a large number of potential features, including the generated features that are combinations of other features. These features include, for example, all of the words used in a collection of previously classified content material, as well as combination features based on those words, such as all the noun and verb phrases used. This pool of original and combination features is provided to an evolutionary algorithm, which evaluates candidate subsets, generates new ones, and determines the best subset of features to use for classification. In this evaluation and generation process, each combination feature is treated as an independent feature, regardless of whether the features used to form it are themselves selected. In this manner, for example, a particular phrase generated as a combination of original feature words may prove to be a better distinguishing feature than any of the original words, and a more efficient one than an unrelated selection of the individual words such as a conventional evolutionary algorithm might provide. The resulting best-performing subset is subsequently used to characterize new content material for automated classification. If the automated classification includes a learning system, the evolutionary algorithm and the generated combination features are also used to train the learning system.
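The idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the toy corpus, the use of adjacent word pairs as a stand-in for noun and verb phrases, the leave-one-out nearest-neighbour evaluator, the size penalty, and the genetic operators (truncation selection, one-point crossover, bit-flip mutation) are all assumptions chosen for brevity. The one point taken from the abstract is that every word and every combination feature gets its own independent position in the chromosome.

```python
import random

# Hypothetical toy corpus of previously classified texts (illustration only).
CORPUS = [
    ("cheap flights to new york", "travel"),
    ("new york hotel deals", "travel"),
    ("visit new york this summer", "travel"),
    ("new phone released in york", "tech"),
    ("york startup released new laptop", "tech"),
    ("new software update released", "tech"),
]

def extract_features(text):
    """Return single words plus adjacent word pairs (a crude phrase stand-in)."""
    words = text.split()
    pairs = [" ".join(p) for p in zip(words, words[1:])]
    return set(words) | set(pairs)

DOC_FEATURES = [(extract_features(text), label) for text, label in CORPUS]
# Pool of original and combination features; each one is an independent gene.
POOL = sorted(set().union(*(feats for feats, _ in DOC_FEATURES)))

def mask_for(names):
    """Build a boolean chromosome that selects exactly the given feature names."""
    return [f in names for f in POOL]

def accuracy(mask):
    """Leave-one-out 1-nearest-neighbour accuracy using only selected features."""
    selected = {f for f, keep in zip(POOL, mask) if keep}
    correct = 0
    for i, (feats_i, label_i) in enumerate(DOC_FEATURES):
        best_overlap, predicted = 0, None
        for j, (feats_j, label_j) in enumerate(DOC_FEATURES):
            if i == j:
                continue
            overlap = len(feats_i & feats_j & selected)
            if overlap > best_overlap:
                best_overlap, predicted = overlap, label_j
        correct += predicted == label_i
    return correct / len(DOC_FEATURES)

def fitness(mask):
    # Small size penalty (an assumption) so that compact subsets are preferred.
    return accuracy(mask) - 0.001 * sum(mask)

def evolve(generations=40, pop_size=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    n = len(POOL)
    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, keeps elites
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: toggle each gene with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return [f for f, keep in zip(POOL, best) if keep], fitness(best)

best_subset, best_score = evolve()
print(best_subset, round(best_score, 3))
```

On this toy data the words "new" and "york" occur in both classes, so selecting only those two words gives `accuracy(mask_for({"new", "york"}))` of 0.5, whereas the combination feature "new york" together with the word "released" separates the classes completely. This is the effect the abstract describes: a phrase can distinguish classes better than the words it was built from.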

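Bibliographic data

  • Publication number: DE60128405D1

    Patent type:

  • Publication date: 2007-06-21

    Original format: PDF

  • Applicant/assignee: KONINKLIJKE PHILIPS ELECTRONICS N.V.

    Application number: DE20016028405T

  • Inventor: SCHAFFER JAMES D.

    Filing date: 2001-01-11

  • Classification: G06F17/30; G06K9/62; G06N3/12

  • Country: DE

  • Date added to database: 2022-08-21 20:28:13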
