TEXT DATA STUDY ANALYSIS SYSTEM, TEXT DATA STUDY DEVICE, TEXT DATA ANALYSIS DEVICE, ITS METHOD AND PROGRAM

Abstract

PROBLEM TO BE SOLVED: To provide a text data study analysis system that analyzes text data simply, without using a dictionary tailored to the target text and without dividing the text data by content; and to provide a text data study device, a text data analysis device, and a method and program therefor.

SOLUTION: An extraction means 102 extracts, from study (training) data, a plurality of features that characterize the study data. A generation means 105 generates vectors indicating whether each feature is included in each item of text data. A division means 104 divides the vectors, on the basis of the classes included in the study data, into belonging vectors, which belong to a given class, and non-belonging vectors, which do not. A computation means 106 computes, for each class and based on the belonging vectors, a model for determining whether an arbitrary vector is a belonging vector. A presumption means 108 applies each vector to each of the plurality of models and presumes the class that matches the content of the corresponding text data. A calculation means 109 calculates, for each class, the frequencies at which several features selected from the plurality of features appear in evaluation data, and the features related to each class are selected on the basis of these per-class frequencies.

COPYRIGHT: (C)2006,JPO&NCIPI
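
The processing steps named in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how features are extracted or what form the per-class model takes, so the whitespace tokenization, the `min_count` cutoff, the centroid-style model, and the 0.5 threshold are all assumptions, and every function name is hypothetical. For simplicity the sketch counts all features per class rather than a selected subset.

```python
from collections import Counter, defaultdict

def extract_features(texts, min_count=2):
    """Assumed rule: keep tokens that occur in at least `min_count` texts."""
    doc_freq = Counter(tok for t in texts for tok in set(t.lower().split()))
    return sorted(tok for tok, n in doc_freq.items() if n >= min_count)

def to_vector(text, features):
    """Binary vector: 1 if the feature occurs in the text, else 0."""
    tokens = set(text.lower().split())
    return [1 if f in tokens else 0 for f in features]

def split_by_class(vectors, labels, cls):
    """Divide vectors into those belonging to `cls` and those that do not."""
    belonging = [v for v, lab in zip(vectors, labels) if lab == cls]
    non_belonging = [v for v, lab in zip(vectors, labels) if lab != cls]
    return belonging, non_belonging

def train_models(vectors, labels):
    """One model per class. Assumed form: mean of the belonging vectors."""
    models = {}
    for cls in sorted(set(labels)):
        belonging, _ = split_by_class(vectors, labels, cls)
        models[cls] = [sum(col) / len(belonging) for col in zip(*belonging)]
    return models

def estimate_classes(vector, models, threshold=0.5):
    """Apply the vector to every model; keep classes scoring >= threshold."""
    active = sum(vector)
    hits = []
    for cls, centroid in models.items():
        # Score: average model weight over the features present in the text.
        score = sum(c for c, x in zip(centroid, vector) if x) / active if active else 0.0
        if score >= threshold:
            hits.append(cls)
    return hits

def class_feature_frequencies(texts, features, models):
    """Count, per estimated class, how often each feature appears."""
    freq = defaultdict(Counter)
    for text in texts:
        vec = to_vector(text, features)
        for cls in estimate_classes(vec, models):
            freq[cls].update(f for f, x in zip(features, vec) if x)
    return freq

def select_related_features(freq, top_k=1):
    """Pick the `top_k` most frequent features for each class."""
    return {cls: [f for f, _ in counts.most_common(top_k)]
            for cls, counts in freq.items()}

# Toy data, invented purely for illustration.
train_texts = ["battery drains fast", "battery life is short",
               "screen is cracked", "screen flickers often"]
train_labels = ["battery", "battery", "screen", "screen"]
eval_texts = ["the battery died", "battery is hot", "screen is dim",
              "screen cracked again", "no keywords here"]

features = extract_features(train_texts)
vectors = [to_vector(t, features) for t in train_texts]
models = train_models(vectors, train_labels)
freq = class_feature_frequencies(eval_texts, features, models)
print(select_related_features(freq))
```

Because each class gets its own belonging/non-belonging decision, a text may be assigned to several classes or to none; the last evaluation text above shares no feature with the training data and is left unclassified.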

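Bibliographic data

  • Publication number: JP2006085634A
  • Patent type: (not stated)
  • Publication date: 2006-03-30
  • Original format: PDF
  • Applicant/patentee: TOSHIBA CORP
  • Application number: JP20040272377
  • Inventor: SAKURAI SHIGEAKI
  • Filing date: 2004-09-17
  • Classification: G06F17/30
  • Country: JP
  • Database entry time: 2022-08-21 21:53:16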
