TEXT CLASSIFICATION DEVICE, TEXT CLASSIFICATION METHOD, AND TEXT CLASSIFICATION PROGRAM

Abstract

A text classification device includes five portions. An important word extraction portion extracts important words from analysis target text data. A distributed representation creation portion creates distributed representations of words from related document data. A keyword candidate creation portion extracts, as synonyms, words that lie near the important words in those distributed representations. A clustering portion clusters the distributed representations of the important words and synonyms to create a term cluster. A viewpoint word creation portion uses a knowledge base in which relationships between terms are accumulated to extract hypernyms, that is, words expressing a generalized concept of a term in the term cluster. It then creates a viewpoint dictionary in which a viewpoint word selected from the hypernyms is set as a headword and the terms included in the term cluster are set as keywords for that headword.
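The pipeline in the abstract can be illustrated with a small, self-contained Python sketch. It is not the patented implementation. Every name and value in it is hypothetical: the toy 2-D vectors stand in for distributed representations learned from related documents, the `HYPERNYMS` table stands in for the knowledge base, frequency counting stands in for important word extraction, a greedy cosine-threshold grouping stands in for clustering, and a majority vote stands in for selecting the viewpoint word, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical toy data: 2-D vectors stand in for distributed representations
# learned from related documents; HYPERNYMS stands in for the knowledge base.
EMBEDDINGS = {
    "price": (1.0, 0.1), "cost": (0.9, 0.2), "fee": (0.95, 0.0),
    "battery": (0.1, 1.0), "charger": (0.2, 0.9), "cable": (0.0, 0.95),
}
HYPERNYMS = {
    "price": ["expense"], "cost": ["expense"], "fee": ["expense", "payment"],
    "battery": ["accessory"], "charger": ["accessory"],
    "cable": ["accessory", "wire"],
}
TEXTS = [
    "the price is too high",
    "battery drains fast and the price is high",
    "battery died",
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def important_words(texts, vectors, top_k):
    """Assumed stand-in: the most frequent words that have a vector."""
    counts = Counter(
        w for t in texts for w in t.lower().split() if w in vectors
    )
    return [w for w, _ in counts.most_common(top_k)]


def nearby_words(word, vectors, threshold):
    """Words whose vectors lie near `word`, treated as synonym candidates."""
    return [
        w for w, v in vectors.items()
        if w != word and cosine(vectors[word], v) >= threshold
    ]


def cluster_terms(terms, vectors, threshold):
    """Greedy grouping: join the first cluster whose first term is close."""
    clusters = []
    for term in terms:
        for cluster in clusters:
            if cosine(vectors[term], vectors[cluster[0]]) >= threshold:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters


def build_viewpoint_dictionary(texts, vectors, hypernyms, top_k=2, threshold=0.9):
    """Map each viewpoint word (headword) to the keywords of its term cluster."""
    important = important_words(texts, vectors, top_k)
    terms = list(important)
    for word in important:
        for candidate in nearby_words(word, vectors, threshold):
            if candidate not in terms:
                terms.append(candidate)

    dictionary = {}
    for cluster in cluster_terms(terms, vectors, threshold):
        # Assumption: the viewpoint word is the hypernym shared by most terms.
        votes = Counter(h for t in cluster for h in hypernyms.get(t, []))
        if votes:
            headword = votes.most_common(1)[0][0]
            dictionary.setdefault(headword, []).extend(sorted(cluster))
    return dictionary


VIEWPOINTS = build_viewpoint_dictionary(TEXTS, EMBEDDINGS, HYPERNYMS)
```

With this toy data, `VIEWPOINTS` maps the headword `expense` to `cost`, `fee`, and `price`, and the headword `accessory` to `battery`, `cable`, and `charger`. In a real system the vectors would more plausibly come from a trained embedding model and the hypernyms from a lexical resource or ontology.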

Bibliographic data

  • Publication number: US2022083581A1

  • Publication date: 2022-03-17

    Original format: PDF

  • Applicant/Assignee: HITACHI LTD.

    Application number: US202117203993

  • Inventors: YASUHIRO SOGAWA; MISA SATO; KOHSUKE YANAI

    Filing date: 2021-03-17

  • Classification: G06F16/35; G06F16/33; G06F40/205; G06F40/279; G06F40/242

  • Country: US

  • Database entry time: 2022-08-24 23:54:20
