Word / collocation classification processing method, collocation extraction method, word / collocation classification processing device, speech recognition device, machine translation device, collocation extraction device, and word / collocation storage medium

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy of speech recognition and machine translation by classifying the words and compound words (collocations) in a text together, producing classes in which words and compound words are mixed.

SOLUTION: The word and compound word classification processor consists of a word classifying means 1, a word class string generating means 2, a word class string extracting means 3, a token assigning means 4, a word and token string generating means 5, a word and token classifying means 6, and a compound word substituting means 7. The word classes obtained by classifying the words are mapped onto the linear sequence of words in the text data to generate a linear sequence of word classes. From this sequence, word class strings in which the adhesion between every pair of adjacent word classes is above a specified value are extracted, and a token is assigned to each such word class string. The words and tokens are then classified together, after which each token is replaced by the compound words belonging to its corresponding word class string. In this way, classification is performed automatically without distinguishing between words and compound words.
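The extraction and token steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how words are classified or how adhesion is measured, so the sketch assumes a hand-written `word_class` table and uses the Dice coefficient over adjacent class pairs as a stand-in adhesion score. All function names, the token format `<T0>`, and the threshold are hypothetical. The joint classification of words and tokens (means 6) is left out because the abstract gives no detail on it.

```python
from collections import Counter, defaultdict

def class_sequence(words, word_class):
    """Map each word of the text to its word class (cf. means 2)."""
    return [word_class[w] for w in words]

def adhesion(classes):
    """Score each adjacent class pair; Dice coefficient is an assumed stand-in."""
    unigrams = Counter(classes)
    bigrams = Counter(zip(classes, classes[1:]))
    return {(a, b): 2 * c / (unigrams[a] + unigrams[b])
            for (a, b), c in bigrams.items()}

def extract_runs(classes, scores, threshold):
    """Return (start, end) spans, end exclusive and length >= 2, in which
    every adjacent class pair scores above the threshold (cf. means 3)."""
    spans, start = [], 0
    for i in range(1, len(classes) + 1):
        linked = (i < len(classes)
                  and scores[(classes[i - 1], classes[i])] > threshold)
        if not linked:
            if i - start >= 2:
                spans.append((start, i))
            start = i
    return spans

def tokenize(words, classes, spans):
    """Replace each span by a token shared by all spans with the same class
    string, and record the compound words each token stands for
    (cf. means 4, 5 and 7)."""
    token_of = {}                # class string -> token
    members = defaultdict(set)   # token -> compound words observed in the text
    out, pos = [], 0
    for s, e in spans:
        out.extend(words[pos:s])
        key = tuple(classes[s:e])
        token = token_of.setdefault(key, f"<T{len(token_of)}>")
        members[token].add(" ".join(words[s:e]))
        out.append(token)
        pos = e
    out.extend(words[pos:])
    return out, dict(members)

# Toy data; the class labels are invented for illustration only.
word_class = {
    "new": "A", "york": "P", "jersey": "P", "is": "V",
    "big": "J", "small": "J", "cold": "J",
    "and": "C", "but": "C", "it": "N",
}
words = "new york is big and new jersey is small but it is cold".split()
classes = class_sequence(words, word_class)
spans = extract_runs(classes, adhesion(classes), threshold=0.9)
mixed, members = tokenize(words, classes, spans)
print(mixed)
print(members)
```

On this toy text, "new york" and "new jersey" share one class string and therefore one token, so a later classifier would treat them as a single unit alongside ordinary words; substituting the token back yields both compound words.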

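Bibliographic data

  • Publication number: JP3875357B2
  • Patent type:
  • Publication date: 2007-01-31
  • Original format: PDF
  • Applicant / patentee: 富士通株式会社 (Fujitsu Limited)
  • Application number: JP19970167243
  • Inventor: 潮田 明 (Akira Ushioda)
  • Filing date: 1997-06-24
  • Classification: G10L15/06; G10L15/18; G06F17/28
  • Country: JP
  • Added to database: 2022-08-21 21:07:35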
