APPARATUS AND METHOD FOR GENERATING PSEUDOMORPHEME-BASED SPEECH RECOGNITION UNITS BY UNSUPERVISED SEGMENTATION AND MERGING

Abstract

An apparatus for generating pseudomorpheme-based speech recognition units according to the present invention includes a sub-morpheme segmentation unit, which extracts smaller morpheme units (sub-morphemes) from a set of input syntactic word units, and a sub-morpheme merging unit, which generates pseudomorpheme units by pairing the sub-morphemes that occur together with the highest frequency. Because the pseudomorpheme units are generated by unsupervised segmentation and merging, the invention offers two benefits. First, the out-of-vocabulary (OOV) rate is reduced, since unregistered words such as proper nouns, loanwords, and compound words, which are hard to analyze with an existing morphological analyzer, are segmented by an unsupervised method. Second, recognition performance can be improved by merging high-frequency words.
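The abstract does not say how segmentation or merging is actually carried out, so the following is only a minimal sketch of the general idea, not the patented method. It assumes a byte-pair-encoding-style procedure: each word is first split into single characters (a stand-in for the unsupervised segmentation step), and the most frequent adjacent pair of sub-units is then merged repeatedly. The names `count_pairs`, `merge_pair`, and `build_units`, and the parameter `num_merges`, are hypothetical and chosen for illustration.

```python
from collections import Counter


def count_pairs(corpus):
    """Count adjacent sub-unit pairs, weighted by word frequency."""
    pairs = Counter()
    for units, freq in corpus.items():
        for left, right in zip(units, units[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(units, pair):
    """Replace every occurrence of `pair` in `units` with the joined unit."""
    merged, i = [], 0
    while i < len(units):
        if i + 1 < len(units) and (units[i], units[i + 1]) == pair:
            merged.append(units[i] + units[i + 1])
            i += 2
        else:
            merged.append(units[i])
            i += 1
    return tuple(merged)


def build_units(word_counts, num_merges):
    """Return the re-segmented corpus and the list of merges applied.

    Assumption: character-level splitting stands in for the
    sub-morpheme segmentation step, which the abstract does not specify.
    """
    corpus = Counter()
    for word, freq in word_counts.items():
        corpus[tuple(word)] += freq

    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(corpus)
        if not pairs:
            break  # every word is already a single unit
        # Pick the most frequent pair; ties are broken by the pair itself
        # so that the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_corpus = Counter()
        for units, freq in corpus.items():
            new_corpus[merge_pair(units, best)] += freq
        corpus = new_corpus
    return corpus, merges
```

For example, calling `build_units({"low": 5, "lower": 2, "newest": 3}, 2)` merges `o` + `w` and then `l` + `ow`, so the frequent word `low` becomes a single unit while the rarer `lower` is represented as `low`, `e`, `r`. This illustrates how frequent sequences can become whole units while rare or unseen words remain covered by smaller pieces.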
