
Method and system of adding punctuation and establishing language model using a punctuation weighting applied to Chinese speech recognized text


Abstract

A method of processing information content based on a Chinese language model is performed at a computer. The method includes: identifying a plurality of expressions in information content that was extracted from a speech input through speech recognition and is queued to be processed; dividing the expressions into a plurality of characteristic units according to semantic features and predetermined characteristics associated with each characteristic unit, where each characteristic unit includes a subset of the expressions and the predetermined characteristics include at least the integer number of expressions in that unit; extracting, from the Chinese language model, a plurality of probabilities for punctuation marks associated with each characteristic unit; and, in accordance with the probabilities, associating a respective punctuation mark with each characteristic unit included in the information content. The method further comprises adding punctuation marks based on a weight determined for each punctuation mark.
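The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the language model is stored, how semantic features drive the segmentation, or how the weights are combined with the probabilities. The sketch assumes a toy lookup table (`TOY_MODEL`) keyed by the last expression of each unit, fixed-size units in place of semantic segmentation, and a simple product of probability and weight. All names (`TOY_MODEL`, `split_into_units`, `pick_mark`, `punctuate`) are hypothetical.

```python
from typing import Dict, List

# Hypothetical toy "language model": probability of each punctuation mark
# following a characteristic unit, keyed by the unit's last expression.
# The empty string stands for "no punctuation".
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "你好": {",": 0.6, "。": 0.2, "": 0.2},
    "天气": {",": 0.1, "。": 0.2, "": 0.7},
    "很好": {",": 0.2, "。": 0.7, "": 0.1},
}


def split_into_units(expressions: List[str], unit_size: int) -> List[List[str]]:
    """Group expressions into units of a fixed integer size.

    Assumption: a fixed size stands in for the semantic segmentation
    described in the abstract, which is not specified there.
    """
    return [expressions[i:i + unit_size]
            for i in range(0, len(expressions), unit_size)]


def pick_mark(unit: List[str],
              model: Dict[str, Dict[str, float]],
              weights: Dict[str, float]) -> str:
    """Choose the mark with the highest probability * weight for this unit."""
    # Units the model does not know get no punctuation.
    probs = model.get(unit[-1], {"": 1.0})
    return max(probs, key=lambda mark: probs[mark] * weights.get(mark, 1.0))


def punctuate(expressions: List[str],
              unit_size: int,
              model: Dict[str, Dict[str, float]],
              weights: Dict[str, float]) -> str:
    """Attach the selected punctuation mark after each characteristic unit."""
    units = split_into_units(expressions, unit_size)
    return "".join("".join(unit) + pick_mark(unit, model, weights)
                   for unit in units)


if __name__ == "__main__":
    recognized = ["你好", "今天", "天气", "很好"]
    uniform = {",": 1.0, "。": 1.0, "": 1.0}
    print(punctuate(recognized, 1, TOY_MODEL, uniform))
```

Raising the weight of one mark changes which mark wins for a unit without touching the model probabilities, which is one plausible reading of "adding punctuation marks based on a weight determined for each punctuation mark".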
