APPARATUS FOR REFINING PARALLEL CORPUS AND METHOD THEREOF

Abstract

Disclosed are an apparatus for refining a parallel corpus and a method thereof. According to an embodiment of the present invention, the apparatus comprises: a preprocessing unit that identifies a source parallel corpus in a previously constructed parallel corpus database and converts the source parallel corpus into sub-word units; a sentence structure analysis unit that determines an outlier probability for the source parallel corpus converted into sub-word units; a semantic analysis unit that determines a semantic similarity for the source parallel corpus converted into sub-word units; and a refined parallel corpus determination unit that determines a refined parallel corpus by combining the outlier probability from the sentence structure analysis unit with the semantic similarity from the semantic analysis unit.

COPYRIGHT KIPO 2019
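
The four units described in the abstract can be pictured as a small filtering pipeline. The following is only a minimal sketch under stated assumptions, not the patented implementation: the abstract does not say how sub-words are produced, how the outlier probability is computed, or how the two scores are combined. Here character n-grams stand in for sub-word units, the outlier probability is derived from the source/target length ratio, semantic similarity is the cosine of precomputed sentence vectors, and the combination rule is a simple product with a threshold. All names (`to_subwords`, `outlier_probability`, `cosine_similarity`, `SentencePair`, `refine`) and all default parameters are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


def to_subwords(sentence: str, n: int = 3) -> List[str]:
    """Stand-in sub-word splitter (assumption): character n-grams per word."""
    units: List[str] = []
    for word in sentence.split():
        if len(word) <= n:
            units.append(word)
        else:
            units.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return units


def outlier_probability(src_units: List[str], tgt_units: List[str],
                        mean_ratio: float = 1.0, std_ratio: float = 0.5) -> float:
    """Hypothetical structural score: how unusual the target/source length ratio is.

    Returns the normal probability mass lying closer to the mean than the
    observed ratio, so values near 1.0 indicate a likely outlier.
    """
    if not src_units or not tgt_units:
        return 1.0  # an empty side is treated as a certain outlier
    ratio = len(tgt_units) / len(src_units)
    z = abs(ratio - mean_ratio) / std_ratio
    return math.erf(z / math.sqrt(2))


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


@dataclass
class SentencePair:
    source: str
    target: str
    # Sentence vectors are assumed to come from some cross-lingual encoder.
    source_vec: List[float]
    target_vec: List[float]


def refine(pairs: List[SentencePair], threshold: float = 0.5) -> List[SentencePair]:
    """Keep pairs whose combined score (assumed rule: product) meets the threshold."""
    kept = []
    for pair in pairs:
        p_out = outlier_probability(to_subwords(pair.source), to_subwords(pair.target))
        sim = max(0.0, cosine_similarity(pair.source_vec, pair.target_vec))
        if (1.0 - p_out) * sim >= threshold:
            kept.append(pair)
    return kept
```

In this sketch a pair is dropped either when its length ratio is very unusual or when its sentence vectors disagree; a real system would replace each stand-in with a trained component.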
