Journal: Computer Speech and Language

Prediction of Chinese word-formation patterns using the layer-weighted semantic graph-based KFP-MCO classifier



Abstract

Natural language processing plays a critical role in intelligent computing, pattern recognition, semantic analysis and machine intelligence. For Chinese information processing, constructing predictive models of different semantic word-formation patterns from a large-scale corpus can significantly improve the efficiency and accuracy of paraphrasing unregistered or new words, ambiguity elimination, automatic lexicography, machine translation and other applications. It is therefore necessary to find the relationship between word-formation patterns and the factors that influence them, which can be formulated as a classification problem. However, because of noise, anomalies, imprecision, polysemy, ambiguity, nonlinear structure and class imbalance in semantic word-formation data, the multi-criteria optimization classifier (MCOC), support vector machines (SVM) and other traditional classification approaches give poor predictive performance. In this paper, based on a characteristic analysis of Chinese word formation, we first propose a novel layered semantic graph for each disyllabic word, together with a layer-weighted graph edit distance (GED) and its similarity kernel, embedded into a new vector space. MCOC with kernel, fuzzification and penalty factors (KFP-MCOC) and SVM are then applied to the normalized data to predict Chinese semantic word-formation patterns. Our experimental results and the comparison with SVM show that KFP-MCOC based on the layer-weighted semantic graphs improves the separation of different patterns, the predictive accuracy on target patterns and the generalization of semantic pattern classification to new compound words.
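
The abstract does not give the definition of the layered graph, the layer weights or the kernel, so the following Python sketch only illustrates the general idea of a layer-weighted edit distance turned into a similarity kernel. Everything in it is an assumption made for illustration: the graph representation (one set of node labels and one set of edges per layer), the unit edit costs, the exponential kernel form, the `gamma` parameter, the prototype-based embedding and all function names. In particular, the distance below is a simplified stand-in that counts label mismatches per layer; it is not the paper's GED and does not search for an optimal node matching.

```python
import math

# Hypothetical representation: a layered graph is a list of layers,
# and each layer is a pair (set of node labels, set of labelled edges).

def layer_weighted_distance(g1, g2, weights):
    """Sum over layers of weight * (node mismatches + edge mismatches).

    Simplified stand-in for a graph edit distance: with unique node labels
    and unit insert/delete costs, the symmetric difference counts the edits.
    """
    total = 0.0
    for (nodes1, edges1), (nodes2, edges2), w in zip(g1, g2, weights):
        total += w * (len(nodes1 ^ nodes2) + len(edges1 ^ edges2))
    return total

def similarity_kernel(g1, g2, weights, gamma=0.5):
    """Assumed kernel form: exp(-gamma * distance); equals 1.0 for identical graphs."""
    return math.exp(-gamma * layer_weighted_distance(g1, g2, weights))

def embed(graph, prototypes, weights, gamma=0.5):
    """Map a graph to a vector of its similarities to some reference graphs."""
    return [similarity_kernel(graph, p, weights, gamma) for p in prototypes]

# Toy two-layer graphs with abstract placeholder labels (not real corpus data).
g_a = [({"m1", "m2"}, {("m1", "m2")}), ({"s1", "s2"}, {("s1", "s2")})]
g_b = [({"m1", "m3"}, {("m1", "m3")}), ({"s1", "s2"}, {("s1", "s2")})]
layer_weights = [1.0, 2.0]  # assumed: the second layer counts twice as much

distance_ab = layer_weighted_distance(g_a, g_b, layer_weights)
vector_a = embed(g_a, [g_a, g_b], layer_weights)
print(distance_ab, vector_a)
```

Vectors produced this way could then be normalized and passed to any kernel or vector-space classifier; the KFP-MCOC classifier itself is not sketched here because the abstract does not describe its formulation.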
