Conference: Consumer Electronics, Communications and Networks (CECNet), 2012 2nd International Conference on

A Chinese unsupervised word sense disambiguation method based on semantic vector

Abstract

Supervised machine-learning methods for word sense disambiguation require the words of the training corpus to be annotated. To overcome data sparseness and achieve good disambiguation performance, a large-scale annotated corpus must be built, but obtaining one carries a high manual-labor cost. To address this problem, this paper proposes an unsupervised learning method that requires no manual annotation. First, feature words are mined using PMI (pointwise mutual information) and the Z test, and v words are defined to describe each sense of a polysemous word. Then the similarity between these sense words and the features of the polysemous word in its context is calculated to determine the correct sense. The method is applied to ten typical polysemous words, and the experimental results show that it is effective.
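
The abstract names the ingredients (PMI, a Z test, v descriptor words per sense, and a similarity comparison against the context) but gives no formulas or parameters. The following is a minimal sketch of how such a pipeline might look, not the paper's implementation. Everything specific here is an assumption made for illustration: the form of the z statistic, the use of a seed word per sense, the `min_z` threshold, cosine similarity over bag-of-words counts, and all function names.

```python
import math
from collections import Counter


def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information: log2(P(x, y) / (P(x) * P(y)))."""
    return math.log2((pair_count * total) / (x_count * y_count))


def z_score(pair_count, x_count, y_count, total):
    """Assumed z statistic: (observed - expected) / sqrt(expected)."""
    expected = x_count * y_count / total
    return (pair_count - expected) / math.sqrt(expected)


def mine_feature_words(seed, windows, v=2, min_z=1.0):
    """Return up to v words with the highest PMI with `seed`,
    keeping only words whose z score reaches `min_z`.

    `seed` is a hypothetical word standing in for one sense;
    `windows` is a list of token lists (context windows)."""
    # Count each word once per window (window-level frequency).
    word_counts = Counter(w for win in windows for w in set(win))
    # Count windows in which a word co-occurs with the seed.
    pair_counts = Counter(
        w for win in windows if seed in win for w in set(win) if w != seed
    )
    total = len(windows)
    scored = []
    for w, c in pair_counts.items():
        if z_score(c, word_counts[seed], word_counts[w], total) >= min_z:
            scored.append((pmi(c, word_counts[seed], word_counts[w], total), w))
    return [w for _, w in sorted(scored, reverse=True)[:v]]


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    a, b = Counter(a), Counter(b)
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(x * x for x in a.values())) * math.sqrt(
        sum(x * x for x in b.values())
    )
    return dot / norm if norm else 0.0


def choose_sense(context, sense_words):
    """Pick the sense whose descriptor words are most similar to the context."""
    return max(sense_words, key=lambda s: cosine(context, sense_words[s]))
```

With descriptor words in hand for each sense, `choose_sense(["river", "water", "flows"], {"finance": ["money", "loan"], "river": ["water", "shore"]})` selects `"river"`, because only that sense shares a word with the context.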
