Journal: 《计算机应用与软件》 (Computer Applications and Software)

Generic-Type Attribute Information Extraction from Machine-Readable Dictionaries

Abstract

To support building entity-relationship networks and to improve concept-based information retrieval, this paper proposes a method for extracting attribute-value information about concept instances from a machine-readable dictionary without targeting any specific attribute type. First, an initial set of entity-attribute value pairs is produced through manual annotation and selection, and a rough set of pattern instances is extracted from it. Second, the pattern instances are clustered, merged, and expanded into several groups, each of which represents one attribute type. Finally, the attribute-value information of new entity words is extracted from the dictionary. Synonym expansion and lexical semantic similarity computation are introduced when processing the pattern instance set in order to raise the coverage of the pattern instances. In the experiment, extraction was carried out on electronics-domain vocabulary from the Standard Dictionary of Modern Chinese, and the results were good.
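The three stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English definitions, the function names `induce_pattern_groups` and `extract`, and the two-token context window are all assumptions made for illustration. In particular, grouping patterns by identical left context is only a crude stand-in for the clustering, synonym expansion, and semantic similarity computation that the abstract mentions.

```python
import re
from collections import defaultdict

# Toy dictionary entries (hypothetical; the paper works on a Chinese dictionary).
definitions = {
    "diode": "a component made of silicon that conducts current in one direction",
    "resistor": "a component made of carbon that limits current",
    "capacitor": "a component used for storing charge",
    "transistor": "a device made of germanium that amplifies signals",
    "relay": "a switch used for isolating circuits",
}

# Hand-labelled seed (entity, attribute value) pairs, as in the first stage.
seeds = [("diode", "silicon"), ("resistor", "carbon"), ("capacitor", "storing charge")]


def induce_pattern_groups(definitions, seeds, window=2):
    """Derive rough patterns from the seeds and group them.

    A pattern is the text around the value: up to `window` tokens on the
    left and one token on the right. Patterns sharing the same left
    context are placed in one group (assumed stand-in for clustering).
    """
    groups = defaultdict(set)
    for entity, value in seeds:
        text = definitions.get(entity, "")
        pos = text.find(value)
        if pos < 0:
            continue  # seed value does not occur in the definition
        left = " ".join(text[:pos].split()[-window:])
        right = " ".join(text[pos + len(value):].split()[:1])
        if left:
            groups[left].add(right)
    return dict(groups)


def extract(definitions, groups, entity):
    """Apply every pattern group to one entry and return the matched values."""
    text = definitions[entity]
    found = {}
    for left, rights in groups.items():
        for right in rights:
            # An empty right context means the value ran to the end of the entry.
            tail = r"\s+" + re.escape(right) + r"\b" if right else r"$"
            match = re.search(r"\b" + re.escape(left) + r"\s+(.+?)" + tail, text)
            if match:
                found[left] = match.group(1)
                break
    return found


groups = induce_pattern_groups(definitions, seeds)
for word in ("transistor", "relay"):
    print(word, extract(definitions, groups, word))
```

Here each group key ("made of", "used for") plays the role of one attribute type. A faithful version would replace the exact-match grouping with similarity-based merging so that differently worded patterns for the same attribute fall into one group.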
