Conference: 2016 IEEE International Conference on Knowledge Engineering and Applications

Knowledge extraction through etymological networks: Synonym discovery in Sino-Korean words



Abstract

Extracting knowledge from text is a very active area of research. Techniques such as word embeddings and latent semantic analysis (LSA) have brought major breakthroughs and are used in applications such as automatic translation. We propose a novel approach to knowledge extraction that uses a graph to express the complex etymological structures formed by the historical roots of words. The approach is especially well suited to the study of Sino-Korean vocabulary, where the etymological roots of words are clearly visible in their writing. We build a bipartite graph based on the Chinese etymological roots of Sino-Korean words, then use the network structure to extract features describing pairs of nodes. We use these features in a classification scheme to discover pairs of nodes that represent synonym characters. Our model is simpler than previous work on synonym discovery with Chinese characters and obtains good results. The code and data for our work are openly available.
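
The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word list, the function names `build_bipartite` and `pair_features`, and the specific features (node degree, shared co-occurring characters, Jaccard overlap) are assumptions chosen for illustration, since the abstract does not say which features are used. Words form one side of the bipartite graph and their Hanja root characters the other.

```python
from collections import defaultdict

# Toy data (illustrative only): Sino-Korean words and their Hanja spelling.
WORDS = {
    "학교": "學校",
    "학생": "學生",
    "선생": "先生",
    "생일": "生日",
    "일기": "日記",
}


def build_bipartite(words):
    """Build both adjacency maps of the word/character bipartite graph."""
    word_to_chars = {word: set(hanja) for word, hanja in words.items()}
    char_to_words = defaultdict(set)
    for word, chars in word_to_chars.items():
        for char in chars:
            char_to_words[char].add(word)
    return word_to_chars, dict(char_to_words)


def pair_features(a, b, word_to_chars, char_to_words):
    """Hypothetical structural features for a pair of character nodes."""

    def partners(char):
        # Characters that appear in at least one word together with `char`.
        found = set()
        for word in char_to_words[char]:
            found |= word_to_chars[word]
        return found - {char}

    partners_a, partners_b = partners(a), partners(b)
    shared = partners_a & partners_b
    union = partners_a | partners_b
    return {
        "degree_a": len(char_to_words[a]),  # number of words containing a
        "degree_b": len(char_to_words[b]),  # number of words containing b
        "shared_partners": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


word_to_chars, char_to_words = build_bipartite(WORDS)
features = pair_features("學", "先", word_to_chars, char_to_words)
```

Feature dictionaries like this one, computed for many character pairs, could then be passed to any standard classifier together with synonym labels; the choice of classifier is likewise left open by the abstract.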
