Conference: International Conference on Asian Language Processing

Embedding Wikipedia title based on its Wikipedia text and categories



Abstract

Distributed word representations are widely used in many NLP tasks, and knowledge-based resources also provide valuable information. Compared with conventional knowledge bases, Wikipedia provides semi-structured data other than structured data. We argue that a Wikipedia title's categories can complement the meaning conveyed by its Wikipedia text, so the categories should be used to improve the title's embedding. We propose two directions for using categories, in combination with conventional context-based approaches, to generate embeddings of Wikipedia titles. We conduct extensive large-scale experiments on the title embeddings generated from Chinese Wikipedia. Experiments on word similarity and analogical reasoning tasks show that our approaches significantly outperform conventional context-based approaches.
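The abstract names two evaluation tasks, word similarity and analogical reasoning, without saying how they are scored. As a rough illustration only, the sketch below uses the scoring that is common for such embeddings: cosine similarity for word similarity, and the vector-offset method for analogies. The function names, the toy titles, and the hand-made 3-dimensional vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def solve_analogy(emb, a, b, c):
    """Answer "a is to b as c is to ?" with the vector-offset method.

    Returns the title whose embedding is closest (by cosine) to
    emb[b] - emb[a] + emb[c], excluding the three query titles.
    """
    target = emb[b] - emb[a] + emb[c]
    candidates = [t for t in emb if t not in (a, b, c)]
    return max(candidates, key=lambda t: cosine(emb[t], target))


# Toy embeddings (hypothetical values chosen by hand for illustration).
emb = {
    "France": np.array([1.0, 0.0, 0.0]),
    "Paris": np.array([1.0, 0.0, 1.0]),
    "Japan": np.array([0.0, 1.0, 0.0]),
    "Tokyo": np.array([0.0, 1.0, 1.0]),
    "Germany": np.array([0.7, 0.7, 0.0]),
}

# Word similarity: score a pair of titles by cosine similarity.
similarity = cosine(emb["Paris"], emb["Tokyo"])

# Analogical reasoning: France is to Paris as Japan is to ?
answer = solve_analogy(emb, "France", "Paris", "Japan")
```

In a real evaluation, the similarity scores would typically be compared against human ratings, and analogy accuracy would be computed over a full question set; the abstract gives no detail on either.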
