Published in: Advanced language technologies for digital libraries

Automatic Gazetteer Generation from Wikipedia

Abstract

A high-quality named-entity gazetteer is crucial for a cross-language information retrieval (CLIR) system that provides multilingual access to digital resources, particularly in the domain of digital libraries. In this paper we investigate an unsupervised approach for automatically extracting this kind of resource from Wikipedia: it leverages the DBpedia classification of the English articles to induce the same classification onto encyclopedia pages written in other languages. By exploiting the structured information present in Wikipedia, we further aim to enrich the standard gazetteer with translations into other languages and with alternative spellings of the entities.
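
The idea of inducing the English classification onto pages in other languages can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes three hypothetical inputs that would have to be prepared beforehand: `en_types` (English article title to a DBpedia-style class), `langlinks` (English title to the titles of the linked articles in other languages), and `alt_spellings` (alternative names per language and title; where these come from is an assumption, since the abstract does not say). The function name `build_gazetteer` is likewise illustrative.

```python
from collections import defaultdict


def build_gazetteer(en_types, langlinks, alt_spellings=None):
    """Propagate entity classes from English articles to other languages.

    en_types:      {english_title: entity_class}            (hypothetical input)
    langlinks:     {english_title: {lang_code: title}}      (hypothetical input)
    alt_spellings: {(lang_code, title): [variant, ...]}     (hypothetical input)
    Returns {lang_code: {name: entity_class}}.
    """
    alt_spellings = alt_spellings or {}
    gazetteer = defaultdict(dict)

    for en_title, entity_class in en_types.items():
        # The English title plus every linked title in another language.
        names = {"en": en_title, **langlinks.get(en_title, {})}
        for lang, title in names.items():
            # The linked page inherits the class of the English article.
            gazetteer[lang][title] = entity_class
            # Alternative spellings share the class, but never override
            # an entry that was added from an article title.
            for variant in alt_spellings.get((lang, title), []):
                gazetteer[lang].setdefault(variant, entity_class)

    return {lang: dict(entries) for lang, entries in gazetteer.items()}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    en_types = {"Rome": "Place", "Dante Alighieri": "Person"}
    langlinks = {
        "Rome": {"it": "Roma", "de": "Rom"},
        "Dante Alighieri": {"it": "Dante Alighieri"},
    }
    alt_spellings = {("en", "Dante Alighieri"): ["Dante"]}
    for lang, entries in build_gazetteer(en_types, langlinks, alt_spellings).items():
        print(lang, entries)
```

No training data is needed in this sketch: the class labels come only from the English side and are copied across the links, which is one plausible reading of "unsupervised" in the abstract.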