Conference: 9th International Conference on Language Resources and Evaluation

Bootstrapping Term Extractors for Multiple Languages

Abstract

Terminology extraction resources are needed for a wide range of human language technology applications, including knowledge management, information extraction, semantic search, cross-language information retrieval, and automatic and assisted translation. We report a low-cost method for creating terminology extraction resources for 21 non-English EU languages. Using parallel corpora and a projection method, we create a general part-of-speech (POS) tagger for these languages. We also investigate the use of EuroVoc terms and Wikipedia to automatically create a term grammar for each language. Our results show that these automatically generated resources can assist the term extraction process, achieving performance similar to that of manually generated resources. All POS tagger and term grammar resources resulting from this work are freely available for download.
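
The abstract does not spell out the projection method, so the following is only a minimal sketch of one common form of annotation projection, not the authors' implementation. It assumes word-aligned sentence pairs are already available, copies each English POS tag onto the aligned target-language token, and keeps the most frequent tag per token as a simple lexicon. The function names, the tag labels, and the toy French data are all hypothetical.

```python
from collections import Counter, defaultdict


def project_tags(src_tags, alignment, tgt_len):
    """Copy source POS tags onto aligned target positions.

    alignment is a list of (source_index, target_index) pairs.
    Unaligned target positions stay None; the first link to a
    target position wins (an arbitrary choice for this sketch).
    """
    projected = [None] * tgt_len
    for src_i, tgt_j in alignment:
        if projected[tgt_j] is None:
            projected[tgt_j] = src_tags[src_i]
    return projected


def build_tag_lexicon(sentence_pairs):
    """Map each target token to its most frequently projected tag."""
    counts = defaultdict(Counter)
    for src_tags, tgt_tokens, alignment in sentence_pairs:
        projected = project_tags(src_tags, alignment, len(tgt_tokens))
        for token, tag in zip(tgt_tokens, projected):
            if tag is not None:
                counts[token.lower()][tag] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


# Hypothetical toy data: English tags, French tokens, word alignment.
pairs = [
    # "the agricultural policy" -> "la politique agricole"
    (["DET", "ADJ", "NOUN"], ["la", "politique", "agricole"],
     [(0, 0), (1, 2), (2, 1)]),
    # "policy changes" -> "politique change"
    (["NOUN", "VERB"], ["politique", "change"], [(0, 0), (1, 1)]),
]

lexicon = build_tag_lexicon(pairs)
print(lexicon)
```

A lexicon built this way could seed a tagger for the target language; a real system would also need to handle alignment noise, many-to-one links, and tokens never seen in the parallel data.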
