Source: The Electronic Library

Bilingual lexical extraction based on word alignment for improving corpus search


Abstract

Purpose: This paper describes the structure of an aligned Serbian-German literary corpus (SrpNemKor) contained in the digital library Biblisa. The goal of the research was to create a benchmark Serbian-German annotated corpus searchable with various query expansions.

Design/methodology/approach: The research focuses on enhancing bilingual search queries in a full-text search of the aligned SrpNemKor collection. The enhancement relies on existing lexical resources, such as Serbian morphological electronic dictionaries and the bilingual lexical database Termi.

Findings: For the purpose of this research, the lexical database Termi was enriched with a bilingual list of German-Serbian translation pairs of lexical units. The list of correct translation pairs was extracted from SrpNemKor, evaluated and integrated into Termi. In addition, the Serbian morphological e-dictionaries were updated with new entries extracted from the Serbian part of the corpus.

Originality/value: A bilingual search of SrpNemKor in Biblisa is available within a user-friendly platform. The enriched database Termi enables semantic enhancement and refinement of a user's search query, at a very high level, based on synonyms in both Serbian and German. Serbian morphological e-dictionaries facilitate the morphological expansion of search queries in Serbian. This enables the analysis of concepts and concept structures by identifying the terms assigned to a concept and by establishing relations between terms in Serbian and German, which makes Biblisa a valuable Web tool that can support research and analysis of SrpNemKor.
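The two kinds of query expansion mentioned in the abstract (morphological expansion in Serbian, and synonym/translation expansion through a bilingual lexical database) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `MORPH_DICT`, `BILINGUAL_DB` and `expand_query` are hypothetical, the data is a toy sample that is not taken from Termi or from the Serbian e-dictionaries, and the logic is not the method actually implemented in Biblisa.

```python
# Toy resources, invented for illustration only (not real Termi or e-dictionary data).
# MORPH_DICT maps a Serbian lemma to some of its inflected forms.
MORPH_DICT = {
    "kuća": ["kuća", "kuće", "kući", "kuću", "kućom", "kućama"],
    "dom": ["dom", "doma", "domu", "domom", "domovi"],
}

# BILINGUAL_DB groups Serbian and German terms that express one concept.
BILINGUAL_DB = [
    {"sr": ["kuća", "dom"], "de": ["Haus", "Heim"]},
]


def expand_query(lemma, lang="sr"):
    """Return (serbian_forms, german_terms) for a single-lemma query.

    Step 1 (semantic): collect synonyms and translations from every
    concept entry that contains the lemma in the query language.
    Step 2 (morphological): replace each Serbian lemma by its inflected forms.
    """
    sr_lemmas, de_terms = [], []
    for entry in BILINGUAL_DB:
        if lemma in entry[lang]:
            sr_lemmas.extend(entry["sr"])
            de_terms.extend(entry["de"])

    # Fall back to the bare query term when no concept entry matches.
    if not sr_lemmas and not de_terms:
        if lang == "sr":
            sr_lemmas = [lemma]
        else:
            de_terms = [lemma]

    sr_forms = []
    for sr_lemma in sr_lemmas:
        # Unknown lemmas are kept unchanged rather than dropped.
        for form in MORPH_DICT.get(sr_lemma, [sr_lemma]):
            if form not in sr_forms:
                sr_forms.append(form)
    return sr_forms, de_terms
```

Under these assumptions, a query for the Serbian lemma `kuća` would be matched against all listed forms of `kuća` and `dom` on the Serbian side of the aligned corpus and against `Haus` and `Heim` on the German side; a German query such as `Haus` expands the same way through the shared concept entry.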
