
Estimating the Quality of Crowdsourced Translations Based on the Characteristics of Source and Target Words and Participants

Abstract

Text-based media hold a wealth of insights that can be mined to understand perceptions and actions. Researchers and public officials can use these data to inform development policy and humanitarian action. An important step in analyzing text-based databases, such as social media, is the creation of taxonomies, which are used to filter information relevant to topics of interest. We worked with thousands of online volunteers to translate 2,137 English keywords or phrases into formal or vernacular expressions in 29 different languages, with the aim of understanding human responses to natural disasters and of developing corpora for less popular languages (non-English and non-EU languages), which have so far received limited study. In processing the data set, we faced the challenge of selecting a set of quality translations for each language. This paper aims to estimate the quality of crowdsourced translations produced by non-professional translators. It presents an extensive empirical study using 91 features from corpora in 29 languages to describe (a) translators, (b) source expressions, and (c) translated expressions. Our results show that our approach, which explores two regression models and two supervised learning methods, produces better results than a baseline approach that relies on a commonly used metric, namely peer-review scores.
