Conference: International Conference on Intelligent Text Processing and Computational Linguistics

Lost in Translation: Viability of Machine Translation for Cross Language Sentiment Analysis

Abstract

Recently there has been considerable interest in Cross Language Sentiment Analysis (CLSA), which uses Machine Translation (MT) to facilitate Sentiment Analysis in resource-deprived languages. The idea is to use the annotated resources of one language (say, L_1) to perform Sentiment Analysis in another language (say, L_2) that has no annotated resources. The success of such a scheme depends crucially on the availability of an MT system between L_1 and L_2. We argue that this strategy ignores the fact that a Machine Translation system is far more demanding in terms of resources than a Sentiment Analysis engine. Moreover, these approaches fail to account for the divergence in how sentiment is expressed across languages. We provide strong experimental evidence that even the best such systems do not outperform a system trained on only a few polarity-annotated documents in the target language. Having a very large number of documents in L_1 does not help either, because most Machine Learning approaches converge (or reach a plateau) beyond a certain training size, as our results demonstrate. Based on our study, we take the stand that languages with a genuine need for a Sentiment Analysis engine should focus on collecting a few polarity-annotated documents in their own language instead of relying on CLSA.