Natural language inference for Malayalam language using language agnostic sentence representation


Abstract

Natural language inference (NLI) is an essential subtask in many natural language processing (NLP) applications. It is a directional relationship from premise to hypothesis: a pair of texts is defined as entailed if the meaning of one text can be inferred from the other. NLI is also known as textual entailment recognition, and it is used to recognize entailed and contradictory sentences in NLP systems such as question answering, summarization, and information retrieval. This paper describes NLI for Malayalam, a low-resource Indian language and the regional language of Kerala, spoken by more than 30 million people. The paper presents the Malayalam NLI dataset, named MaNLI, and applies NLI to Malayalam using different models, namely Doc2Vec (paragraph vector), fastText, BERT (Bidirectional Encoder Representations from Transformers), and LASER (Language Agnostic Sentence Representation). Our work attempts NLI in two ways, as binary classification and as multiclass classification. In both settings, LASER outperformed the other techniques. For multiclass classification, NLI using LASER-based sentence embeddings outperformed the other techniques by a significant margin of 12% in accuracy. For binary classification, the LASER-based NLI system improved accuracy by 9% over the other techniques.
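
The abstract frames NLI as classification over sentence embeddings of a premise and a hypothesis. The following is a minimal sketch of that idea, not the authors' pipeline. It assumes random toy vectors in place of real LASER embeddings, an assumed feature layout `[u, v, |u - v|, u * v]` that the abstract does not specify, and a plain softmax classifier chosen only for illustration. The names `pair_features`, `train_softmax`, and `predict` are hypothetical, and the class ids are placeholders because the abstract does not list the class inventory.

```python
import math
import random

def pair_features(u, v):
    """Combine a premise embedding u and a hypothesis embedding v.

    Assumed layout: [u, v, |u - v|, u * v]. Keeping u before v preserves
    the premise-to-hypothesis direction of the task.
    """
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return list(u) + list(v) + diff + prod

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(x, W, b):
    """Linear score for each class."""
    return [sum(w * xi for w, xi in zip(W[c], x)) + b[c] for c in range(len(W))]

def train_softmax(X, y, n_classes, lr=0.1, epochs=100):
    """Multinomial logistic regression with per-example gradient steps."""
    d = len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(X, y):
            probs = softmax(class_scores(x, W, b))
            for c in range(n_classes):
                g = probs[c] - (1.0 if c == label else 0.0)
                for j in range(d):
                    W[c][j] -= lr * g * x[j]
                b[c] -= lr * g
    return W, b

def predict(x, W, b):
    """Return the id of the highest-scoring class."""
    scores = class_scores(x, W, b)
    return max(range(len(scores)), key=scores.__getitem__)

# Toy stand-ins for sentence embeddings (assumption: real vectors would
# come from a sentence encoder such as LASER).
rng = random.Random(0)
dim = 4
premises = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
hypotheses = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
X = [pair_features(u, v) for u, v in zip(premises, hypotheses)]

labels_3way = [0, 1, 2, 0, 1, 2]  # placeholder ids for the multiclass setting
labels_2way = [0, 1, 0, 1, 0, 1]  # placeholder ids for the binary setting

W3, b3 = train_softmax(X, labels_3way, n_classes=3)
W2, b2 = train_softmax(X, labels_2way, n_classes=2)
preds_3way = [predict(x, W3, b3) for x in X]
preds_2way = [predict(x, W2, b2) for x in X]
```

The same feature builder and training routine serve both settings; only the number of classes changes, which mirrors the binary and multiclass framing in the abstract.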

Bibliographic details

  • Authors

    Sara Renjit; Sumam Idicula

  • Affiliation
  • Year: 2021
  • Total pages
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
