International Conference on Computational Linguistics

The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks



Abstract

Contextual embeddings derived from transformer-based neural language models have shown state-of-the-art performance for various tasks such as question answering, sentiment analysis, and textual similarity in recent years. Extensive work shows how accurately such models can represent abstract, semantic information present in text. In this expository work, we explore a tangent direction and analyze such models' performance on tasks that require a more granular level of representation. We focus on the problem of textual similarity from two perspectives: matching documents on a granular level (requiring embeddings to capture fine-grained attributes in the text), and an abstract level (requiring embeddings to capture overall textual semantics). We empirically demonstrate, across two datasets from different domains, that despite high performance in abstract document matching as expected, contextual embeddings are consistently (and at times, vastly) outperformed by simple baselines like TF-IDF for more granular tasks. We then propose a simple but effective method to incorporate TF-IDF into models that use contextual embeddings, achieving relative improvements of up to 36% on granular tasks.
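The abstract says TF-IDF is incorporated into models that use contextual embeddings, but does not spell out the mechanism. Below is a minimal Python sketch of one plausible fusion: score document pairs with both a TF-IDF cosine similarity (sparse and lexical, so sensitive to fine-grained attributes) and a transformer-embedding cosine similarity (dense and semantic), then interpolate the two. The encoder choice, the example documents, and the mixing weight alpha are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

docs = [
    "Red cotton t-shirt with short sleeves",   # differs from the next doc only
    "Blue cotton t-shirt with short sleeves",  # in one fine-grained attribute
    "An overview of neural text encoders",
]

# Sparse lexical view: TF-IDF keeps rare, attribute-bearing tokens distinct.
sim_tfidf = cosine_similarity(TfidfVectorizer().fit_transform(docs))

# Dense semantic view: contextual embeddings capture overall meaning.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice
emb = encoder.encode(docs, normalize_embeddings=True)
sim_dense = cosine_similarity(emb)

# Hypothetical fusion: linear interpolation of the two similarity matrices.
alpha = 0.5  # assumed weight; in practice tuned on a validation set
sim_combined = alpha * sim_dense + (1 - alpha) * sim_tfidf
print(np.round(sim_combined, 3))

Under this sketch, lowering alpha shifts weight toward the lexical signal, which is the behavior the reported results suggest helps on granular matching; a higher alpha favors the dense semantic signal for abstract matching.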
