Published in: IEEE Automatic Speech Recognition and Understanding Workshop

Enhanced Bert-Based Ranking Models for Spoken Document Retrieval


Abstract

The Bidirectional Encoder Representations from Transformers (BERT) model has recently achieved record-breaking success on many natural language processing (NLP) tasks, such as question answering and language understanding. However, relatively little work has applied it to ad-hoc information retrieval (IR), especially spoken document retrieval (SDR). This paper adopts and extends BERT for SDR, and its contributions are at least threefold. First, we augment BERT with extra language features, such as unigram and inverse document frequency (IDF) statistics, to make it more applicable to SDR. Second, we explore incorporating confidence scores into document representations to see whether they help alleviate the negative effects of imperfect automatic speech recognition (ASR). Third, we conduct a comprehensive set of experiments to compare our BERT-based ranking methods with other state-of-the-art methods and to investigate their synergistic effects.
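The abstract does not give implementation details, but the second contribution — softening a spoken document's term statistics with ASR confidence scores before applying IDF weighting — can be sketched in a few lines. The following is a minimal illustration, not the paper's method; all function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def idf_weights(docs):
    """IDF over a tokenized corpus: log(N / df(t)) for each term t.
    (Hypothetical helper; the paper's exact IDF variant is not specified.)"""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    return {t: math.log(n / df[t]) for t in df}

def confidence_weighted_rep(tokens, confidences, idf):
    """Soft term-frequency vector for an ASR transcript: each recognized
    token contributes its confidence score instead of a hard count of 1,
    so low-confidence (likely misrecognized) tokens are down-weighted.
    The soft counts are then scaled by IDF."""
    soft_tf = Counter()
    for tok, conf in zip(tokens, confidences):
        soft_tf[tok] += conf
    return {t: c * idf.get(t, 0.0) for t, c in soft_tf.items()}

# Toy usage: two tiny "documents" and one ASR hypothesis with confidences.
corpus = [["speech", "retrieval"], ["speech", "ranking"]]
idf = idf_weights(corpus)
rep = confidence_weighted_rep(
    ["speech", "ranking", "ranking"], [0.9, 0.8, 0.5], idf
)
```

In this sketch a term appearing in every document gets IDF 0 and thus drops out of the representation, while a rarer term's weight reflects both its IDF and the recognizer's confidence in it.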
