Published in: IEEE Automatic Speech Recognition and Understanding Workshop

Enhanced Bert-Based Ranking Models for Spoken Document Retrieval


Abstract

The Bidirectional Encoder Representations from Transformers (BERT) model has recently achieved record-breaking success on many natural language processing (NLP) tasks, such as question answering and language understanding. However, relatively little work has applied it to ad-hoc information retrieval (IR), especially spoken document retrieval (SDR). This paper adopts and extends BERT for SDR, and its contributions are at least threefold. First, we augment BERT with extra language features, such as unigram and inverse document frequency (IDF) statistics, to make it more applicable to SDR. Second, we explore incorporating confidence scores into document representations to see whether they help alleviate the negative effects of imperfect automatic speech recognition (ASR). Third, we conduct a comprehensive set of experiments to compare our BERT-based ranking methods with other state-of-the-art methods and to investigate their synergistic effects.
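The abstract does not give implementation details, but the second contribution — softening a spoken document's term statistics with ASR confidence scores before applying IDF weighting — can be sketched in a few lines. The following is a minimal illustration, not the paper's method; all function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def idf_weights(docs):
    """IDF over a tokenized corpus: log(N / df(t)) for each term t.
    (Hypothetical helper; the paper's exact IDF variant is not specified.)"""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    return {t: math.log(n / df[t]) for t in df}

def confidence_weighted_rep(tokens, confidences, idf):
    """Soft term-frequency vector for an ASR transcript: each recognized
    token contributes its confidence score instead of a hard count of 1,
    so low-confidence (likely misrecognized) tokens are down-weighted.
    The soft counts are then scaled by IDF."""
    soft_tf = Counter()
    for tok, conf in zip(tokens, confidences):
        soft_tf[tok] += conf
    return {t: c * idf.get(t, 0.0) for t, c in soft_tf.items()}

# Toy usage: two tiny "documents" and one ASR hypothesis with confidences.
corpus = [["speech", "retrieval"], ["speech", "ranking"]]
idf = idf_weights(corpus)
rep = confidence_weighted_rep(
    ["speech", "ranking", "ranking"], [0.9, 0.8, 0.5], idf
)
```

In this sketch a term appearing in every document gets IDF 0 and thus drops out of the representation, while a rarer term's weight reflects both its IDF and the recognizer's confidence in it.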
