Conference: International Conference on Text, Speech and Dialogue

Semantic Role Labeling of Speech Transcripts Without Sentence Boundaries


Abstract

Speech data is an extremely rich and important source of information. However, we lack suitable methods for the semantic annotation of speech data. For instance, semantic role labeling (SRL) of speech that has been transcribed by an automatic speech recognition (ASR) system is still an unsolved problem. SRL of ASR data is difficult because the transcripts lack sentence boundaries and punctuation and contain grammar errors, wrongly transcribed words, and word deletions and insertions. In this paper we propose a novel approach to SRL of ASR data based on the following ideas: (1) train the SRL system on data segmented into frames, where each frame consists of a predicate and its semantic roles, without considering sentence boundaries; (2) label the data with the semantics of PropBank roles; and, to support both steps, (3) train a part-of-speech (POS) tagger to work on noisy and error-prone ASR data. Experiments with the OntoNotes corpus show improvements over state-of-the-art SRL applied to ASR data.
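The frame segmentation in idea (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Frame` class, the `build_frame` helper, the input format (a flat token list plus inclusive role spans per predicate), the BIO label encoding, and the example transcript are all assumptions made for illustration. Only the notion of a frame as a predicate together with its PropBank roles, cut out without reference to sentence boundaries, comes from the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    start: int          # index of the first token in the frame (inclusive)
    end: int            # index of the last token in the frame (inclusive)
    tokens: List[str]
    labels: List[str]   # one BIO role label per token


def build_frame(tokens: List[str],
                predicate_index: int,
                roles: List[Tuple[str, int, int]]) -> Frame:
    """Cut the smallest window covering a predicate and its role spans.

    `roles` holds (role, start, end) triples with inclusive token indices
    into the unsegmented stream. This is a hypothetical input format.
    """
    start = min([predicate_index] + [s for _, s, _ in roles])
    end = max([predicate_index] + [e for _, _, e in roles])
    labels = ["O"] * (end - start + 1)
    labels[predicate_index - start] = "B-V"
    for role, s, e in roles:
        labels[s - start] = f"B-{role}"
        for i in range(s + 1, e + 1):
            labels[i - start] = f"I-{role}"
    return Frame(start, end, tokens[start:end + 1], labels)


# An unpunctuated stream covering two utterances, as ASR output might look.
stream = ("yesterday the committee approved the budget "
          "it will fund new schools").split()

# Hand-written PropBank-style annotations for the predicates
# "approved" (index 3) and "fund" (index 8).
frames = [
    build_frame(stream, 3, [("ARGM-TMP", 0, 0), ("ARG0", 1, 2), ("ARG1", 4, 5)]),
    build_frame(stream, 8, [("ARG0", 6, 6), ("ARGM-MOD", 7, 7), ("ARG1", 9, 10)]),
]

for frame in frames:
    print(list(zip(frame.tokens, frame.labels)))
```

Each frame is defined only by where its predicate and arguments fall in the token stream, so training examples of this kind can be built from a transcript that has no sentence segmentation at all.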
