Bridging the gap between speech technology and natural language processing: an evaluation toolbox for term discovery systems

Published in: 9th International Conference on Language Resources and Evaluation


Abstract

The unsupervised discovery of linguistic terms, from either continuous phoneme transcriptions or raw speech, has attracted increasing interest in recent years from both a theoretical and a practical standpoint. Yet there is no commonly accepted evaluation method for systems that perform term discovery. Here, we propose such an evaluation toolbox, drawing ideas from both speech technology and natural language processing. We first transform the speech-based output into a symbolic representation and then compute five types of evaluation metrics on this representation: the quality of acoustic matching, the quality of the clusters found, and the quality of the alignment with real words (type, token, and boundary scores). We tested our approach on two term discovery systems that take speech as input and one that uses symbolic input. The latter was run on both the gold transcription and a transcription obtained from an automatic speech recognizer, in order to simulate the case where only imperfect symbolic information is available. The results are analysed using the proposed evaluation metrics, and the implications of these metrics are discussed.
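
As a rough illustration of what a boundary score measures, the sketch below compares the word boundaries of a discovered segmentation with those of a gold segmentation and reports precision, recall, and F-score. It is not the toolbox's implementation: the function names `boundaries` and `boundary_scores` are hypothetical, the input is assumed to be already in symbolic form (lists of strings), and counting utterance edges as boundaries is an assumption made here for simplicity.

```python
def boundaries(words):
    """Return the set of character offsets where a word starts or ends.

    Assumption: utterance edges (offset 0 and the final offset) are
    counted as boundaries; a real evaluation may exclude them.
    """
    positions, offset = {0}, 0
    for word in words:
        offset += len(word)
        positions.add(offset)
    return positions


def boundary_scores(gold_words, found_words):
    """Precision, recall, and F-score of found boundaries against gold."""
    gold = boundaries(gold_words)
    found = boundaries(found_words)
    hits = len(gold & found)  # boundaries placed at a gold position
    precision = hits / len(found) if found else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Illustrative data only: the system merged "the" and "dog" into one term.
gold = ["the", "dog", "runs"]
found = ["thedog", "runs"]
precision, recall, f_score = boundary_scores(gold, found)
print(precision, recall, round(f_score, 3))
```

In this toy case every boundary the system proposes is correct, so precision is perfect, but it misses the boundary between "the" and "dog", which lowers recall. Type and token scores follow the same precision/recall pattern, computed over word types and word occurrences rather than boundary positions.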
