Object-based modelling for representing and processing speech corpora

Abstract

This thesis deals with modelling the data in large speech corpora using an object-oriented paradigm that captures important linguistic structures. Information from corpora is transformed into objects, which are assigned properties regarding their behaviour. These objects, called speech units, are placed onto a multi-dimensional framework, and their relationships to other units are explicitly defined through links. Frameworks that model temporal utterances, or atemporal information such as speaker characteristics and recording conditions, can be searched efficiently for contextual matches. Speech units that match the desired contexts are the result of successful linguistically motivated queries and can be used in further speech processing tasks in the same computational environment. This allows empirical studies of speech and its relation to linguistic structures to be carried out, and supports the training and testing of applications such as speech recognition and synthesis.

Information residing in typical speech corpora is discussed first, followed by an overview of object-orientation, which sets the tone for this thesis. The representation framework is then introduced; it is generated by a compiler and linker that rely on a set of domain-specific resources to transform corpus data into speech units. Operations on this framework are then presented, along with a comparison between a relational and an object-oriented model of identical speech data.

The models described in this work are directly applicable to existing large speech corpora, and the methods developed here are tested against relational database methods. The object-oriented methods outperform the relational methods for typical linguistically relevant queries by about three orders of magnitude, as measured by database search times. This improvement in simplicity of representation and search speed is crucial for the utilisation of large multi-lingual corpora in basic research on the detailed properties of speech, especially in relation to contextual variation.
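The idea of speech units connected by explicit links and queried by context can be illustrated with a small sketch. It is not the thesis's implementation: the class `SpeechUnit`, the link names `prev`, `next` and `parent`, the helpers `chain`, `attach`, `build_word` and `find`, and the toy phone labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(eq=False)
class SpeechUnit:
    """Hypothetical speech unit: a labelled object with named links."""
    level: str   # illustrative level name, e.g. "word" or "phone"
    label: str
    # Links to other units; excluded from repr because they are cyclic.
    links: Dict[str, "SpeechUnit"] = field(default_factory=dict, repr=False)

    def link(self, name: str) -> Optional["SpeechUnit"]:
        """Follow a named link, or return None if it is not defined."""
        return self.links.get(name)


def chain(units: List[SpeechUnit]) -> None:
    """Link neighbouring units on one level with 'prev' and 'next'."""
    for left, right in zip(units, units[1:]):
        left.links["next"] = right
        right.links["prev"] = left


def attach(parent: SpeechUnit, children: List[SpeechUnit]) -> None:
    """Link each lower-level unit to the unit that contains it."""
    for child in children:
        child.links["parent"] = parent


def build_word(label: str, phones: List[str]) -> Tuple[SpeechUnit, List[SpeechUnit]]:
    """Create a word unit and its phone units, then link them together."""
    word = SpeechUnit("word", label)
    units = [SpeechUnit("phone", p) for p in phones]
    chain(units)
    attach(word, units)
    return word, units


def find(units: List[SpeechUnit],
         predicate: Callable[[SpeechUnit], bool]) -> List[SpeechUnit]:
    """Return the units whose context satisfies the predicate."""
    return [u for u in units if predicate(u)]


# Toy data (assumed labels, not taken from any real corpus).
all_phones: List[SpeechUnit] = []
for word_label, phone_labels in [("cat", ["k", "ae", "t"]),
                                 ("kit", ["k", "ih", "t"]),
                                 ("tack", ["t", "ae", "k"])]:
    _, phones = build_word(word_label, phone_labels)
    all_phones.extend(phones)


def ae_after_k(unit: SpeechUnit) -> bool:
    """Context query: an 'ae' phone whose preceding phone is 'k'."""
    prev = unit.link("prev")
    return unit.label == "ae" and prev is not None and prev.label == "k"


matches = find(all_phones, ae_after_k)
print([m.link("parent").label for m in matches])
```

Because each unit holds direct references to its neighbours and its containing unit, a contextual query only follows links from each candidate instead of joining tables. The sketch makes no claim about the search-time figures reported in the abstract.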

Bibliographic details

  • Author

    Altosaar Toomas

  • Author affiliation
  • Year: 2001
  • Total pages
  • Original format: PDF
  • Language of text: en
  • Chinese Library Classification
