Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Sign constraints on feature weights improve a joint model of word segmentation and phonology

Abstract

This paper describes a joint model of word segmentation and phonological alternations, which takes unsegmented utterances as input and infers word segmentations and underlying phonological representations. The model is a Maximum Entropy or log-linear model, which can express a probabilistic version of Optimality Theory (OT; Prince and Smolensky (2004)), a standard phonological framework. The features in our model are inspired by OT's Markedness and Faithfulness constraints. Following the OT principle that such features indicate "violations", we require their weights to be non-positive. We apply our model to a modified version of the Buckeye corpus (Pitt et al., 2007) in which the only phonological alternations are deletions of word-final /d/ and /t/ segments. The model sets a new state of the art for this corpus for word segmentation, identification of underlying forms, and identification of /d/ and /t/ deletions. We also show that the OT-inspired sign constraints on feature weights are crucial for accurate identification of deleted /d/s; without them our model posits approximately 10 times more deleted underlying /d/s than appear in the manually annotated data.
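As a rough illustration of what a sign constraint on log-linear weights means, the sketch below fits a two-candidate Maximum Entropy model in which each feature counts violations and every weight is kept non-positive. It is not the paper's model or optimizer: the two features, the candidate set, the counts, and the use of projected gradient ascent are all assumptions made for this example.

```python
import math

# Hypothetical violation counts per candidate: [markedness, faithfulness].
# Candidate 0 keeps a word-final /t/ (one markedness violation);
# candidate 1 deletes it (one faithfulness violation).
CANDIDATES = [[1, 0], [0, 1]]
COUNTS = [70, 30]  # made-up toy counts, not Buckeye data


def maxent_probs(weights, candidates):
    """Return P(c) proportional to exp(sum_k w_k * f_k(c))."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fit_nonpositive(candidates, counts, steps=500, lr=0.5):
    """Maximize log-likelihood by projected gradient ascent with w_k <= 0."""
    n_feats = len(candidates[0])
    total = sum(counts)
    # Empirical mean of each feature under the observed counts.
    observed = [sum(c * feats[k] for c, feats in zip(counts, candidates)) / total
                for k in range(n_feats)]
    weights = [0.0] * n_feats
    for _ in range(steps):
        probs = maxent_probs(weights, candidates)
        # Model expectation of each feature under the current weights.
        expected = [sum(p * feats[k] for p, feats in zip(probs, candidates))
                    for k in range(n_feats)]
        # Gradient step, then projection onto the non-positive orthant.
        weights = [min(0.0, w + lr * (o - e))
                   for w, o, e in zip(weights, observed, expected)]
    return weights


weights = fit_nonpositive(CANDIDATES, COUNTS)
print(weights, maxent_probs(weights, CANDIDATES))
```

In this toy setup the unconstrained optimum is not unique, since only the difference between the two weights matters. The projection selects a solution in which the violated features can only lower a candidate's score, which is the OT-style reading of a "violation".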