
The construction of a Chinese interlanguage corpus



Abstract

A Chinese interlanguage corpus lays a foundation for studying the speech production of non-native Chinese speakers, such as their typical pronunciation errors. Traditional Chinese interlanguage corpora have difficulty covering important phonetic types, such as tones and syllables in context. This paper presents the construction of an interlanguage corpus that contains 103 sentences covering 394 syllable types and 174 tri-tone types bounded by prosodic boundaries, selected using a modified least-to-most-ordered algorithm. The corpus is about half the size of a traditional interlanguage corpus, the Conversational Chinese 301, while achieving better coverage and a less uneven distribution of syllable types and tri-tone types. More than 80% of the words in the interlanguage corpus can be found in the HSK4 word list; about 13% are found in HSK5 and HSK6; and about 3% are beyond the HSK6 vocabulary.
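The abstract does not describe the modification made to the least-to-most-ordered algorithm, so the following is only a minimal sketch of the plain selection idea, under stated assumptions: each candidate sentence is reduced to a set of unit labels (syllable types or tri-tone types), units are visited from rarest to most frequent, and for each still-uncovered unit the sentence that adds the most uncovered units is chosen. The function name `least_to_most_select`, the tie-breaking rule, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def least_to_most_select(sentences):
    """Greedy sentence selection, visiting units from rarest to most frequent.

    sentences: dict mapping a sentence id to the set of unit labels it contains
    (for example syllable types). Returns (selected ids, covered units).
    """
    # Count in how many candidate sentences each unit occurs.
    freq = Counter(unit for units in sentences.values() for unit in units)
    covered = set()
    selected = []
    # Rarest units first; ties are broken by label so the result is deterministic.
    for unit in sorted(freq, key=lambda u: (freq[u], u)):
        if unit in covered:
            continue
        # Among sentences containing this unit, take the one that adds
        # the most units not yet covered (first id wins on ties).
        candidates = sorted(sid for sid, units in sentences.items() if unit in units)
        best = max(candidates, key=lambda sid: len(sentences[sid] - covered))
        selected.append(best)
        covered |= sentences[best]
    return selected, covered


# Hypothetical toy data: four sentences described by their syllable types.
toy = {
    "s1": {"ma1", "ni3"},
    "s2": {"ni3", "hao3"},
    "s3": {"ma1", "hao3", "shi4"},
    "s4": {"ni3"},
}
selected, covered = least_to_most_select(toy)
print(selected, sorted(covered))
```

Starting from the rarest units forces the sentences that carry hard-to-find types into the corpus early, which is one plausible way to keep a selected set small while preserving coverage.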
