LREC-2012

Morphosyntactic Analysis of the CHILDES and TalkBank Corpora

Abstract

This paper describes the construction and usage of the MOR and GRASP programs for part-of-speech tagging and syntactic dependency analysis of the corpora in the CHILDES and TalkBank databases. We have written MOR grammars for 11 languages and GRASP analyses for three. For English data, the MOR tagger reaches 98% accuracy on adult corpora and 97% accuracy on child language corpora. The paper discusses the construction of MOR lexicons, with an emphasis on compounds and special conversational forms, and the shape of the rules that control allomorphy and morpheme concatenation. The analysis of bilingual corpora is illustrated with the Cantonese-English bilingual corpora. Methods for preparing data for MOR analysis and for developing MOR grammars are also discussed. We believe that recent computational work using this system is leading to significant advances in child language acquisition theory and in theories of grammar identification more generally.
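As a rough illustration of what a tagging-accuracy figure such as the ones quoted above measures, the following Python sketch splits a morphological token into part of speech, stem, and affix labels, then scores predicted tags against gold tags. The token shape used here (`pos|stem` with `-` or `&` introducing affix labels), the function names, and the sample data are all assumptions made for illustration; none of them are taken from the abstract, and this is not the MOR implementation.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class MorToken:
    """One analysed word: part of speech, stem, and affix labels (hypothetical type)."""
    pos: str
    stem: str
    affixes: List[str] = field(default_factory=list)


def parse_mor_token(token: str) -> MorToken:
    """Parse a token of the assumed simplified shape 'pos|stem[-AFFIX][&FUSED]'."""
    pos, sep, rest = token.partition("|")
    if not sep:
        raise ValueError(f"no '|' separator in token: {token!r}")
    # Assumption: '-' and '&' only ever introduce affix labels after the stem.
    parts = re.split(r"[-&]", rest)
    return MorToken(pos=pos, stem=parts[0], affixes=parts[1:])


def tagging_accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of positions where the predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Made-up sample data, for illustration only.
gold_tokens = ["det|the", "n|dog-PL", "v|run&PAST"]
predicted_tokens = ["det|the", "n|dog-PL", "n|run"]

gold_tags = [parse_mor_token(t).pos for t in gold_tokens]
predicted_tags = [parse_mor_token(t).pos for t in predicted_tokens]
accuracy = tagging_accuracy(gold_tags, predicted_tags)
```

Here accuracy is computed per token over part-of-speech labels only; a stricter score could also compare stems and affix labels.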
