Conference: Annual meeting of the Association for Computational Linguistics

Speeding up corpus development for linguistic research: language documentation and acquisition in Romansh Tuatschin

Abstract

In this paper, we present ongoing work on developing language resources and basic NLP tools for an undocumented variety of Romansh, in the context of a language documentation and language acquisition project. Our tools are designed to improve the speed and reliability of corpus annotation for noisy data involving large amounts of code-switching, occurrences of child speech, and orthographic noise. Increasing the efficiency of language resource development for language documentation and acquisition research also constitutes a step towards solving the data sparsity issues with which researchers have been struggling.
