International Conference on Text, Speech and Dialogue

Incremental Dependency Parsing and Disfluency Detection in Spoken Learner English

Abstract

This paper investigates the suitability of state-of-the-art natural language processing (NLP) tools for parsing the spoken language of second language learners of English. Parsing spoken learner language is important to the domains of automated language assessment (ALA) and computer-assisted language learning (CALL). Spoken language is non-canonical: it contains filled pauses, non-standard grammatical variations, hesitations and other disfluencies. This, compounded by a lack of available training data, has made spoken language parsing a challenge for standard NLP tools. Recently the Redshift parser (Honnibal et al., Proceedings of CoNLL, 2013) has been shown to be successful in identifying grammatical relations and certain disfluencies in native speaker spoken language, returning an unlabelled dependency accuracy of 90.5% and a disfluency F-measure of 84.1% (Honnibal & Johnson, TACL 2, 131-142, 2014). We investigate how this parser handles spoken data from learners of English at various proficiency levels. We find that Redshift's parsing accuracy on non-native speech data is comparable to Honnibal & Johnson's results, with 91.1% of dependency relations correctly identified. However, disfluency detection is markedly lower, with an F-measure of just 47.8%. We attempt to explain why this should be, and investigate the effect of proficiency level on parsing accuracy. We relate our findings to the use of NLP technology for CALL and ALA applications.
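For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how unlabelled dependency accuracy and a token-level disfluency F-measure are commonly computed. It is an illustration only: the function names and the toy data are hypothetical, and the abstract does not state the paper's exact evaluation protocol (for example, which tokens are included in scoring), so this should not be read as the authors' procedure.

```python
from typing import Sequence


def unlabelled_accuracy(gold_heads: Sequence[int], pred_heads: Sequence[int]) -> float:
    """Fraction of tokens whose predicted head index equals the gold head index."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted head lists must have equal length")
    if not gold_heads:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)


def disfluency_f_measure(gold: Sequence[bool], pred: Sequence[bool]) -> float:
    """Token-level F1, where True marks a token labelled as disfluent."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted label lists must have equal length")
    tp = sum(g and p for g, p in zip(gold, pred))        # disfluent, detected
    fp = sum(p and not g for g, p in zip(gold, pred))    # fluent, wrongly flagged
    fn = sum(g and not p for g, p in zip(gold, pred))    # disfluent, missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data (not from the paper): four tokens, head indices with 0 as root.
gold_heads = [2, 0, 2, 3]
pred_heads = [2, 0, 2, 2]
uas = unlabelled_accuracy(gold_heads, pred_heads)

gold_disfluent = [True, False, False, True]
pred_disfluent = [True, False, True, False]
f1 = disfluency_f_measure(gold_disfluent, pred_disfluent)
```

Under these definitions a parser can score well on head attachment while still missing many disfluent tokens, which is the pattern the abstract reports for learner speech.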