Natural language processing tools for reading level assessment and text simplification for bilingual education.

Abstract

Reading proficiency is a fundamental component of language competency. However, finding topical texts at an appropriate level for foreign and second language learners is a challenge for teachers. We address this problem using natural language processing technology to assess reading level and simplify text. In the context of foreign- and second-language learning, existing measures of reading level are not well-suited to this task. Related work has shown the benefit of using statistical language processing techniques; we extend these ideas and include other potential features to measure readability. In the first part of this dissertation we combine features from statistical language models, traditional reading level measures, and other language processing tools to produce a better method of detecting reading level. We discuss the performance of human annotators and evaluate results for our detectors with respect to human ratings. A key contribution is that our detectors are trainable; with training and test data from the same domain, our detectors outperform more general reading level tools (Flesch-Kincaid and Lexile). Trainability will allow performance to be tuned to address the needs of particular groups or students. Next, these tools are extended to enable teachers to more effectively take advantage of the large amounts of text available on the World Wide Web. The tools are augmented to handle web pages returned by a search engine, including filtering steps to eliminate "junk" pages with little or no text. These detectors are manually evaluated by elementary school teachers, the intended audience. We also explore adapting the detectors to the opinions of individual teachers.

In the second part of the dissertation we address the task of text simplification in the context of language learning. We begin by analyzing pairs of original and manually simplified news articles to learn what people most often do when adapting text. Based on this analysis, we investigate two steps in simplification: choosing sentences to keep and splitting sentences. We study existing summarization and syntactic simplification tools applied to these steps and discuss other data-driven methods which in the future could be tuned to particular corpora or users.
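For orientation, the abstract names Flesch-Kincaid as one of the general-purpose baselines. The sketch below computes the standard Flesch-Kincaid grade level formula; it is not the dissertation's trainable detector. The function names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration only.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent ("made"), except in "-le" endings ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level from surface counts of words and syllables."""
    # Naive sentence and word splitting; real tools use proper tokenizers.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    easy = "The cat sat. The dog ran."
    hard = "Statistical language models estimate probabilities of word sequences."
    print(flesch_kincaid_grade(easy), flesch_kincaid_grade(hard))
```

Because the formula depends only on sentence length and syllable counts, it cannot be retrained on data from a particular domain, which is the limitation the trainable detectors described above are meant to address.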

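Bibliographic record

  • Author

    Petersen, Sarah E.

  • Author affiliation

    University of Washington

  • Degree-granting institution: University of Washington
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 125 p.
  • Total pages: 125
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords: none listed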
