Conference: Workshop of natural language processing for improving textual accessibility

A Two-Stage Approach for Generating Unbiased Estimates of Text Complexity



Abstract

Many existing approaches for measuring text complexity tend to overestimate the complexity levels of informational texts while simultaneously underestimating the complexity levels of literary texts. We present a two-stage estimation technique that successfully addresses this problem. At Stage 1, each text is classified into one of three genres: informational, literary or mixed. Next, at Stage 2, a complexity score is generated for each text by applying one of three prediction models: one optimized for informational texts, one optimized for literary texts, and one optimized for mixed texts. Each model combines lexical, syntactic and discourse features, as appropriate, to best replicate human complexity judgments. We demonstrate that the resulting text complexity predictions are both unbiased and highly correlated with classifications provided by experienced educators.
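The two stages described in the abstract can be sketched as a small routing pipeline. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the features, the genre classifier, or the model forms, so the `Features` fields, the `narrativity` cue, its thresholds, and every coefficient in `MODELS` are hypothetical placeholders. Linear models are assumed purely for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Features:
    """Hypothetical per-text features; the abstract names only the feature families."""
    lexical: float      # e.g. a vocabulary difficulty measure (assumed)
    syntactic: float    # e.g. a sentence complexity measure (assumed)
    discourse: float    # e.g. a cohesion measure (assumed)
    narrativity: float  # assumed genre cue in [0, 1]; not from the source


def classify_genre(f: Features) -> str:
    """Stage 1: assign one of three genres. Thresholds are placeholders."""
    if f.narrativity >= 0.66:
        return "literary"
    if f.narrativity <= 0.33:
        return "informational"
    return "mixed"


# Stage 2: one model per genre, here (intercept, w_lexical, w_syntactic, w_discourse).
# All numbers are made up for illustration.
MODELS: Dict[str, Tuple[float, float, float, float]] = {
    "informational": (1.0, 2.0, 1.0, 0.5),
    "literary": (2.0, 1.0, 2.0, 1.0),
    "mixed": (1.5, 1.5, 1.5, 0.75),
}


def predict_complexity(f: Features) -> Tuple[str, float]:
    """Route the text to the model for its genre and return (genre, score)."""
    genre = classify_genre(f)
    intercept, w_lex, w_syn, w_dis = MODELS[genre]
    score = intercept + w_lex * f.lexical + w_syn * f.syntactic + w_dis * f.discourse
    return genre, score


if __name__ == "__main__":
    text = Features(lexical=2.0, syntactic=1.0, discourse=2.0, narrativity=0.1)
    print(predict_complexity(text))
```

The point of the routing step is that the same feature values can yield different scores depending on genre, which is how a genre-specific model can correct a systematic over- or underestimate that a single pooled model would make.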
