A Two-Stage Approach for Generating Unbiased Estimates of Text Complexity

Abstract

Many existing approaches for measuring text complexity tend to overestimate the complexity of informational texts while underestimating the complexity of literary texts. We present a two-stage estimation technique that successfully addresses this problem. At Stage 1, each text is classified into one of three genres: informational, literary, or mixed. At Stage 2, a complexity score is generated for each text by applying one of three prediction models: one optimized for informational texts, one for literary texts, and one for mixed texts. Each model combines lexical, syntactic, and discourse features, as appropriate, to best replicate human complexity judgments. We demonstrate that the resulting text complexity predictions are both unbiased and highly correlated with classifications provided by experienced educators.
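
The two-stage routing described in the abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the feature names (`narrativity`, `word_rarity`, `sentence_length`), the threshold rule used for Stage 1, the linear form of the Stage 2 models, and every numeric value are assumptions invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LinearModel:
    """Linear scorer over named features (hypothetical stand-in for a Stage 2 model)."""
    intercept: float
    weights: Dict[str, float]

    def predict(self, features: Dict[str, float]) -> float:
        # Features missing from the input contribute nothing to the score.
        return self.intercept + sum(
            w * features.get(name, 0.0) for name, w in self.weights.items()
        )


def classify_genre(features: Dict[str, float]) -> str:
    """Stage 1 (assumed rule): route on a narrativity-like score in [0, 1]."""
    score = features["narrativity"]
    if score < 0.35:
        return "informational"
    if score > 0.65:
        return "literary"
    return "mixed"


# Stage 2: one model per genre. All coefficients are placeholders, not values from the paper.
MODELS: Dict[str, LinearModel] = {
    "informational": LinearModel(1.0, {"word_rarity": 4.0, "sentence_length": 0.10}),
    "literary": LinearModel(2.0, {"word_rarity": 5.0, "sentence_length": 0.15}),
    "mixed": LinearModel(1.5, {"word_rarity": 4.5, "sentence_length": 0.12}),
}


def estimate_complexity(features: Dict[str, float]) -> Tuple[str, float]:
    """Classify the genre first, then score with the model fitted for that genre."""
    genre = classify_genre(features)
    return genre, MODELS[genre].predict(features)
```

Calling `estimate_complexity({"narrativity": 0.2, "word_rarity": 0.5, "sentence_length": 20.0})` routes the text to the informational model. Giving each genre its own intercept and weights is what would let such a scheme correct a genre-specific bias that a single pooled model cannot.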


Bibliographic details

Source: Workshop of natural language processing for improving textual accessibility, 2013, 10 pages
Authors: Kathleen M. Sheehan; Michael Flor; Diane Napolitano
Original format: PDF
Chinese Library Classification: Programming, software engineering

Similar literature

1. A Two-Stage Authorship Attribution Method Using Text and Structured Data for De-Anonymizing User-Generated Content [J]. Matthew J. Schneider, Shawn Mankad. Customer Needs and Solutions, 2021, No. 3.
2. Characterization of best linear unbiased estimates generated from national genetic evaluations of reproductive performance, survival, and milk yield in dairy cows [J]. Dunne F. L., Kelleher M. M., Walsh S. W. Journal of dairy science, 2018, No. 8.
3. Characterisation of best linear unbiased estimates generated from national genetic evaluations of reproductive performance, survival, and milk production in dairy cows [J]. F L Dunne, M M Kelleher, S W Walsh. Advances in Animal Biosciences, 2018, No. 1.
4. A Two-Stage Approach for Generating Unbiased Estimates of Text Complexity [C]. Kathleen M. Sheehan, Michael Flor, Diane Napolitano. Second workshop of natural language processing for improving textual accessibility, 2013.
5. Manipulating Comprehensibility of Text: An Automated Approach to Generate Deceptive Documents for Cyber Defense [D]. Karuna, Prakruthi. 2019.
6. A robust (re-)annotation approach to generate unbiased mapping references for RNA-seq-based analyses of differential expression across closely related species [O]. Montserrat Torres-Oliva, Isabel Almudi, Alistair P. McGregor. 2016.
7. QUANTILE-BASED APPROACH TO ESTIMATING COGNITIVE TEXT COMPLEXITY [O]. M. A. Eremeev, K. V. Vorontsov. 2020.
