
Automatic readability assessment.


Abstract

We describe the development of an automatic tool to assess the readability of text documents. Our readability assessment tool predicts elementary school grade levels of texts with high accuracy. The tool is developed using supervised machine learning techniques on text corpora annotated with grade levels and other indicators of reading difficulty. Various independent variables or features are extracted from texts and used for automatic classification. We systematically explore different feature inventories and evaluate the grade-level prediction of the resulting classifiers. Our evaluation comprises well-known features at various linguistic levels from the existing literature, such as those based on language modeling, part-of-speech, syntactic parse trees, and shallow text properties, including classic readability formulas like the Flesch-Kincaid Grade Level formula. We focus in particular on discourse features, including three novel feature sets based on the density of entities, lexical chains, and coreferential inference, as well as features derived from entity grids. We evaluate and compare these different feature sets in terms of accuracy and mean squared error by cross-validation. Generalization to different corpora or domains is assessed in two ways. First, using two corpora of texts and their manually simplified versions, we evaluate how well our readability assessment tool can discriminate between original and simplified texts. Second, we measure the correlation between grade levels predicted by our tool, expert ratings of text difficulty, and estimated latent difficulty derived from experiments involving adult participants with mild intellectual disabilities. The applications of this work include selection of reading material tailored to varying proficiency levels, ranking of documents by reading difficulty, and automatic document summarization and text simplification.
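The shallow text properties evaluated in the abstract include the classic Flesch-Kincaid Grade Level formula, FKGL = 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. The sketch below is a minimal illustration of that formula only, not the dissertation's implementation; the regex tokenizer and the vowel-group syllable counter are rough assumptions standing in for a proper linguistic preprocessing pipeline.

```python
import re

def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (a crude heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    # Drop a trailing silent 'e' when the word has more than one vowel group.
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(1, n)

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

Short, monosyllabic sentences score at or below first-grade level; longer sentences with polysyllabic words push the predicted grade up. Formulas like this serve as baseline features alongside the discourse-level feature sets the dissertation introduces.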

Bibliographic record

  • Author: Feng, Lijun
  • Affiliation: City University of New York
  • Degree grantor: City University of New York
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 204 p.
  • Total pages: 204
  • Format: PDF
  • Language: English
