Classifying easy-to-read texts without parsing

Abstract

Document classification using automated linguistic analysis and machine learning (ML) has been shown to be a viable way forward for readability assessment. The best models can be trained to decide whether a text is easy to read with very high accuracy; for example, a model using 117 parameters from shallow, lexical, morphological and syntactic analyses achieves 98.9% accuracy. In this paper we compare models created by parameter optimization over subsets of that total model, to find out to what extent different high-performing models tend to consist of the same parameters and whether it is possible to find models that use only features that do not require parsing. We used a genetic algorithm to systematically optimize parameter sets of fixed sizes, using the accuracy of a Support Vector Machine classifier as the fitness function. Our results show that it is possible to find models almost as good as the currently best models while omitting parsing-based features.
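The search described in the abstract can be illustrated with a minimal sketch, under several assumptions. The paper uses the accuracy of a Support Vector Machine classifier as the fitness function; to stay dependency-free, this sketch swaps in a nearest-centroid classifier on synthetic data. The selection, crossover and mutation operators, and every name and parameter here (`make_data`, `centroid_accuracy`, `crossover`, `mutate`, `evolve`, `rate`, `pop_size`), are hypothetical and are not taken from the paper.

```python
import random

def make_data(n_samples, n_features, informative, rng):
    """Synthetic two-class data; only the `informative` indices carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift class 1 on informative features
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_test, y_test, subset):
    """Stand-in fitness (assumption): held-out accuracy of a nearest-centroid
    classifier restricted to `subset`. The paper uses SVM accuracy instead."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in subset]
    correct = 0
    for x, t in zip(X_test, y_test):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(subset, cents[c]))
                for c in cents}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y_test)

def crossover(a, b, k, rng):
    """Child keeps features shared by both parents, then fills up to size k."""
    shared = sorted(set(a) & set(b))
    pool = sorted((set(a) | set(b)) - set(shared))
    rng.shuffle(pool)
    return tuple(sorted(shared + pool[: k - len(shared)]))

def mutate(subset, n_features, rate, rng):
    """Swap each feature for an unused one with probability `rate`;
    the subset size is preserved."""
    chosen = list(subset)
    for i in range(len(chosen)):
        if rng.random() < rate:
            unused = [j for j in range(n_features) if j not in chosen]
            if unused:
                chosen[i] = rng.choice(unused)
    return tuple(sorted(chosen))

def evolve(fitness, n_features, k, pop_size=30, generations=25, rate=0.1, seed=0):
    """Search for a high-fitness feature subset of fixed size k."""
    rng = random.Random(seed)
    pop = [tuple(sorted(rng.sample(range(n_features), k)))
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [ranked[0]]             # elitism: keep the best subset
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, k, rng)
            children.append(mutate(child, n_features, rate, rng))
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Usage: 20 candidate features, 3 of them informative, subsets of size 3.
data_rng = random.Random(42)
X, y = make_data(200, 20, informative=(1, 5, 9), rng=data_rng)
X_tr, y_tr, X_te, y_te = X[:100], y[:100], X[100:], y[100:]
fit = lambda s: centroid_accuracy(X_tr, y_tr, X_te, y_te, s)
best, acc = evolve(fit, n_features=20, k=3)
print(best, acc)
```

Restricting the search to features that need no parsing would, in a sketch like this, amount to drawing and mutating subsets only from the allowed feature indices; how the authors actually imposed that restriction is not stated in the abstract.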

Bibliographic record

Source: 3rd Workshop on predicting and improving text readability for target reader populations | 2014 | pp. 114-122 | 9 pages
Conference location: Gothenburg (SE)
Authors: Johan Falkenjack; Arne Joensson
Author affiliation (both authors): Department of Information and Computer Science, Linkoeping University, 581 83 Linkoeping, Sweden
Original format: PDF
Language: English

Similar literature

1. Easy-to-read texts for students with intellectual disability: Linguistic factors affecting comprehension [J]. Fajardo I., Ávila V., Ferrer A. Journal of applied research in intellectual disabilities: JARID. 2014, No. 3
2. Parsing clinical text using the state-of-the-art deep learning based parsers: a systematic comparison [J]. Yaoyun Zhang, Firat Tiryaki, Min Jiang. BMC Medical Informatics and Decision Making. 2019, No. 3
3. Parsing clinical text: how good are the state-of-the-art parsers? [J]. Min Jiang, Yang Huang, Jung-wei Fan. BMC Medical Informatics and Decision Making. 2015, Supplement 1
4. Classifying easy-to-read texts without parsing [C]. Johan Falkenjack, Arne Joensson. Conference of the European Chapter of the Association for Computational Linguistics. 2014
5. Learning structured classifiers for statistical dependency parsing [D]. Wang, Qin Iris. 2008
6. Parsing clinical text using the state-of-the-art deep learning based parsers: a systematic comparison [O]. Yaoyun Zhang, Firat Tiryaki, Min Jiang. 2019
7. Classifying easy-to-read texts without parsing [O]. Johan Falkenjack, Arne Jönsson. 2015
