Validation of Text Data Preprocessing Using a Neural Network Model

Woo HoSung; Kim JaMee; Lee WonGyu

Abstract

Many artificial intelligence studies focus on designing new neural network models or optimizing hyperparameters to improve model accuracy. To develop a reliable model, appropriate data are required, and data preprocessing is an essential part of acquiring the data. Although various studies regard data preprocessing as part of the data exploration process, those studies lack awareness about the need for separate technologies and solutions for preprocessing. Therefore, this study evaluated combinations of preprocessing types in a text-processing neural network model. Better performance was observed when two preprocessing types were used than when three or more preprocessing types were used for data purification. More specifically, using lemmatization and punctuation splitting together, lemmatization and lowering together, and lowering and punctuation splitting together showed positive effects on accuracy. This study is significant because the results allow better decisions to be made about the selection of the preprocessing types in various research fields, including neural network research.
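
To make the idea of combining preprocessing types concrete, the following is a minimal sketch of how the three types named in the abstract (lowering, punctuation splitting, lemmatization) could be composed and enumerated in pairs. It is not the authors' implementation: the function names, the `STEPS` registry, the fixed order of application, and the tiny `TOY_LEMMAS` table are all assumptions made for illustration, and a real experiment would use a proper lemmatizer and feed each variant into the neural network model for evaluation.

```python
import re
from itertools import combinations

# Hypothetical toy lemma table (an assumption for this sketch only);
# a real pipeline would rely on a full lemmatizer instead.
TOY_LEMMAS = {"was": "be", "were": "be", "models": "model", "studies": "study"}


def lowercase(tokens):
    """Lowering: convert every token to lowercase."""
    return [t.lower() for t in tokens]


def split_punctuation(tokens):
    """Punctuation splitting: separate punctuation marks from words."""
    out = []
    for t in tokens:
        out.extend(re.findall(r"\w+|[^\w\s]", t))
    return out


def lemmatize(tokens):
    """Lemmatization: map a token to its base form if the toy table knows it."""
    return [TOY_LEMMAS.get(t, t) for t in tokens]


# Registry of preprocessing types; dict order is the assumed application order.
STEPS = {
    "lowering": lowercase,
    "punctuation_splitting": split_punctuation,
    "lemmatization": lemmatize,
}


def preprocess(text, step_names):
    """Whitespace-tokenize the text, then apply the selected steps in order."""
    tokens = text.split()
    for name in step_names:
        tokens = STEPS[name](tokens)
    return tokens


def pairwise_combinations():
    """All two-type combinations of the registered preprocessing types."""
    return list(combinations(STEPS, 2))


if __name__ == "__main__":
    sample = "The Models were tested."
    for pair in pairwise_combinations():
        print(pair, preprocess(sample, pair))
```

Each pair yields a different token sequence for the same input, which is what allows the downstream accuracy of each combination to be compared. Note that in this sketch the lemma lookup is case-sensitive, so lemmatization without lowering leaves "Models" unchanged; this is one simple way in which preprocessing types can interact.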

Bibliographic details

Source
Mathematical Problems in Engineering: Theory, Methods and Applications | 2020, Issue 16 | 1958149.1-1958149.9 | 9 pages
Authors
Woo HoSung; Kim JaMee; Lee WonGyu
Author affiliations

Korea Univ, Dept Comp Sci & Engn, Grad Sch, Seoul 02841, South Korea;

Korea Univ, Grad Sch Educ, Major Comp Sci Educ, Seoul 02841, South Korea;

Korea Univ, Coll Informat, Dept Comp Sci & Engn, Seoul 02841, South Korea;

Original format: PDF
Language: English
