Journal: Mathematical Problems in Engineering: Theory, Methods and Applications

Validation of Text Data Preprocessing Using a Neural Network Model


Abstract

Many artificial intelligence studies focus on designing new neural network models or optimizing hyperparameters to improve model accuracy. Developing a reliable model requires appropriate data, and data preprocessing is an essential part of acquiring such data. Although various studies regard data preprocessing as part of the data exploration process, they show little awareness of the need for separate technologies and solutions for preprocessing. Therefore, this study evaluated combinations of preprocessing types in a text-processing neural network model. Using two preprocessing types for data purification yielded better performance than using three or more. More specifically, three pairings showed positive effects on accuracy: lemmatization with punctuation splitting, lemmatization with lowering (that is, lowercasing), and lowering with punctuation splitting. This study is significant because the results support better decisions about which preprocessing types to select in various research fields, including neural network research.
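
The three pairings named in the abstract can be illustrated with a minimal sketch. The abstract does not say which tools, tokenizer, or step order the authors used, so everything here is an assumption: the function names, the whitespace tokenization, the fixed order in which steps are applied, and above all the tiny `TOY_LEMMAS` table, which is a hypothetical stand-in for a real lemmatizer. The sketch only shows how pairwise combinations of preprocessing types might be enumerated and applied; it does not train or evaluate a neural network.

```python
import re
from itertools import combinations

# Hypothetical stand-in for a real lemmatizer: a tiny lookup table.
# A real study would use a proper lemmatization library instead.
TOY_LEMMAS = {"were": "be", "was": "be", "models": "model", "studies": "study"}


def punctuation_splitting(tokens):
    """Split punctuation marks off into their own tokens."""
    out = []
    for tok in tokens:
        # Runs of word characters, or single non-word, non-space characters.
        out.extend(re.findall(r"\w+|[^\w\s]", tok))
    return out


def lowering(tokens):
    """Lowercase every token."""
    return [tok.lower() for tok in tokens]


def lemmatization(tokens):
    """Replace a token with its lemma if the toy table knows it (case-sensitive)."""
    return [TOY_LEMMAS.get(tok, tok) for tok in tokens]


# Steps are applied in this fixed order (an assumption of this sketch).
STEPS = {
    "punctuation_splitting": punctuation_splitting,
    "lowering": lowering,
    "lemmatization": lemmatization,
}


def preprocess(text, step_names):
    """Whitespace-tokenize `text`, then apply only the selected steps."""
    tokens = text.split()
    for name, step in STEPS.items():
        if name in step_names:
            tokens = step(tokens)
    return tokens


def pairwise_combinations():
    """All ways of choosing two of the three preprocessing types."""
    return list(combinations(STEPS, 2))


if __name__ == "__main__":
    sentence = "The Models were good."
    for pair in pairwise_combinations():
        print(pair, preprocess(sentence, pair))
```

In an experiment of the kind the abstract describes, each pairing would feed the same model and the resulting accuracies would be compared; that training and evaluation loop is outside this sketch.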