Published in: International Conference on Development and Application Systems

A Comparative Study of Parametric Versus Non-Parametric Text Classification Algorithms

Abstract

The evolution of modern technologies has made it possible to store text in various digital formats, such as e-mails, electronic documents, and digital libraries. The amount of text data produced daily is increasing dramatically. Discovering useful patterns in text, which may be represented in unstructured, semi-structured, or structured form, is a difficult task that requires a good understanding of machine learning algorithms. Finding a suitable algorithm for text mining tasks such as classification, clustering, or natural language processing is demanding and tests researchers' abilities. This paper provides an overview of the text mining process and compares the performance and limitations of two predictive models: one generated with the parametric Naïve Bayes algorithm and one with a nonparametric Deep Learning neural network. The RapidMiner data science software platform was used to implement the models and to classify e-mails.
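
To make the term "parametric" concrete, the following is a minimal sketch of a multinomial Naïve Bayes e-mail classifier in plain Python. It is not the authors' implementation (the abstract states that the models were built in RapidMiner), and the toy corpus, the labels "spam" and "ham", the function names, and the Laplace smoothing value are all assumptions made for illustration. The point it shows is that the trained model is fully described by a fixed set of parameters: one prior per class and one likelihood per class and vocabulary word.

```python
import math
from collections import Counter, defaultdict

# Toy corpus invented purely for illustration; it is NOT the paper's dataset.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]


def train_naive_bayes(examples, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods.

    alpha is the smoothing constant (an assumed value, not from the paper).
    """
    class_docs = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = len(examples)
    # One parameter per class: log P(class).
    log_prior = {c: math.log(n / total_docs) for c, n in class_docs.items()}
    # One parameter per (class, word): log P(word | class).
    log_likelihood = {}
    for c in class_docs:
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def classify(text, model):
    """Return the class with the highest posterior log score."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += log_likelihood[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(TRAIN)
print(classify("claim your free prize", model))
print(classify("project meeting on monday", model))
```

Because the parameter count depends only on the number of classes and the vocabulary size, it does not grow as more training e-mails are added, which is the sense in which the abstract calls Naïve Bayes parametric.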
