Conference: Signal Processing and Communication Application Conference

N-gram pattern recognition using multivariate-Bernoulli model with smoothing methods for text classification


Abstract

This paper studies n-gram models in the text classification domain. To measure the impact of n-gram models on classification performance, we apply the Naïve Bayes classifier with various smoothing methods. Naïve Bayes is generally used with one of two event models for text classification: the Bernoulli model and the multinomial model. Researchers usually adopt the multinomial model with Laplace smoothing for text classification and similar domains. The objective of this study is to examine the classification performance of the Naïve Bayes event models by analyzing both of them with different smoothing methods and by approaching n-gram models from a different perspective. To find patterns that distinguish the two event models, we carry out experiments on a large Turkish dataset. The experimental results indicate that the Bernoulli event model with an appropriate smoothing method can outperform the multinomial model on most of the n-gram models.
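To make the Bernoulli event model concrete, here is a minimal Python sketch of a Naïve Bayes classifier over word n-grams. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the smoothing methods that were compared, so the sketch uses additive (Laplace-style) smoothing with a hypothetical `alpha` parameter, and the class name `BernoulliNGramNB`, the whitespace tokenizer, and the tiny English toy corpus are all invented for the example.

```python
import math
from collections import defaultdict


def ngrams(text, n):
    """Return the set of word n-grams in a text (presence only)."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


class BernoulliNGramNB:
    """Hypothetical Bernoulli Naive Bayes over word n-grams.

    Assumption: additive smoothing with alpha > 0; the paper's own
    smoothing methods are not specified in the abstract.
    """

    def __init__(self, n=1, alpha=1.0):
        self.n = n
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_count = {c: 0 for c in self.classes}
        self.feature_doc_count = {c: defaultdict(int) for c in self.classes}
        self.vocab = set()
        for doc, label in zip(docs, labels):
            feats = ngrams(doc, self.n)
            self.doc_count[label] += 1
            self.vocab |= feats
            for f in feats:
                # Count documents containing the n-gram, not occurrences.
                self.feature_doc_count[label][f] += 1
        return self

    def _prob(self, feature, c):
        # Smoothed P(feature present | class c); the 2 * alpha term
        # reflects the two outcomes (present, absent) of each feature.
        seen = self.feature_doc_count[c].get(feature, 0)
        return (seen + self.alpha) / (self.doc_count[c] + 2 * self.alpha)

    def log_posteriors(self, doc):
        present = ngrams(doc, self.n) & self.vocab
        total = sum(self.doc_count.values())
        scores = {}
        for c in self.classes:
            score = math.log(self.doc_count[c] / total)
            # The Bernoulli model also scores n-grams that are absent.
            for f in self.vocab:
                p = self._prob(f, c)
                score += math.log(p) if f in present else math.log(1.0 - p)
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_posteriors(doc)
        return max(scores, key=scores.get)


# Toy corpus, invented for illustration (the paper uses a Turkish dataset).
toy_docs = [
    "the match ended with a late goal",
    "the team won the cup final",
    "the bank raised interest rates",
    "the market fell as rates rose",
]
toy_labels = ["sports", "sports", "economy", "economy"]

unigram_model = BernoulliNGramNB(n=1).fit(toy_docs, toy_labels)
bigram_model = BernoulliNGramNB(n=2).fit(toy_docs, toy_labels)
print(unigram_model.predict("the team scored a goal"))
print(bigram_model.predict("the team won"))
```

The point of contrast with the multinomial model is the inner loop over the whole vocabulary: the Bernoulli model penalizes a class for every n-gram it expects but the document lacks, whereas a multinomial model would only score the n-grams that occur, weighted by their counts. Changing `n` or `alpha` gives a rough feel for how n-gram order and smoothing strength interact.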
