Conference: 2011 International Conference of Soft Computing and Pattern Recognition

An analysis of sentence level text classification for the Kannada language


Abstract

With the rapid growth of the internet, a huge amount of data is available online, and drawing useful information from this digital data is challenging. Exploring and extracting information from native-language content available online is a particularly useful task. The work presented here focuses on sentence-level classification in the Kannada language. It uses two of the most popular approaches in text categorization, the Naïve Bayesian and Bag of Words (BOW) approaches. The Bag of Words approach performs significantly better than the Naïve Bayesian approach. The objective of the work is to find how sentence-level classification works for the Kannada language, since it can be extended to sentiment classification, question answering, text summarization, and customer reviews in Kannada blogs. Because most users' comments, queries, and opinions are expressed as sentences, sentence-level text classification becomes a special task within the text classification problem. Although the work currently focuses on very basic approaches, it can later be extended to other methods such as SVM and KNN.
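As a rough illustration of what sentence-level classification involves, the sketch below trains a multinomial Naïve Bayes classifier on bag-of-words counts using only the Python standard library. It is not the authors' implementation: the abstract treats Naïve Bayesian and Bag of Words as separate approaches and gives no algorithmic detail, so the whitespace tokenizer, the add-one smoothing, the class labels, and the toy English sentences (stand-ins for Kannada text) are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    # Assumption: plain whitespace tokenization. Real Kannada text would
    # likely need language-specific preprocessing.
    return sentence.lower().split()


def train(labeled_sentences):
    """Collect per-class sentence counts and bag-of-words token counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sentence, label in labeled_sentences:
        tokens = tokenize(sentence)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, sentence):
    """Return the class with the highest log-posterior for one sentence."""
    class_counts, word_counts, vocab = model
    total_sentences = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_sentences)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(sentence):
            if token in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; labels and sentences are illustrative only.
TOY_DATA = [
    ("team won the match", "sports"),
    ("player scored a goal", "sports"),
    ("minister announced the budget", "politics"),
    ("party won the election", "politics"),
]

model = train(TOY_DATA)
print(predict(model, "the player won"))
```

Because each sentence is short, only a handful of tokens contribute to the score, which is one reason sentence-level classification is harder than document-level classification and why smoothing matters here.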
