Conference: IEEE International Conference on Computer Systems and Applications

Hierarchical Approach to Select Feature Vectors for Classification of Text Documents


Abstract

The digital revolution that began over fifteen years ago is contributing to exponential growth in text documents, which appear in many forms such as web pages, emails, resumes, scientific reports, and digital archives. It is of great importance to develop techniques for automatic text document classification as a service to information consumers. Earlier text document classification techniques used 'keyword-based' features and related statistics to achieve good results. More recently, some of these techniques have been extended to include 'phrase-based' and 'concept-based' features to achieve better results. The majority of these techniques utilize a very large number of features extracted from the training set of documents. We present a hierarchical method for selecting a smaller number of quality features to improve classification efficiency.
