Form is the Substance: Classification of Genres in Text

Abstract

Categorization of text in IR has traditionally focused on topic. As use of the Internet and e-mail increases, categorization has become a key area of research as users demand methods of prioritizing documents. This work investigates text classification by format style, i.e. 'genre', and demonstrates that, by complementing topic classification, it can significantly improve retrieval of information. The paper compares the use of presentation features with word features and the combination thereof, using Naive Bayes, C4.5 and SVM classifiers. Results show that combined feature sets with an SVM yield 92% classification accuracy in sorting seven genres.
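To make the idea of a combined feature set concrete, here is a minimal sketch of merging presentation features with word features into one feature mapping per document. The abstract does not list the actual features used in the report, so the specific cues below (line length, blank-line ratio, capitalization, punctuation, digits) and all function names are hypothetical illustrations, not the authors' method.

```python
import re
from collections import Counter


def presentation_features(text):
    """Format/style cues. These particular cues are assumptions;
    the abstract does not say which presentation features were used."""
    lines = text.splitlines() or [""]
    n_chars = max(len(text), 1)
    return {
        "pres:avg_line_len": sum(len(l) for l in lines) / len(lines),
        "pres:blank_line_ratio": sum(1 for l in lines if not l.strip()) / len(lines),
        "pres:upper_ratio": sum(c.isupper() for c in text) / n_chars,
        "pres:punct_ratio": sum(c in ".,;:!?" for c in text) / n_chars,
        "pres:digit_ratio": sum(c.isdigit() for c in text) / n_chars,
    }


def word_features(text):
    """Relative frequency of each lowercase word (a simple bag of words)."""
    words = re.findall(r"[a-z']+", text.lower())
    total = max(len(words), 1)
    return {f"word:{w}": c / total for w, c in Counter(words).items()}


def combined_features(text):
    """Union of both feature sets; key prefixes keep the two groups apart."""
    feats = presentation_features(text)
    feats.update(word_features(text))
    return feats


doc = "Dear Bob,\n\nSee you at 5 PM.\n"
features = combined_features(doc)
```

A mapping like this could then be vectorized and passed to any of the classifier families the abstract names, for example with scikit-learn's `DictVectorizer` followed by `LinearSVC`; that pairing is one possible choice, not something the abstract specifies.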
