Journal: Applied Artificial Intelligence

EXTENSIVE EVALUATION OF EFFICIENT NLP-DRIVEN TEXT CLASSIFICATION

Abstract

Extensive experimental evidence is required to study the impact of text categorization approaches on real data and to assess their performance in operational scenarios. This paper discusses a wide set of profile-based classification models (a class of very efficient classifiers) that are sensitive to the syntactic information extracted from source texts. Several classifiers are tested, ranging from traditional approaches (e.g., vector space variants such as SMART, or linear regression models) to original methods. All the experiments aim to evaluate some newly introduced feature weighting and inference models and to characterize the role of different kinds of linguistic information. The final purpose is thus to give insight into the effective and efficient use of linguistic information for text categorization. The results suggest that linguistic features are best exploited through a suitable selection among methods of feature weighting and inference. The empirical evidence collected in this paper over a wide range of corpora and languages is offered as a useful basis for the systematic design of operational statistical NLP-driven text classifiers.
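
As a rough illustration of what a profile-based classifier looks like, the sketch below builds one weighted term vector (a profile) per category and assigns a new text to the category whose profile is most similar. It is not the paper's method: the whitespace tokenizer, the smoothed TF-IDF weighting, the cosine-similarity inference rule, and all function names are assumptions chosen for brevity, and no syntactic features are used.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain lowercase whitespace tokens stand in for the
    # linguistic features (e.g. syntactic information) the abstract mentions.
    return text.lower().split()


def train_profiles(docs):
    """Build one weighted term vector (profile) per category.

    docs is a list of (text, label) pairs. Returns (profiles, idf).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for text, _ in docs:
        doc_freq.update(set(tokenize(text)))
    # Assumed weighting scheme: log(N / df) + 1, so no term gets weight zero.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}

    profiles = defaultdict(Counter)
    for text, label in docs:
        for term, tf in Counter(tokenize(text)).items():
            profiles[label][term] += tf * idf[term]
    return dict(profiles), idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, profiles, idf):
    # Inference step: pick the category whose profile is closest to the text.
    # Terms never seen in training are ignored.
    counts = Counter(tokenize(text))
    vector = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    return max(profiles, key=lambda label: cosine(vector, profiles[label]))


if __name__ == "__main__":
    # Toy training data, invented purely for illustration.
    training = [
        ("stocks fell as markets closed lower", "finance"),
        ("the bank raised interest rates", "finance"),
        ("the team won the football match", "sports"),
        ("the striker scored a late goal", "sports"),
    ]
    profiles, idf = train_profiles(training)
    print(classify("interest rates and markets", profiles, idf))
```

The efficiency the abstract attributes to this class of models shows up in the structure: training is a single pass that accumulates one vector per category, and classifying a text costs one similarity computation per category rather than one per training document. Swapping the weighting function or the similarity measure changes the feature weighting and inference choices independently, which is the kind of selection the abstract refers to.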