
Semantic Pattern Tree Kernels for Short-Text Classification


Abstract

Kernel methods are widely used for document classification across diverse domains. Popular kernels such as bag-of-words kernels and tree kernels give satisfactory results when classifying documents such as articles, e-mails, or web pages. However, they perform less well on short-text documents, because a short document yields too few features. To address this problem, this paper presents a novel kernel function, the semantic pattern tree kernel, for classifying short-text documents. The proposed kernel extends the feature space of each document by incorporating syntactic and semantic information through three levels of semantic annotation. Experiments on the Open Directory Project dataset show that semantic pattern tree kernels achieve higher accuracy than conventional kernels when classifying short-text documents.
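The feature-sparsity problem described in the abstract can be illustrated with a small sketch. It is not the semantic pattern tree kernel itself, whose definition is not given here. It only shows, under stated assumptions, how a plain bag-of-words kernel can return zero for two related short texts, and how adding semantic annotations as extra features can recover some similarity. The `SEMANTIC_LABELS` lexicon, the `SEM:` feature prefix, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon mapping words to a coarse semantic label.
# This is an illustrative assumption, not the annotation scheme of the paper.
SEMANTIC_LABELS = {
    "laptop": "DEVICE",
    "notebook": "DEVICE",
    "cheap": "PRICE",
    "affordable": "PRICE",
}


def bag_of_words(text):
    """Count lowercase, whitespace-separated tokens."""
    return Counter(text.lower().split())


def annotated_features(text):
    """Bag of words plus one extra feature per recognized semantic label."""
    features = bag_of_words(text)
    for token in text.lower().split():
        label = SEMANTIC_LABELS.get(token)
        if label is not None:
            features["SEM:" + label] += 1
    return features


def linear_kernel(a, b):
    """Dot product of two sparse count vectors."""
    return sum(count * b[key] for key, count in a.items() if key in b)


doc1 = "cheap laptop"
doc2 = "affordable notebook"

# The two short texts share no surface words, so the plain kernel is 0.
plain = linear_kernel(bag_of_words(doc1), bag_of_words(doc2))

# With the semantic features added, both texts share PRICE and DEVICE.
enriched = linear_kernel(annotated_features(doc1), annotated_features(doc2))

print(plain, enriched)
```

In this toy setting the two texts become comparable only through the added semantic features, which is the general idea of extending a short document's feature space. A tree kernel would additionally compare syntactic structure, which this sketch leaves out.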
