Journal: ACM SIGIR FORUM

Text Classification with Kernels on the Multinomial Manifold



Abstract

Support Vector Machines (SVMs) have been very successful in text classification. However, the standard kernels commonly used in SVMs ignore the intrinsic geometric structure of text data. It is natural to assume that documents lie on the multinomial manifold, that is, the simplex of multinomial models equipped with the Riemannian structure induced by the Fisher information metric. We prove that the Negative Geodesic Distance (NGD) on the multinomial manifold is conditionally positive definite (cpd) and can therefore be used as a kernel in SVMs. Experiments show that the NGD kernel on the multinomial manifold is effective for text classification, significantly outperforming standard kernels on the ambient Euclidean space.
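
The kernel described in the abstract can be illustrated with a small sketch. It relies on the standard closed form for the geodesic distance between two points p and q on the multinomial simplex under the Fisher information metric, d(p, q) = 2 arccos(sum_i sqrt(p_i q_i)). The abstract does not specify how documents are mapped to the simplex, so the L1-normalised term-frequency embedding here is an assumption, and the function names (`to_simplex`, `geodesic_distance`, `ngd_kernel`, `gram_matrix`) are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def to_simplex(tokens, vocabulary):
    """Embed a tokenised document as a point on the simplex.

    Assumption: L1-normalised term frequencies over a fixed vocabulary list.
    """
    vocab_set = set(vocabulary)
    counts = Counter(t for t in tokens if t in vocab_set)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("document has no in-vocabulary tokens")
    return [counts[w] / total for w in vocabulary]


def geodesic_distance(p, q):
    """Fisher-metric geodesic distance between two points on the simplex."""
    # Bhattacharyya coefficient: inner product of the square-root vectors.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp to [0, 1] so rounding error cannot push acos out of its domain.
    bc = min(1.0, max(0.0, bc))
    return 2.0 * math.acos(bc)


def ngd_kernel(p, q):
    """Negative Geodesic Distance: the cpd kernel named in the abstract."""
    return -geodesic_distance(p, q)


def gram_matrix(points):
    """Pairwise NGD kernel values for a list of simplex points."""
    return [[ngd_kernel(p, q) for q in points] for p in points]


if __name__ == "__main__":
    vocab = ["kernel", "svm", "text"]
    docs = [
        ["kernel", "svm", "kernel"],
        ["text", "text", "svm"],
    ]
    points = [to_simplex(d, vocab) for d in docs]
    for row in gram_matrix(points):
        print(row)
```

A Gram matrix built this way could, in principle, be handed to any SVM solver that accepts a precomputed kernel matrix. A conditionally positive definite kernel is admissible there because the SVM dual already constrains the coefficients to satisfy sum_i alpha_i y_i = 0, which is exactly the condition under which cpd kernels yield a non-negative quadratic form.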
