Conference: Web-age information management

XML Document Classification Using Closed Frequent Subtree

Abstract

This paper introduces an efficient classification approach for XML documents that combines the content and the structure of the documents to compute the similarity between categories and documents. The approach is based on the Support Vector Machine (SVM) algorithm and the Structured Link Vector Model (SLVM), with closed frequent subtrees used as the structural units. A document tree pruning strategy is applied to improve the classification system, and the link information between documents is taken into account to obtain better classification results. Experiments on the INEX XML mining data sets that combine these techniques show that our approach performs better than any competing approach on XML classification.
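
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of pairing content with structure. It makes several assumptions that are not from the paper: the structural units are a hypothetical hand-written set of tag paths (`UNITS`) standing in for mined closed frequent subtrees, term weights are raw counts, and a cosine similarity to per-category centroids is swapped in for the SVM. Tree pruning and link information are left out entirely.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical structural units: plain tag paths used as a stand-in for
# closed frequent subtrees (which the paper mines; this sketch does not).
UNITS = {"/article/title", "/article/body/p"}


def unit_term_vector(xml_text, units=UNITS):
    """Map a document to sparse {(unit, term): count} features."""
    features = Counter()

    def walk(node, path):
        path = path + "/" + node.tag
        if path in units and node.text:
            for term in node.text.lower().split():
                features[(path, term)] += 1
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Sum the vectors of one category (assumption: no normalisation)."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return total


def classify(doc_vec, centroids):
    """Pick the category whose centroid is most similar to the document."""
    return max(centroids, key=lambda label: cosine(doc_vec, centroids[label]))


# Toy training data, invented for illustration only.
TRAIN = {
    "sports": ["<article><title>football match</title>"
               "<body><p>team wins match</p></body></article>"],
    "tech": ["<article><title>xml parser</title>"
             "<body><p>parser reads xml tree</p></body></article>"],
}
centroids = {label: centroid(unit_term_vector(d) for d in docs)
             for label, docs in TRAIN.items()}

query = unit_term_vector("<article><title>xml tree</title>"
                         "<body><p>tree parser</p></body></article>")
print(classify(query, centroids))
```

Because each feature is keyed by both the structural unit and the term, the same word in a title and in a paragraph counts as two different features, which is the sense in which content and structure are combined here.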