Journal: International Journal of Computer Trends and Technology

An Efficient Classification Approach for the XML Documents


Abstract

Extensible Markup Language (XML) is widely used as a standard format for data representation over the internet. An XML document usually organizes a set of textual data according to a predefined logical structure. Because of this inherent structure, conventional text classification methods cannot be applied to XML documents directly. In this paper, we address learning with XML documents from three major research areas. First, knowledge representation: we use a method based on a typed higher-order logic formalism, where the main focus is how to represent an XML document using higher-order logic terms that capture both its content and its structure. Second, symbolic machine learning: we propose a new decision-tree learning algorithm driven by the precision/recall breakeven point (PRDT) for the XML document classification problem. The precision/recall heuristic is considered for XML document classification because XML documents have strong connections with text documents. Finally, semi-supervised learning: we present a semi-supervised learning algorithm based on the PRDT algorithm and the co-training framework. Preliminary results show that our framework produces comprehensible theories and attains good performance with both machine learning techniques.
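The abstract does not say how PRDT computes or applies the precision/recall breakeven point, so the following is only a minimal sketch of the general metric, not the authors' algorithm. It assumes a classifier (or a candidate tree test) gives each document a score, and it relies on the fact that precision and recall are equal when the cutoff is set to the number of positive documents. The function name `breakeven_point` and the sample data are hypothetical.

```python
from typing import Sequence


def breakeven_point(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the precision/recall breakeven point of a ranking.

    Assumption (not from the paper): documents are ranked by score and the
    top k are treated as predicted positives, where k is the number of true
    positives in the data. At that cutoff precision = TP / k and
    recall = TP / k, so the two values coincide.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0  # no positive documents, so the measure is undefined; use 0
    # Rank documents from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    # Count true positives among the top n_pos documents.
    true_pos = sum(label for _, label in ranked[:n_pos])
    return true_pos / n_pos


# Hypothetical scores for five documents; label 1 marks the target class.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
example_labels = [1, 0, 1, 1, 0]
bep = breakeven_point(example_scores, example_labels)
```

A decision-tree learner could in principle use such a value to compare candidate splits, but how PRDT actually does so is not described in this abstract.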
