Journal: IEEE Transactions on Systems, Man, and Cybernetics. Part A, Systems and Humans

A Clustering-Based Approach for Integrating Document-Category Hierarchies



Abstract

E-commerce applications generate and consume a tremendous amount of online information, which is typically available as textual documents. Conceivably, organizations and individuals generally use category sets or hierarchies to organize, archive, and access their documents. Meanwhile, organizations and individuals constantly acquire relevant documents from various Internet sources, each of which may organize its documents in a category set or hierarchy different from that used by the acquiring organization or individual. Consequently, the integration of source documents organized in a category hierarchy into an existing category hierarchy deployed by the acquiring organization or individual becomes an important issue in the e-commerce era. Existing category-integration techniques are mainly designed to integrate document catalogs, each of which is organized nonhierarchically (i.e., in a flat set). In this paper, we propose a clustering-based category-hierarchy integration (CHI) technique, which is an extension of the clustering-based category-integration (CCI) technique. Our empirical evaluation results show that the proposed CHI technique appears to improve the effectiveness of category-hierarchy integration compared with that attained by nonhierarchical category-integration techniques, particularly in homogeneous and comparable scenarios.
