Taxonomy generation for document collections

Abstract

The invention relates to information mining across a multitude of documents stored on computer systems. More particularly, it relates to a computerized method of generating a content taxonomy for a multitude of electronic documents. The proposed technique improves the scalability, coherence, and selectivity of taxonomy generation at the same time. The basic approach comprises three steps. In a subset selection step, a subset of the documents is selected. In a taxonomy generation step, a taxonomy is generated for the selected subset, the taxonomy being a tree-structured hierarchy. In a routing selection step, each unprocessed document is assigned to the taxonomy hierarchy based on largest similarity.
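
The three steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the subset is chosen, how the tree is built, or which similarity measure is used, so everything concrete here is an assumption made for illustration: random sampling for subset selection, recursive bisection for tree building, bag-of-words cosine similarity, and top-down descent for routing. All function and class names (`vectorize`, `build_taxonomy`, `route`, `generate_taxonomy`, `Node`) are hypothetical and do not come from the patent.

```python
import math
import random
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed document representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Summed term counts of a group of documents."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


class Node:
    """One node of the tree-structured taxonomy."""

    def __init__(self, doc_ids, vectors):
        self.doc_ids = list(doc_ids)
        self.centroid = centroid(vectors[i] for i in doc_ids)
        self.children = []


def build_taxonomy(doc_ids, vectors, min_size=2):
    """Taxonomy generation step (assumed here: recursive bisection)."""
    node = Node(doc_ids, vectors)
    if len(doc_ids) <= max(min_size, 1):
        return node
    # Seeds: the first document and the document least similar to it.
    seed_a = doc_ids[0]
    seed_b = min(doc_ids[1:], key=lambda i: cosine(vectors[seed_a], vectors[i]))
    left, right = [], []
    for i in doc_ids:
        sim_a = cosine(vectors[i], vectors[seed_a])
        sim_b = cosine(vectors[i], vectors[seed_b])
        (left if sim_a >= sim_b else right).append(i)
    if left and right:  # split only when both sides are non-empty
        node.children = [
            build_taxonomy(left, vectors, min_size),
            build_taxonomy(right, vectors, min_size),
        ]
    return node


def route(node, vec):
    """Routing step: descend to the leaf whose branch is most similar."""
    while node.children:
        node = max(node.children, key=lambda c: cosine(vec, c.centroid))
    return node


def leaves(node):
    """Collect all leaf nodes of the taxonomy."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def generate_taxonomy(docs, sample_size, seed=0):
    vectors = [vectorize(d) for d in docs]
    # Subset selection step (assumed here: uniform random sampling).
    rng = random.Random(seed)
    subset = sorted(rng.sample(range(len(docs)), sample_size))
    root = build_taxonomy(subset, vectors)
    # Route every document that was not part of the subset.
    chosen = set(subset)
    for i in range(len(docs)):
        if i not in chosen:
            route(root, vectors[i]).doc_ids.append(i)
    return root
```

In this sketch only the sampled subset goes through the comparatively expensive tree-building step, while each remaining document needs one similarity comparison per tree level. That is one plausible reading of how such a scheme could help scalability; the abstract itself does not spell out the mechanism.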
