UNSTRUCTURED DATA CLUSTERING OF INFORMATION TECHNOLOGY SERVICE DELIVERY ACTIONS

Abstract

Systems, methods, and computer program products for clustering unstructured data are disclosed. A set of unstructured documents is tokenized to produce a plurality of tokens. The frequency at which terms appear in the plurality of tokens is analyzed to generate a vocabulary of terms. A vocabulary indices matrix, which relates to the set of unstructured documents, is generated based on the vocabulary of terms. A plurality of rows in the vocabulary indices matrix are matched to generate a plurality of clusters for the set of unstructured documents.
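The four steps named in the abstract (tokenize, build a vocabulary from term frequencies, build a vocabulary indices matrix, match rows) can be illustrated with a minimal Python sketch. The abstract does not specify any of the details, so everything below is an assumption made for illustration: the regular-expression tokenizer, the `min_count` frequency threshold, the representation of each matrix row as a sorted tuple of vocabulary indices, and the use of exact row equality as the matching rule. The function names and the sample service-delivery notes are hypothetical and do not come from the patent.

```python
import re
from collections import Counter, defaultdict


def tokenize(doc):
    """Lowercase a document and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", doc.lower())


def build_vocabulary(token_lists, min_count=2):
    """Keep terms seen at least min_count times overall (assumed threshold).

    Returns a dict mapping each kept term to its index in sorted order.
    """
    counts = Counter(tok for toks in token_lists for tok in toks)
    terms = sorted(term for term, n in counts.items() if n >= min_count)
    return {term: idx for idx, term in enumerate(terms)}


def vocabulary_indices_matrix(token_lists, vocab):
    """One row per document: the sorted vocabulary indices present in it.

    Rows are tuples of varying length here; a fixed-width padded array
    would be an equally plausible reading of "matrix".
    """
    return [
        tuple(sorted({vocab[tok] for tok in toks if tok in vocab}))
        for toks in token_lists
    ]


def cluster_by_matching_rows(rows):
    """Group document ids whose rows are identical (assumed matching rule)."""
    clusters = defaultdict(list)
    for doc_id, row in enumerate(rows):
        clusters[row].append(doc_id)
    return list(clusters.values())


def cluster_documents(docs, min_count=2):
    """Run the four steps end to end and return lists of document ids."""
    token_lists = [tokenize(doc) for doc in docs]
    vocab = build_vocabulary(token_lists, min_count)
    rows = vocabulary_indices_matrix(token_lists, vocab)
    return cluster_by_matching_rows(rows)


# Hypothetical service-delivery notes, invented for this example.
docs = [
    "Restarted web server on host A",
    "restarted web server on host B",
    "Cleared disk space on database volume 7",
    "cleared disk space on database volume 9",
]

print(cluster_documents(docs))  # [[0, 1], [2, 3]]
```

In this sketch the frequency threshold drops one-off tokens such as host names and volume numbers, so notes that describe the same action reduce to the same row and land in the same cluster. A looser matching rule, for example a similarity threshold between rows, would be another reasonable interpretation of "matched".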
