
Text information clustering method and text information clustering system


Abstract

One embodiment of the present application discloses a text information clustering method and system. The clustering method comprises: performing word segmentation on each of a plurality of pieces of text information; performing initial clustering on the segmented pieces of text information to form a plurality of first-level topics, each first-level topic including at least two pieces of text information; determining, according to a preset rule and based on the number of pieces of text information under each first-level topic, the number of second-level topics under that first-level topic; and performing, according to that number, a second clustering on the at least two pieces of text information contained in each first-level topic to form a plurality of second-level topics. In the initial clustering, the hierarchical clustering approach reduces the total number of first-level topics, which improves computational efficiency; in the second clustering, the number of second-level topics is determined dynamically from the number of pieces of text information, which speeds up the computation of the second-level topics.
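The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: whitespace splitting stands in for word segmentation, greedy average-link agglomerative clustering over bag-of-words cosine similarity stands in for whatever clustering model the application actually uses, and the preset rule (`second_level_count`, one second-level topic per two texts) is hypothetical. All function names are invented for this example, and the sketch does not enforce the requirement that each first-level topic contain at least two texts.

```python
import math
from collections import Counter


def segment(text):
    # Stand-in for word segmentation (assumption): lowercase, split on whitespace.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(indices, vectors, k):
    # Swapped-in technique: greedy average-link agglomerative clustering.
    # Start with one cluster per text and merge the most similar pair
    # of clusters until only k clusters remain.
    clusters = [[i] for i in indices]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[x] for j in clusters[y]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def second_level_count(n_texts, texts_per_topic=2):
    # Hypothetical preset rule: the number of second-level topics grows
    # with the number of texts under the first-level topic.
    return max(1, math.ceil(n_texts / texts_per_topic))


def two_level_cluster(texts, n_first_level):
    # Returns a list of first-level topics; each first-level topic is a
    # list of second-level topics; each second-level topic is a list of
    # indices into `texts`.
    vectors = [Counter(segment(t)) for t in texts]
    first_level = cluster(range(len(texts)), vectors, n_first_level)
    result = []
    for topic in first_level:
        k = second_level_count(len(topic))
        result.append(cluster(topic, vectors, k))
    return result
```

Calling `two_level_cluster(texts, n_first_level)` on a list of strings returns nested lists of text indices. Because the second-level count is derived from the size of each first-level topic, a large topic is split further than a small one, which is the dynamic behaviour the abstract describes.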
