System and method for providing robust topic identification in social indexes

Abstract

A computer-implemented method for providing robust topic identification in social indexes is described. Electronically-stored articles and one or more indexes are maintained. Each index includes topics that each relate to one or more of the articles. A random sampling and a selective sampling of the articles are both selected. For each topic, characteristic words included in the articles in each of the random sampling and the selective sampling are identified. Frequencies of occurrence of the characteristic words in each of the random sampling and the selective sampling are determined. A ratio of the frequencies of occurrence for the characteristic words included in the random sampling and the selective sampling is identified. Finally, for each topic, a coarse-grained topic model is built, which includes the characteristic words included in the articles relating to the topic and scores assigned to those characteristic words.
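
The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the tokenizer, the direction of the ratio, or the scoring formula, so everything below is an assumption made for illustration: the function names `word_frequencies` and `coarse_topic_model`, the regex tokenizer, the `top_n` and `smoothing` parameters, and the choice to score each word as its relative frequency in the selective (on-topic) sampling divided by its relative frequency in the random sampling.

```python
import re
from collections import Counter


def word_frequencies(articles):
    """Return the relative frequency of each lowercase word across the articles."""
    counts = Counter()
    for text in articles:
        # Hypothetical tokenizer: runs of ASCII letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def coarse_topic_model(selective_sample, random_sample, top_n=5, smoothing=1e-6):
    """Score words by how much more frequent they are in the topic's sample.

    Assumption: score = freq(selective) / (freq(random) + smoothing).
    The smoothing term only avoids division by zero for words that never
    appear in the random sampling.
    """
    topic_freq = word_frequencies(selective_sample)
    base_freq = word_frequencies(random_sample)
    scores = {
        word: freq / (base_freq.get(word, 0.0) + smoothing)
        for word, freq in topic_freq.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:top_n])


if __name__ == "__main__":
    random_sample = [
        "the cat sat on the mat",
        "the dog ate the bone",
        "stocks fell as the market closed",
    ]
    selective_sample = [
        "the housing market and mortgage rates",
        "mortgage rates rise as the housing market cools",
    ]
    for word, score in coarse_topic_model(selective_sample, random_sample, top_n=3).items():
        print(word, round(score, 2))
```

Under this assumed ratio, words that are common everywhere (such as "the") score low because they are just as frequent in the random sampling, while words concentrated in the topic's articles rise to the top of the model.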