ANALYTICS BASED ON SCALABLE HIERARCHICAL CATEGORIZATION OF WEB CONTENT

Abstract

Various methods and systems for performing analytics based on hierarchical categorization of content are provided. Analytics can be performed using an index building workflow and a classification workflow. In the index building workflow, documents are received and analyzed to extract features from them. Hierarchical category paths can be identified for the features. The documents are indexed so that they can be searched for the hierarchical category paths. In the classification workflow, a query that includes or references content may be received and analyzed to extract features from the content. The features are executed against a search engine that returns search result documents associated with hierarchical category paths. The hierarchical category paths from the search result documents may be used to generate a topic model of the content associated with the query. The topic model, which is used for web analytics, includes scores for the hierarchical category paths and for enumerated category topics.
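The two workflows in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the names `extract_features`, `CategoryIndex`, and `classify` are hypothetical, features are plain lowercase word tokens, each document arrives already labeled with a single slash-separated category path, the "search engine" is an in-memory inverted index ranked by feature overlap, and a path's score is its share of the total overlap among the returned documents. Scoring of enumerated category topics is not modeled.

```python
import re
from collections import Counter, defaultdict


def extract_features(text):
    """Toy feature extractor (assumption): lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class CategoryIndex:
    """Hypothetical in-memory stand-in for the index and search engine."""

    def __init__(self):
        self.postings = defaultdict(set)  # feature -> ids of documents containing it
        self.paths = {}                   # document id -> hierarchical category path

    def add(self, doc_id, text, category_path):
        """Index building workflow: extract features and record the path."""
        self.paths[doc_id] = category_path
        for feature in extract_features(text):
            self.postings[feature].add(doc_id)

    def search(self, features, top_k=3):
        """Return up to top_k (doc_id, overlap) pairs, highest overlap first."""
        hits = Counter()
        for feature in set(features):
            for doc_id in self.postings.get(feature, ()):
                hits[doc_id] += 1
        return hits.most_common(top_k)


def classify(index, content, top_k=3):
    """Classification workflow: score category paths from the search results."""
    results = index.search(extract_features(content), top_k)
    total = sum(overlap for _, overlap in results)
    if total == 0:
        return {}
    path_scores = Counter()
    for doc_id, overlap in results:
        parts = index.paths[doc_id].split("/")
        # Credit the full path and each of its ancestors in the hierarchy.
        for depth in range(1, len(parts) + 1):
            path_scores["/".join(parts[:depth])] += overlap
    return {path: score / total for path, score in path_scores.items()}


index = CategoryIndex()
index.add("d1", "football match goal league", "Sports/Football")
index.add("d2", "tennis match serve racket", "Sports/Tennis")
index.add("d3", "stock market shares trading", "Finance/Markets")

model = classify(index, "The league match ended with a late goal")
```

Crediting every ancestor of a matched path lets a coarse category such as `Sports` accumulate evidence from several finer-grained paths, which is one simple way to obtain scores at each level of the hierarchy.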
