Large scale unsupervised hierarchical document categorization using ontological guidance

Abstract

A classification method includes constructing queries from category descriptors representing categories of a taxonomy of hierarchically organized categories. The query constructed for a category c includes a query component based on descriptors of the category c and at least one query component based on descriptors of an ancestor or descendant category of the category c. A documents database is queried using the constructed queries to retrieve pseudo-relevant documents. Language models for the categories of the taxonomy are extracted from the pseudo-relevant documents by inferring a hierarchical topic model representing the taxonomy. An input document is classified by optimizing mixture weights of a weighted combination of categories of the hierarchical topic model respective to the input document.
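
The two ends of the pipeline described above, query construction and mixture-weight classification, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the patent: the toy `TAXONOMY`, the hand-written `LANGUAGE_MODELS`, the `ancestor_weight` discount, and the function names `build_query` and `fit_mixture_weights`. The sketch uses ancestor descriptors only (the abstract also allows descendants), skips the retrieval and hierarchical topic model inference steps entirely, and uses plain expectation-maximization as one possible way to optimize the mixture weights.

```python
from collections import Counter

# Hypothetical toy taxonomy: category -> (parent, descriptor terms).
TAXONOMY = {
    "science": (None, ["science"]),
    "physics": ("science", ["physics", "quantum"]),
    "biology": ("science", ["biology", "cell"]),
}

# Hypothetical per-category language models P(word | category).
# In the described method these would be inferred from pseudo-relevant
# documents; here they are written by hand to keep the sketch short.
LANGUAGE_MODELS = {
    "science": {"science": 0.4, "research": 0.4, "quantum": 0.1, "cell": 0.1},
    "physics": {"quantum": 0.5, "particle": 0.4, "science": 0.1},
    "biology": {"cell": 0.5, "gene": 0.4, "science": 0.1},
}


def build_query(category, taxonomy, ancestor_weight=0.5):
    """Build a weighted query for a category.

    The category's own descriptors get weight 1.0; descriptors of each
    ancestor get a weight that shrinks by `ancestor_weight` per level
    (an assumed discounting scheme).
    """
    parent, descriptors = taxonomy[category]
    query = {term: 1.0 for term in descriptors}
    weight = ancestor_weight
    while parent is not None:
        grandparent, parent_descriptors = taxonomy[parent]
        for term in parent_descriptors:
            query[term] = max(query.get(term, 0.0), weight)
        parent = grandparent
        weight *= ancestor_weight
    return query


def fit_mixture_weights(doc_tokens, language_models, iterations=50, floor=1e-9):
    """Estimate mixture weights over fixed category language models with EM.

    The document is modeled as sum_c weight[c] * P(word | c); words missing
    from a category's model get the small probability `floor`.
    """
    categories = list(language_models)
    weights = {c: 1.0 / len(categories) for c in categories}
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    for _ in range(iterations):
        expected = {c: 0.0 for c in categories}
        for word, n in counts.items():
            # E-step: share of this word attributed to each category.
            joint = {
                c: weights[c] * language_models[c].get(word, floor)
                for c in categories
            }
            norm = sum(joint.values())
            for c in categories:
                expected[c] += n * joint[c] / norm
        # M-step: renormalize expected counts into mixture weights.
        weights = {c: expected[c] / total for c in categories}
    return weights


if __name__ == "__main__":
    print(build_query("physics", TAXONOMY))
    doc = ["quantum", "particle", "particle", "research"]
    weights = fit_mixture_weights(doc, LANGUAGE_MODELS)
    print(max(weights, key=weights.get), weights)
```

In this sketch the input document is assigned to the category with the largest mixture weight; keeping the full weight vector instead would give a soft assignment across several levels of the taxonomy.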
