
Informal Lightweight Knowledge Extraction from Documents



Abstract

In this paper, we propose a method to automatically extract informal knowledge from a collection of documents. The method rests on an informal knowledge representation consisting of concepts (lexically indicated by words) and the links between them. We show that the links can be inferred from documents using a probabilistic topic model, while the overall parameter optimisation, driven by a suitable score function, can be carried out with the Random Mutation Hill-Climbing algorithm. Experimental findings show that the method is effective and that, as a side effect, the score function can serve as a criterion for computing the homogeneity between documents, which can be considered a prelude to a classification procedure.
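
To make the optimisation step concrete, the following is a minimal sketch of generic Random Mutation Hill-Climbing. It is not the paper's implementation: the abstract does not specify the score function or the parameters, so `toy_score`, `flip_one_bit`, and the bit-string encoding are hypothetical stand-ins chosen only to keep the example runnable.

```python
import random


def rmhc(score, initial, mutate, iterations=1000, seed=0):
    """Generic Random Mutation Hill-Climbing (maximisation).

    score   -- function mapping a candidate to a number (higher is better)
    initial -- starting candidate (a list)
    mutate  -- function (candidate, rng) -> new candidate with one random change
    """
    rng = random.Random(seed)
    best = list(initial)
    best_score = score(best)
    for _ in range(iterations):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        # Keep the mutant only if it is at least as good as the current best.
        if candidate_score >= best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def flip_one_bit(bits, rng):
    """Hypothetical mutation operator: flip one randomly chosen bit."""
    i = rng.randrange(len(bits))
    mutated = list(bits)
    mutated[i] = 1 - mutated[i]
    return mutated


def toy_score(bits):
    """Hypothetical score (count of ones); stands in for the paper's score function."""
    return sum(bits)


best, best_score = rmhc(toy_score, [0] * 16, flip_one_bit, iterations=2000)
print(best, best_score)
```

In the setting the abstract describes, the candidate would presumably encode the method's parameters and the score would be the paper's own score function; both are left abstract here because the abstract gives no further detail.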
