
Ranking Text Documents Based on Conceptual Difficulty Using Term Embedding and Sequential Discourse Cohesion


Abstract

We propose a novel framework for determining the conceptual difficulty of a domain-specific text document without using any external lexicon. Conceptual difficulty here refers to the reading difficulty of domain-specific documents. Previous approaches to the domain-specific readability problem have relied heavily on an external lexicon, which limits their scalability to other domains. Our model can be readily applied in domain-specific vertical search engines to re-rank documents according to their conceptual difficulty. We develop an unsupervised, principled approach for computing a term's conceptual difficulty in the latent space. Our approach also considers transitions between the segments generated in sequence. It outperforms the current state-of-the-art comparative methods.
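
The abstract gives no formulas, so the following is only an illustrative sketch of the general idea of re-ranking by a lexicon-free difficulty score, not the authors' model. Everything in it is an assumption made for illustration: the toy 2-D term vectors, the use of distance from the corpus centroid as a stand-in for a term's difficulty in the latent space, the use of one minus cosine similarity between consecutive segment centroids as a stand-in for a transition cost, and the function names `term_difficulty`, `document_difficulty`, and `rank_by_difficulty`.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def term_difficulty(term, emb, origin):
    # Assumption: terms far from the corpus centroid count as harder.
    # This proxy is hypothetical and is not taken from the paper.
    return math.dist(emb[term], origin)


def document_difficulty(segments, emb, origin):
    """Mean term difficulty plus mean transition cost between segments."""
    terms = [t for seg in segments for t in seg]
    term_part = sum(term_difficulty(t, emb, origin) for t in terms) / len(terms)
    # Assumption: a low cosine similarity between the centroids of
    # consecutive segments means a less cohesive, harder transition.
    cents = [centroid([emb[t] for t in seg]) for seg in segments]
    jumps = [1.0 - cosine(a, b) for a, b in zip(cents, cents[1:])]
    jump_part = sum(jumps) / len(jumps) if jumps else 0.0
    return term_part + jump_part


def rank_by_difficulty(docs, emb):
    """Return document names ordered from easiest to hardest."""
    origin = centroid(list(emb.values()))
    return sorted(docs, key=lambda name: document_difficulty(docs[name], emb, origin))


# Toy, hand-made embeddings and documents (hypothetical data).
emb = {
    "cell": [1.0, 0.0],
    "water": [0.9, 0.1],
    "mitosis": [0.0, 1.0],
    "kinase": [-1.0, 1.0],
}
docs = {
    "advanced": [["cell", "mitosis"], ["kinase", "mitosis"]],
    "intro": [["cell", "water"], ["water", "cell"]],
}
ranking = rank_by_difficulty(docs, emb)
print(ranking)
```

In a real setting the vectors would come from an embedding learned on the domain corpus, and both scoring terms would be replaced by whatever the full paper defines; the sketch only shows how a term-level score and a segment-transition score could be combined into a single ranking key.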
