Journal: Journal of Biomedical Informatics

Reflective Random Indexing and indirect inference: a scalable method for discovery of implicit connections.


Abstract

The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have been evaluated previously as methods for the discovery of such implicit connections. However, LSA in particular is dependent on a computationally demanding method of dimension reduction as a means to obtain meaningful indirect inference, limiting its ability to scale to large text corpora. In this paper, we evaluate the ability of Random Indexing (RI), a scalable distributional model of word associations, to draw meaningful implicit relationships between terms in general and biomedical language. Proponents of this method have achieved comparable performance to LSA on several cognitive tasks while using a simpler and less computationally demanding method of dimension reduction than LSA employs. In this paper, we demonstrate that the original implementation of RI is ineffective at inferring meaningful indirect connections, and evaluate Reflective Random Indexing (RRI), an iterative variant of the method that is better able to perform indirect inference. RRI is shown to lead to more clearly related indirect connections and to outperform existing RI implementations in the prediction of future direct co-occurrence in the MEDLINE corpus.
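The contrast the abstract draws between basic RI and its iterative variant can be illustrated with a toy sketch. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no algorithmic details, so the sparse ternary index vectors, the unweighted sums, the normalization step, the dimensionality, and all function names (`document_based_ri`, `reflect`, and so on) are hypothetical choices made for the example. The three-document corpus is invented: terms "A" and "C" never co-occur, but both co-occur with "B".

```python
import math
import random


def random_index_vector(dim, nonzeros, rng):
    """Sparse ternary vector: a few +1/-1 entries, zeros elsewhere (assumed scheme)."""
    vec = [0.0] * dim
    for i, pos in enumerate(rng.sample(range(dim), nonzeros)):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec


def normalize(vec):
    """Scale a vector to unit length; leave all-zero vectors unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def add_into(target, source):
    """Add source to target element-wise, in place."""
    for i, x in enumerate(source):
        target[i] += x


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def document_based_ri(docs, dim, nonzeros, seed=0):
    """Basic RI: a term vector is the sum of the random index vectors
    of the documents in which the term occurs."""
    rng = random.Random(seed)
    doc_index = [random_index_vector(dim, nonzeros, rng) for _ in docs]
    terms = {}
    for d, doc in enumerate(docs):
        for term in doc:
            add_into(terms.setdefault(term, [0.0] * dim), doc_index[d])
    return terms


def reflect(docs, term_vectors, dim):
    """One reflective cycle (assumed form): rebuild document vectors from the
    current term vectors, then rebuild term vectors from those document vectors."""
    doc_vectors = []
    for doc in docs:
        vec = [0.0] * dim
        for term in doc:
            add_into(vec, term_vectors[term])
        doc_vectors.append(normalize(vec))
    new_terms = {}
    for d, doc in enumerate(docs):
        for term in doc:
            add_into(new_terms.setdefault(term, [0.0] * dim), doc_vectors[d])
    return new_terms


# Invented toy corpus: "A" and "C" share no document but both occur with "B".
DIM, NONZEROS = 500, 10
docs = [["A", "B"], ["B", "C"], ["D", "E"]]

ri_terms = document_based_ri(docs, DIM, NONZEROS)
rri_terms = reflect(docs, ri_terms, DIM)

before = cosine(ri_terms["A"], ri_terms["C"])
after = cosine(rri_terms["A"], rri_terms["C"])
print(f"A~C similarity, basic RI:          {before:.3f}")
print(f"A~C similarity, after one reflect: {after:.3f}")
```

In this sketch the basic RI vectors for "A" and "C" are built from two different random index vectors, so their similarity stays near zero. After one reflective cycle each of the two vectors contains a share of the other's original index vector through the shared neighbor "B", so the indirect connection becomes visible. Term weighting, stop-word handling, and the number of cycles would all matter in a real corpus and are left out here.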
