Journal: International Journal of Electrical and Computer Engineering

Towards optimize-ESA for text semantic similarity: A case study of biomedical text


Abstract

Explicit Semantic Analysis (ESA) is an approach for measuring the semantic relatedness between terms or documents based on their similarity to the documents of a reference corpus, usually Wikipedia. ESA has received considerable attention in natural language processing (NLP) and information retrieval. However, ESA relies on a huge Wikipedia index matrix in its interpretation step: it multiplies this large matrix by a term vector to produce a high-dimensional vector. Consequently, both the interpretation and similarity steps are computationally expensive, and ESA's efficiency suffers because much time is spent on unnecessary operations. This paper proposes an enhancement to ESA, called optimize-ESA, that reduces the dimension at the interpretation stage by computing semantic similarity within a specific domain. The experimental results show that our method correlates much better with human judgement than the full ESA approach.
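
The interpretation and similarity steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how optimize-ESA selects its reduced dimensions, so the sketch simply assumes that restricting the concept columns to a domain subset stands in for that reduction. The concept names, the term weights, and the functions `interpret`, `cosine`, and `esa_similarity` are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical toy index: each term maps to its weights over four concepts.
# In ESA these would be TF-IDF-style weights over Wikipedia articles.
CONCEPTS = ["Cardiology", "Oncology", "Football", "Cooking"]
DOMAIN_CONCEPTS = {"Cardiology", "Oncology"}  # assumed biomedical subset

TERM_WEIGHTS = {
    "heart":  [0.9, 0.1, 0.0, 0.2],
    "tumor":  [0.1, 0.9, 0.0, 0.0],
    "artery": [0.8, 0.2, 0.0, 0.0],
    "goal":   [0.0, 0.0, 0.9, 0.1],
}


def interpret(text, concept_idx):
    """Map a text to a concept vector by summing the weight rows of its terms.

    Only the columns listed in concept_idx are kept, so a shorter index list
    yields a lower-dimensional interpretation vector.
    """
    vec = [0.0] * len(concept_idx)
    for term in text.lower().split():
        row = TERM_WEIGHTS.get(term)
        if row is None:
            continue  # terms missing from the index contribute nothing
        for k, j in enumerate(concept_idx):
            vec[k] += row[j]
    return vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def esa_similarity(text_a, text_b, domain_only=False):
    """Compare two texts in concept space, optionally using domain concepts only."""
    idx = [j for j, name in enumerate(CONCEPTS)
           if not domain_only or name in DOMAIN_CONCEPTS]
    return cosine(interpret(text_a, idx), interpret(text_b, idx))


if __name__ == "__main__":
    print(esa_similarity("heart", "artery"))                    # all concepts
    print(esa_similarity("heart", "artery", domain_only=True))  # domain subset
```

With `domain_only=True` each interpretation vector has two entries instead of four, which is the kind of saving the abstract attributes to working within a specific domain. On a real Wikipedia-scale index the same restriction would drop a large share of the columns.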
