Journal: Future Generation Computer Systems

On the impact of knowledge-based linguistic annotations in the quality of scientific embeddings

Abstract

In essence, embedding algorithms work by optimizing the distance between a word and its usual context in order to generate an embedding space that encodes the distributional representation of words. In addition to single words or word pieces, other features that result from the linguistic analysis of text, including lexical, grammatical and semantic information, can be used to improve the quality of embedding spaces. However, until now we have lacked a precise understanding of the impact that such individual annotations and their possible combinations may have on the quality of the embeddings. In this paper, we conduct a comprehensive study of the use of explicit linguistic annotations to generate embeddings from a scientific corpus and quantify their impact on the resulting representations. Our results show how the effect of such annotations on the embeddings varies depending on the evaluation task. In general, we observe that learning embeddings using linguistic annotations contributes to achieving better evaluation results.
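
To make the idea in the abstract concrete, the following minimal Python sketch shows one way linguistic annotations could change the units from which (word, context) training pairs are drawn. It is an illustration under stated assumptions, not the authors' method: the example sentence, its lemma and part-of-speech tags, the `lemma|POS` unit format, and the helper names `to_units` and `context_pairs` are all hypothetical.

```python
# Hypothetical pre-annotated sentence: (surface form, lemma, part-of-speech tag).
# These annotations are invented for illustration and are not taken from the paper.
SENTENCE = [
    ("embeddings", "embedding", "NOUN"),
    ("encode", "encode", "VERB"),
    ("distributional", "distributional", "ADJ"),
    ("representations", "representation", "NOUN"),
]

FEATURE_INDEX = {"surface": 0, "lemma": 1, "pos": 2}


def to_units(sentence, features):
    """Join the selected annotations of each token into one vocabulary unit."""
    return ["|".join(tok[FEATURE_INDEX[f]] for f in features) for tok in sentence]


def context_pairs(units, window=1):
    """Collect (target, context) pairs within a symmetric window, skip-gram style."""
    pairs = []
    for i, target in enumerate(units):
        lo, hi = max(0, i - window), min(len(units), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, units[j]))
    return pairs


# Plain surface forms versus units enriched with lemma and part of speech.
surface_units = to_units(SENTENCE, ["surface"])
annotated_units = to_units(SENTENCE, ["lemma", "pos"])

surface_pairs = context_pairs(surface_units)
annotated_pairs = context_pairs(annotated_units)

if __name__ == "__main__":
    print(surface_pairs[:2])
    print(annotated_pairs[:2])
```

An embedding algorithm trained on `annotated_pairs` instead of `surface_pairs` would learn one vector per annotated unit, which is one plausible way for lexical and grammatical information to enter the embedding space. Which combinations of annotations actually help is the empirical question the paper studies.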
