Evaluating distributed word representations for capturing semantics of biomedical concepts

Abstract

Recently there has been a surge of interest in learning vector representations of words from huge corpora in an unsupervised manner. Such word vector representations, also known as word embeddings, have been shown to improve the performance of machine learning models in several NLP tasks. However, the efficiency of such representations has not been systematically evaluated in the biomedical domain. In this work, our aim is to compare the performance of two state-of-the-art word embedding methods, namely word2vec and GloVe, on the basic task of reflecting semantic similarity and relatedness of biomedical concepts. For this, vector representations of all unique words in a corpus of more than 1 million full-length research articles in the biomedical domain are obtained from the two methods. We observe that the parameters of these models do affect their ability to capture lexico-semantic properties, and that word2vec with a particular language modeling approach seems to perform better than the others.

Bibliographic details

Source: Workshop on biomedical natural language processing, 2015, 6 pages
Authors: Muneeb T H; Sunil Kumar Sahu; Ashish Anand
Original format: PDF
Chinese Library Classification: Computer software

Similar literature

1. Bag-of-concepts: Comprehending document representation through clustering words in distributed representation [J]. Kim Han Kyul, Kim Hyunjoong, Cho Sungzoon. Neurocomputing. 2017, Issue nova29

2. EEG decoding of spoken words in bilingual listeners: from words to language invariant semantic-conceptual representations [J]. João M. Correia, Bernadette Jansma, Lars Hausfeld. Frontiers in Psychology. 2015, Issue 4

3. Semantically Readable Distributed Representation Learning and Its Expandability Using a Word Semantic Vector Dictionary [J]. Ikuo KESHI, Yu SUZUKI, Koichiro YOSHINO. IEICE transactions on information and systems. 2018, Issue 4

4. Evaluating distributed word representations for capturing semantics of biomedical concepts [C]. Muneeb T H, Sunil Kumar Sahu, Ashish Anand. Workshop on biomedical natural language processing 2015. 2015

5. New semantic similarity techniques of concepts applied in the biomedical domain and WordNet [D]. Nguyen, Hoa A. 2006

6. EEG decoding of spoken words in bilingual listeners: from words to language invariant semantic-conceptual representations [O]. João M. Correia, Bernadette Jansma, Lars Hausfeld

7. EEG decoding of spoken words in bilingual listeners: from words to language invariant semantic-conceptual representations [O]. 2015
