Polish evaluation dataset for compositional distributional semantics models


Abstract

The paper presents a procedure for building an evaluation dataset for the validation of compositional distributional semantics models estimated for languages other than English. Because we aim at building a comparable dataset, the procedure generally follows the steps designed to assemble the SICK corpus, which contains pairs of English sentences annotated for semantic relatedness and entailment. However, the implementation of particular building steps differs significantly from the original SICK design assumptions, owing both to the lack of necessary external resources for the investigated language and to the need for language-specific transformation rules. The designed procedure is verified on Polish, a fusional language with a relatively free word order, and contributes to building a Polish evaluation dataset. The resource consists of 10K sentence pairs which are human-annotated for semantic relatedness and entailment. The dataset may be used for the evaluation of compositional distributional semantics models of Polish.
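As an illustration of how a resource of this kind might be used, the following minimal sketch scores a toy compositional model against gold relatedness annotations. Everything in it is an assumption made for illustration: the `SentencePair` field names, the numeric relatedness values, the entailment label strings, the toy word vectors, and the example sentences are hypothetical and are not taken from the dataset. Additive composition with cosine similarity is a simple stand-in baseline, not a model from the paper, and Pearson correlation is used because it is a customary measure for SICK-style relatedness evaluation.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class SentencePair:
    """One annotated pair; field names and label values are hypothetical."""
    sentence_a: str
    sentence_b: str
    relatedness: float  # gold relatedness score (scale assumed)
    entailment: str     # gold entailment label (label set assumed)


def compose(sentence, vectors):
    """Additive composition: sum the vectors of all known words."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for word in sentence.lower().split():
        vec = vectors.get(word)
        if vec is None:  # out-of-vocabulary words are skipped
            continue
        for i, value in enumerate(vec):
            total[i] += value
    return total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def evaluate_relatedness(pairs, vectors):
    """Correlate predicted sentence similarity with gold relatedness."""
    predicted = [
        cosine(compose(p.sentence_a, vectors), compose(p.sentence_b, vectors))
        for p in pairs
    ]
    gold = [p.relatedness for p in pairs]
    return pearson(predicted, gold)


# Toy two-dimensional vectors and pairs, invented for this sketch.
TOY_VECTORS = {
    "kot": [1.0, 0.0],
    "pies": [0.8, 0.2],
    "śpi": [0.0, 1.0],
    "biegnie": [0.2, 0.9],
    "samochód": [-1.0, 0.1],
    "stoi": [0.1, -0.5],
}
TOY_PAIRS = [
    SentencePair("Kot śpi", "Pies śpi", 4.0, "NEUTRAL"),
    SentencePair("Kot śpi", "Pies biegnie", 3.0, "NEUTRAL"),
    SentencePair("Kot śpi", "Samochód stoi", 1.0, "NEUTRAL"),
]

if __name__ == "__main__":
    print(evaluate_relatedness(TOY_PAIRS, TOY_VECTORS))
```

A real evaluation would replace the toy vectors with embeddings estimated for Polish and would additionally report classification accuracy on the entailment labels.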


Bibliographic details

Source: Annual meeting of the Association for Computational Linguistics | 2017 | lx p. 718-1425 | 9 pages
Authors: Alina Wroblewska; Katarzyna Krasnowska-Kieras
Original format: PDF
Chinese Library Classification: Programming, software engineering

