Building a large-scale testing dataset for conceptual semantic annotation of text

Xiao Wei; Daniel Dajun Zeng; Xiangfeng Luo; Wei Wu


Abstract

One major obstacle facing research on semantic annotation is the lack of large-scale testing datasets. In this paper, we develop a systematic approach to constructing such datasets. This approach is based on guided ontology auto-construction and annotation methods, which use little a priori domain knowledge and little user knowledge in documents. We demonstrate the efficacy of the proposed approach by developing a large-scale testing dataset using information available from MeSH and PubMed. The developed testing dataset consists of a large-scale ontology, a large-scale set of annotated documents, and the baselines to evaluate the target algorithm. It can be employed to evaluate both ontology construction algorithms and semantic annotation algorithms.
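
As an illustration of how a set of annotated documents can serve as an evaluation baseline, the following minimal sketch scores a semantic annotation algorithm against gold-standard concept annotations using micro-averaged precision, recall, and F1. This is not the evaluation procedure of the paper, whose evaluation parameters are not given in this record. The function name `micro_prf`, the dictionary layout, and the sample concept labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def micro_prf(
    gold: Dict[str, Set[str]],
    predicted: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over per-document concept sets.

    `gold` maps a document ID to its reference concept labels; `predicted`
    maps a document ID to the labels produced by the algorithm under test.
    Both layouts are assumptions made for this sketch. Documents that appear
    only in `predicted` are ignored.
    """
    tp = fp = fn = 0
    for doc_id, gold_concepts in gold.items():
        pred_concepts = predicted.get(doc_id, set())
        tp += len(gold_concepts & pred_concepts)  # correctly predicted concepts
        fp += len(pred_concepts - gold_concepts)  # predicted but not in gold
        fn += len(gold_concepts - pred_concepts)  # in gold but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical gold annotations and algorithm output for two documents.
    gold = {"doc1": {"Neoplasms", "Liver"}, "doc2": {"Insulin"}}
    predicted = {"doc1": {"Neoplasms", "Kidney"}, "doc2": {"Insulin"}}
    p, r, f = micro_prf(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Micro-averaging pools the counts over all documents, so documents with many annotations weigh more than sparsely annotated ones. A macro average over per-document scores would be a reasonable alternative if every document should count equally.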


Bibliographic details

Source
International Journal of Computational Science and Engineering | 2018, Issue 1 | 10 pages
Authors
Xiao Wei; Daniel Dajun Zeng; Xiangfeng Luo; Wei Wu
Author affiliations

Shanghai Institute of Technology Shanghai 201418 China;

State Key Laboratory of Management and Control for Complex Systems Institute of Automation Chinese Academy of Sciences Beijing 100190 China;

School of Computer Engineering and Science Shanghai University Shanghai 200444 China;

Shanghai Institute of Technology Shanghai 201418 China;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
semantic annotation; ontology concept learning; testing dataset; evaluation baseline; ontology auto-construction; priori knowledge; evaluation parameters; guided annotation method; MeSH; PubMed;


Similar literature

1. Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues [J]. Ernst Jason, Kellis Manolis, Nature Biotechnology. 2015, Issue 4
2. RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling [C]. Jun Quan, Shian Zhang, Qian Cao, Conference on Empirical Methods in Natural Language Processing. 2020
3. Characterization of dependencies as building blocks for semantic relations using the Mutuo semantic capture framework and conceptual graphs [D]. Cox, Lisa. 2008
4. Hybrid semantic recommender system for chemical compounds in large-scale datasets [O]. Marcia Barros, Andre Moitinho, Francisco M. Couto. 2021
5. Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task [O]. Tao Yu, Rui Zhang, Kai Yang, 2018
