Conference: International Conference on Computer and Information Technology

A Corpus-based evaluation of lexical components of a domain-specific text to Knowledge Mapping prototype



Abstract

The aim of this paper is to evaluate the lexical components of a Text to Knowledge Mapping (TKM) prototype. The prototype is domain-specific: its purpose is to map instructional text onto a knowledge domain. The knowledge domain of the prototype is physics, specifically DC electrical circuits. During development, the prototype was tested with a limited data set from the domain. It has now reached a stage where it needs to be evaluated with a representative linguistic data set, called a corpus. A corpus is a collection of text drawn from typical sources that can be used as a test data set to evaluate NLP systems. As no corpus is available for the domain, we developed a representative corpus and annotated it with linguistic information. The evaluation considers one of the prototype's two main components, the lexical knowledge base. Using the corpus, the evaluation enriches lexical knowledge resources such as the vocabulary and grammar structures. This enables the prototype to parse a reasonable number of the sentences in the corpus.
