Conference: International Conference on Advanced Computing and Applications

Method of Mapping Vietnamese Chunked Sentences to Definite Shallow Structures


Abstract

In many intelligent systems based on natural language processing, parsing is the first task to perform. In later stages, however, many systems can process only a limited number of parsed structures. The problem is to determine which parsed sentences a system can recognize. Deciding which syntactic structures a system can process is treated as the task of "classifying" a parsed sentence into one of a given set of classes of recognizable parses. In this paper we address this issue by proposing a method for mapping Vietnamese chunked sentences to a set of pre-defined shallow structures. We also tag the lexical items and chunk the phrases of the original sentences using our Functional Part-of-Speech (FPOS) tagset with Apache OpenNLP tools (Tokenizer, POS Tagger, Chunker). Building on Functional Grammar, we define new lexical tags and combine them with the Penn Treebank tagset to build our FPOS tagset. Because our set of shallow structures is finite, we propose a rule-based algorithm for the mapping process instead of using a parser. We establish conversion rules based on practical experience of how Vietnamese is used in everyday communication. The experiment shows that we successfully convert the majority of the test sentences and that the algorithm can be applied to other languages.
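The abstract gives no details of the rules or the structure inventory, so the following is only a minimal sketch of how a rule-based mapping from chunk sequences to a finite set of shallow structures might look. The structure names, the chunk labels, the two conversion rules, and the functions `apply_rules` and `map_to_structure` are all hypothetical illustrations, not the authors' FPOS tagset or algorithm.

```python
from typing import Iterable, Optional, Sequence, Tuple

# Hypothetical inventory of shallow structures: name -> chunk-label sequence.
SHALLOW_STRUCTURES = {
    "S-V": ("NP", "VP"),
    "S-V-O": ("NP", "VP", "NP"),
    "S-V-O-ADJUNCT": ("NP", "VP", "NP", "PP"),
}

# Hypothetical conversion rules: a run of adjacent labels is rewritten
# into a shorter run. Every rule shortens the sequence, so rewriting halts.
CONVERSION_RULES = [
    (("ADVP", "VP"), ("VP",)),  # fold a pre-verbal adverbial into the verb phrase
    (("NP", "NP"), ("NP",)),    # merge two adjacent noun phrases
]


def apply_rules(labels: Iterable[str]) -> Tuple[str, ...]:
    """Rewrite the label sequence until no conversion rule matches."""
    seq = list(labels)
    changed = True
    while changed:
        changed = False
        for pattern, replacement in CONVERSION_RULES:
            n = len(pattern)
            for i in range(len(seq) - n + 1):
                if tuple(seq[i:i + n]) == pattern:
                    seq[i:i + n] = replacement
                    changed = True
                    break
            if changed:
                break  # restart from the first rule after any rewrite
    return tuple(seq)


def map_to_structure(chunks: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Return the name of the matching shallow structure, or None."""
    labels = apply_rules(label for _, label in chunks)
    for name, pattern in SHALLOW_STRUCTURES.items():
        if labels == pattern:
            return name
    return None


# Illustrative chunked sentence; the chunk labels here are assumed, not FPOS tags.
sentence = [("Tôi", "NP"), ("đang", "ADVP"), ("đọc", "VP"), ("sách", "NP")]
result = map_to_structure(sentence)
```

Because the target set is finite, an exact match against a small table after rewriting is enough in this sketch; a sentence whose reduced label sequence matches no entry yields `None`, which corresponds to a sentence the downstream system cannot recognize.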
