Word Segmentation, Unknown-word Resolution, and Morphological Agreement in a Hebrew Parsing System

Yoav Goldberg; Michael Elhadad

Abstract

We present a constituency parsing system for Modern Hebrew. The system is based on the PCFG-LA parsing method of Petrov et al. (2006), which is extended in various ways in order to accommodate the specificities of Hebrew as a morphologically rich language with a small treebank. We show that parsing performance can be enhanced by utilizing a language resource external to the treebank, specifically, a lexicon-based morphological analyzer. We present a computational model of interfacing the external lexicon and a treebank-based parser, also in the common case where the lexicon and the treebank follow different annotation schemes. We show that Hebrew word-segmentation and constituency-parsing can be performed jointly using CKY lattice parsing. Performing the tasks jointly is effective, and substantially outperforms a pipeline-based model. We suggest modeling grammatical agreement in a constituency-based parser as a filter mechanism that is orthogonal to the grammar, and present a concrete implementation of the method. Although the constituency parser does not make many agreement mistakes to begin with, the filter mechanism is effective in fixing the agreement mistakes that the parser does make.

These contributions extend outside of the scope of Hebrew processing, and are of general applicability to the NLP community. Hebrew is a specific case of a morphologically rich language, and ideas presented in this work are useful also for processing other languages, including English. The lattice-based parsing methodology is useful in any case where the input is uncertain. Extending the lexical coverage of a treebank-derived parser using an external lexicon is relevant for any language with a small treebank.
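
The idea of treating agreement as a filter that sits outside the grammar can be pictured with a small sketch. This is not the authors' implementation: the abstract does not say where in the parser the filter is applied, so the version below assumes, purely for illustration, that it runs over a list of already-scored candidate parses. The names `Features`, `Candidate`, `agree`, and `filter_by_agreement` are hypothetical, as are the feature values and scores.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Features:
    """Hypothetical morphological features; None means 'unspecified'."""
    gender: Optional[str] = None
    number: Optional[str] = None


def agree(a: Features, b: Features) -> bool:
    """Two feature bundles agree unless they carry conflicting values."""
    for x, y in ((a.gender, b.gender), (a.number, b.number)):
        if x is not None and y is not None and x != y:
            return False
    return True


@dataclass
class Candidate:
    """A candidate parse: a score, a label, and the feature pairs that must agree."""
    score: float
    label: str
    agreement_pairs: List[Tuple[Features, Features]]


def filter_by_agreement(candidates: List[Candidate]) -> Candidate:
    """Return the best-scoring candidate without an agreement violation.

    Assumes a non-empty candidate list. If every candidate violates
    agreement, fall back to the overall best-scoring one.
    """
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    for cand in ranked:
        if all(agree(a, b) for a, b in cand.agreement_pairs):
            return cand
    return ranked[0]


# Illustrative data only: the top-scoring parse pairs a feminine subject
# with a masculine predicate, so the filter prefers the runner-up.
fem_sg = Features(gender="F", number="S")
masc_sg = Features(gender="M", number="S")
candidates = [
    Candidate(score=-10.0, label="parse_a", agreement_pairs=[(fem_sg, masc_sg)]),
    Candidate(score=-10.5, label="parse_b", agreement_pairs=[(fem_sg, fem_sg)]),
]
best = filter_by_agreement(candidates)
print(best.label)
```

The point of the design is that the grammar and its scores are left untouched; the agreement check only vetoes candidates, which matches the abstract's description of a mechanism orthogonal to the grammar.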


Bibliographic details

Source: Computational Linguistics, 2013, Issue 1, 40 pages
Authors: Yoav Goldberg; Michael Elhadad
Original format: PDF
Chinese Library Classification: Computing technology, computer technology

