Natural Language Processing with Few Computational Linguistic Resources: An Experiment with Automatic Sentence Parsing for Amharic Texts

Abstract

The amount of work required to start from scratch in developing all aspects of natural language processing for a new language is huge. At the same time, there is an urgent need for a variety of applications, including local-language spell-checkers, word processors, machine translation systems, and search engines. For these applications to be developed, computerized language resources and a well-developed framework for research in this area are essential. Treebanks, part-of-speech taggers, computerized grammars, lexica, and parsers are all necessary parts of this framework. The study reported in this article describes an attempt to design and implement a prototype of an automatic sentence parser for Amharic text. Amharic is the official government language of Ethiopia and a language for which very few computational linguistic resources exist. To parse sentences automatically, the study used the Inside-Outside algorithm with a bottom-up chart parsing strategy. A probabilistic context-free grammar (PCFG) was used as the grammatical formalism to represent the phrase structure rules of the language. A small sample corpus of 100 four-word sentences was selected from sentences in the language and used as a training and test set. In spite of the limited amount of data and other resources available, the experiments show some promising results.
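As a rough illustration of the technique named in the abstract, the sketch below computes inside probabilities for a PCFG with a bottom-up chart. It covers only the inside pass, not the outside pass or the re-estimation step of the full Inside-Outside algorithm. The grammar, its probabilities, the English example sentence, and the function names `inside_chart` and `sentence_probability` are all hypothetical; none of them come from the paper, whose Amharic grammar and corpus are not reproduced in this record.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (NOT the paper's grammar).
# Binary rules: (parent, left child, right child) -> probability.
BINARY_RULES = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "Det", "N"): 0.7,
    ("VP", "V", "NP"): 1.0,
}
# Lexical rules: (parent, word) -> probability.
LEXICAL_RULES = {
    ("NP", "dogs"): 0.3,
    ("Det", "the"): 0.6,
    ("Det", "a"): 0.4,
    ("N", "dog"): 0.5,
    ("N", "cat"): 0.5,
    ("V", "saw"): 1.0,
}


def inside_chart(words):
    """Fill a chart of inside probabilities, bottom-up.

    chart[(i, k, A)] is the probability that nonterminal A
    derives words[i:k].
    """
    n = len(words)
    chart = defaultdict(float)

    # Width-1 spans: apply lexical rules.
    for i, word in enumerate(words):
        for (parent, rule_word), prob in LEXICAL_RULES.items():
            if rule_word == word:
                chart[(i, i + 1, parent)] += prob

    # Wider spans: combine two adjacent, already-filled sub-spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            k = i + width
            for j in range(i + 1, k):  # split point
                for (parent, left, right), prob in BINARY_RULES.items():
                    p_left = chart.get((i, j, left), 0.0)
                    p_right = chart.get((j, k, right), 0.0)
                    if p_left and p_right:
                        chart[(i, k, parent)] += prob * p_left * p_right
    return chart


def sentence_probability(words, start="S"):
    """Inside probability of the start symbol over the whole sentence."""
    return inside_chart(words).get((0, len(words), start), 0.0)


if __name__ == "__main__":
    # A four-word sentence, mirroring the sentence length used in the study.
    print(sentence_probability("the dog saw dogs".split()))
```

In the full algorithm, these inside values would be combined with outside probabilities to obtain expected rule counts, which are then used to re-estimate the rule probabilities from the training sentences.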

Bibliographic details

Source: 7th World Multiconference on Systemics, Cybernetics and Informatics (SCI 2003), vol. 5: Computer Science and Engineering: I, 2003, pp. 51-56 (6 pages)
Conference location: Orlando, FL (US)
Authors: Atelach ALEMU; Lars ASKER
Author affiliation: Department of Information Science, Addis Ababa University, PO Box 1176, Addis Ababa, Ethiopia
Original format: PDF
Text language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: parsing; Amharic; PCFG; the Inside-Outside algorithm

Similar literature

1. Automatic Amharic Text Summarization using NLP Parser [J]. Getahun Tadesse Mekuria, Aniket S. Jagtap. International Journal of Engineering Trends and Technology. 2017, No. 1
2. Modern Statistical and Linguistic Approaches to Processing Texts in Natural Languages [J]. Aleksandr Evgenjevich Petrov, Dmitrii Aleksandrovich Sytnik. Journal of Theoretical and Applied Information Technology. 2016, No. 2
3. Neural network processing of natural language: II. Towards a unified model of corticostriatal function in learning sentence comprehension and non-linguistic sequencing [J]. Peter Ford Dominey, Toshio Inui, Michel Hoen. Brain and Language. 2009, No. 2-3
4. Natural Language Processing with Few Computational Linguistic Resources: An Experiment with Automatic Sentence Parsing for Amharic Texts [C]. Atelach ALEMU, Lars ASKER. 7th World Multiconference on Systemics, Cybernetics and Informatics (SCI 2003), vol. 5: Computer Science and Engineering: I. 2003
5. Any domain parsing: Automatic domain adaptation for natural language parsing [D]. McClosky, David. 2010
6. Natural Language Processing and Automatic SNOMED-Encoding of Free Text: An Analysis of Free Text Data from a Routine Electronic Patient Record Application with a Parsing Tool Using the German SNOMED II [O]. Joerg H. Hohnloser, Matthias Holzer, Martin R.G. Fischer. 1996
7. Parsing and multi-word expressions. Towards linguistic precision and computational efficiency in natural language processing (PARSEME) [O]. Zdravev Zoran, Ulanska Tatjana, Kocaleva Mirjana. 2013
