A Framework for Unsupervised Natural Language Morphology Induction


Abstract

Many natural language processing tasks, including parsing and machine translation, frequently require a morphological analysis of the language(s) at hand. The task of a morphological analyzer is to identify the lexeme, citation form, or inflection class of surface word forms in a language. To bypass the time-consuming, labor-intensive task of constructing a morphological analyzer by hand, unsupervised morphology induction techniques seek to automatically discover the morphological structure of a natural language through the analysis of corpora. This paper presents a framework for automatic natural language morphology induction inspired by the traditional and linguistic concept of inflection classes. Monson et al. (2004) use the framework discussed in this paper and present results obtained with an intuitive baseline search strategy. This paper discusses the candidate inflection class framework as a generalization of the corpus tries used in early work (Harris, 1955; Harris, 1967; Hafer and Weiss, 1974) and describes an as-yet-unimplemented, statistically motivated search strategy. It uses English to illustrate its main conjectures and draws concrete examples from a Spanish newswire corpus of 40,011 tokens and 6,975 types.
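The abstract refers to corpus tries from early work (Harris, 1955) without showing what they look like. The following is a minimal sketch of that general idea only, not the candidate inflection class framework of the paper: it records, for every prefix in a set of word types, which symbols can follow it, and proposes a stem/suffix split wherever many different symbols follow the same prefix. The function names, the `#` end-of-word marker, the threshold of 3, and the toy English word list are all assumptions made for illustration.

```python
from collections import defaultdict

END = "#"  # assumed end-of-word marker, chosen for this sketch


def successor_variety(types):
    """Map each prefix to the set of symbols that follow it in the word types.

    This is equivalent to reading the branching at each node of a trie
    built over the corpus vocabulary.
    """
    successors = defaultdict(set)
    for word in types:
        padded = word + END
        for i in range(len(padded)):
            successors[padded[:i]].add(padded[i])
    return successors


def candidate_splits(word, successors, threshold=3):
    """Return (stem, suffix) splits whose stem has >= threshold distinct successors."""
    return [
        (word[:i], word[i:])
        for i in range(1, len(word))
        if len(successors[word[:i]]) >= threshold
    ]


# Toy vocabulary (hypothetical, not taken from the paper's corpus).
toy_types = ["walk", "walks", "walked", "walking",
             "talk", "talks", "talked", "jump", "jumps"]
succ = successor_variety(toy_types)

print(sorted(succ["walk"]))             # symbols that can follow "walk"
print(candidate_splits("walked", succ))
print(candidate_splits("jumps", succ))
```

On this toy vocabulary the prefix `walk` is followed by four distinct symbols, so `walked` is split as `walk` + `ed`, while `jump` is followed by only two and yields no split at the assumed threshold. A fixed threshold like this is a deliberately crude stand-in for a real search strategy.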


Bibliographic record

Source: Association for Computational Linguistics Annual Meeting, 2004, 6 pages
Author: Christian Monson; Association for Computational Linguistics (ACL) (US)
Original format: PDF
Chinese Library Classification: Computer software

Similar literature

1. Unsupervised Learning of the Morphology of a Natural Language [J]. John Goldsmith. Computational Linguistics, 2001, No. 2.
2. Unsupervised learning of natural languages [J]. Solan Z, Horn D, Ruppin E. Proceedings of the National Academy of Sciences of the United States of America, 2005, No. 33.
3. Unsupervised grammar induction of clinical report sublanguage [J]. Rohit J Kate. Journal of Biomedical Semantics, 2012, No. S3.
4. A Framework for Unsupervised Natural Language Morphology Induction [C]. Christian Monson. Proceedings of the Student Research Workshop, Interactive Posters/Demonstrations, and Tutorial Abstracts, 2004.
5. ParaMor: From paradigm structure to natural language morphology induction [D]. Monson, Christian. 2008.
6. Unsupervised learning of natural languages [O]. Zach Solan, David Horn, Eytan Ruppin. 2005.
7. A Framework for Unsupervised Natural Language Morphology Induction [O]. 2008.
