Conference: International Workshop for Computational Linguistics of Uralic Languages

Extracting inflectional class assignment in Pite Saami: Nouns, verbs and those pesky adjectives

Abstract

The main goal of this paper is to describe to what extent the three main open word classes in Pite Saami (nouns, verbs and adjectives) can be automatically assigned to inflectional classes in language technology, specifically for a Finite State Transducer. For each of these word classes, the structural features needed to determine inflectional class membership are described. This reveals a clear difference between the behavior of nouns and verbs, on the one hand, and that of adjectives, on the other. The morphophonology of all three word classes, as seen in their paradigmatic behavior, is complex and features a number of types of stem alternations; nevertheless, inflectional class membership is predictable for nouns and verbs but not for adjectives. With this in mind, a basic algorithm for extracting inflectional class assignment for nouns and verbs is presented for use in a LEXC framework. Adjectives, in contrast, must be assigned to inflectional classes manually. The main TWOLC rules used to trigger morphophonological alternations are also outlined. The Pite Saami lexicographic database that forms the backbone for the LEXC stem files is managed using FileMaker Pro database software, and the workflow used to extract and update LEXC files from that database is described, focusing on the differences between nouns and verbs on the one hand and adjectives on the other. This sheds light on how nominal and verbal inflectional patterns are highly complex yet reliably systematic, while adjective morphophonology is complex and unpredictable.
