
A comparative study of stemming algorithms for use with the Uzbek language



Abstract

Stemming is one of the pipeline features of information retrieval and is commonly used in natural language processing and text mining. The main purpose of stemming is to reduce an inflected or derived word to its root form. The difficulty in developing a stemming algorithm lies in identifying and removing affixes, since each language has its own characteristics and grammatical rules. This paper presents a comparative study of existing stemmers for use with the Uzbek language. We discuss the types of stemming algorithms, give an overview of popular English stemmers, compare the stemmers discussed, and evaluate and analyze the available stemmers in an experiment on Uzbek. Based on the comparative study and the experiment, we propose our model of an Uzbek stemmer that enhances some of the features of the Lovins stemmer to suit the requirements of the Uzbek language.
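To make the idea of affix removal concrete, the following is a minimal sketch of longest-match suffix stripping, the general strategy associated with Lovins-style stemmers. It is not the model proposed in the paper: the list of endings, the `MIN_STEM_LENGTH` threshold, and the function name `stem` are all assumptions chosen for illustration, using a handful of common Uzbek plural and case endings.

```python
# Illustrative sketch only: a tiny, hand-picked list of Uzbek endings
# (plural -lar and a few case endings, alone and combined).
# This is NOT the rule set from the paper.
ENDINGS = [
    "larning", "lardan", "larni", "larga", "larda",
    "ning", "lar", "dan", "ni", "ga", "da",
]

# Try longer endings first so that "lardan" wins over "dan".
ENDINGS_BY_LENGTH = sorted(ENDINGS, key=len, reverse=True)

# Assumed guard against stripping a word down to nothing.
MIN_STEM_LENGTH = 2


def stem(word: str) -> str:
    """Remove the longest listed ending, if the remaining stem is long enough."""
    word = word.lower()
    for ending in ENDINGS_BY_LENGTH:
        stem_length = len(word) - len(ending)
        if word.endswith(ending) and stem_length >= MIN_STEM_LENGTH:
            return word[:stem_length]
    return word


if __name__ == "__main__":
    for w in ["kitoblarni", "maktabda", "bolalardan", "kitob"]:
        print(w, "->", stem(w))
```

A single pass over a list of combined endings keeps the sketch short; a practical stemmer for an agglutinative language such as Uzbek would need a far larger ending inventory and context conditions to avoid stripping endings that are actually part of the root.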
