Affix-augmented stem-based language model for Persian


Abstract

Language modeling is used in many NLP applications, such as machine translation, POS tagging, speech recognition, and information retrieval. A language model assigns a probability to a sequence of words. This task becomes challenging for highly inflectional languages. In this paper we investigate standard statistical language models for Persian, an inflectional language. We propose two variations of morphological language models that rely on a morphological analyzer to manipulate the dataset before modeling. We then discuss the shortcomings of these models and introduce a novel approach that exploits the structure of the language and produces more accurate results. Experimental results are encouraging, especially when n-gram models are used with a small training dataset.


Bibliographic record

Source
International Conference on Natural Language Processing and Knowledge Engineering | 2010 | pp. 1-4 | 4 pages
Authors
Faili Heshaam; Ravanbakhsh Hadi
Original format
PDF
Chinese Library Classification
Information processing
Keywords
Persian; Tracking; language model; morphological; n-gram

Date indexed: 2022-08-26 14:43:29

Similar literature


1. Building Statistical Language Models for Persian Continuous Speech Recognition Systems Using the Peykare Corpus [J]. Mohammad Bahrani, Hossein Sameti. International Journal of Computer Processing of Languages. 2011, Issue 1
2. Different kinds of embodied language: A comparison between Italian and Persian languages [J]. Ghandhari Mina, Fini Chiara, Da Rold Federico. Brain and Cognition. 2020, Issue Jula
3. Persian Language Dominance and the Loss of Minority Languages in Iran [J]. Hossein Ghanbari, Mahdi Rahimian. Open Journal of Social Sciences. 2020, Issue 11
4. Affix-Augmented Stem-Based Language Model for Persian [C]. Heshaam FAILI, Hadi RAVANBAKHSH. Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering. 2010
5. Occasioned Storytelling in Persian Language Classrooms [D]. Monfaredi, Elham. 2019
6. Evaluation of Central Auditory Processing of Azeri-Persian Bilinguals Using Dichotic Listening Tasks in First and Second Languages [O]. Jamileh FATTAHI, Ali Akbar TAHAEI, Hassan ASHAYERI. 2019
7. Stem-based PoS tagging for agglutinative languages [O]. Necva Bolucu, Burcu Can. 2017
