Conference: International Conference on Intelligent Text Processing and Computational Linguistics

A Computational Approach for Corpus Based Analysis of Reduplicated Words in Bengali

Abstract

Reduplication is an important phenomenon in language studies, especially in Indian languages. It is defined as the partial or complete repetition of a linguistic unit (a phoneme, morpheme, word, phrase, clause, or the utterance as a whole), and it conveys a different meaning at the syntactic as well as the semantic level. Reduplicated words play an important role in many natural language processing (NLP) applications, such as machine translation (MT), text summarization, and identification of multiword expressions. This article focuses on an algorithm for identifying reduplicated words in a text corpus and on computing descriptive statistics for the reduplicated words frequently used in Bengali.
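The abstract does not spell out the algorithm, so the following is only a minimal sketch of how corpus-based identification and counting of reduplicated words might look. It is not the authors' method. The function names (`tokenize`, `classify_pair`, `find_reduplications`, `summarize`), the whitespace tokenization, and the rule that treats two adjacent same-length words differing only in their first code point as partial (echo) reduplication are all assumptions made for illustration.

```python
from collections import Counter
import string

# Punctuation trimmed from token edges; includes the Bengali danda signs.
PUNCT = string.punctuation + "।॥"


def tokenize(text):
    """Split on whitespace and trim edge punctuation (a simplification).

    Hyphens are treated as separators so hyphenated pairs become two tokens.
    """
    tokens = (tok.strip(PUNCT) for tok in text.replace("-", " ").split())
    return [tok for tok in tokens if tok]


def classify_pair(first, second):
    """Return 'complete', 'partial', or None for two adjacent tokens."""
    if first == second:
        return "complete"
    # Assumed heuristic for echo words: equal length, same tail,
    # different first code point. It can produce false positives.
    if len(first) == len(second) > 1 and first[1:] == second[1:]:
        return "partial"
    return None


def find_reduplications(text):
    """Count adjacent token pairs that look like reduplications."""
    tokens = tokenize(text)
    counts = Counter()
    i = 0
    while i < len(tokens) - 1:
        kind = classify_pair(tokens[i], tokens[i + 1])
        if kind:
            counts[(tokens[i], tokens[i + 1], kind)] += 1
            i += 2  # skip past the matched pair
        else:
            i += 1
    return counts


def summarize(counts):
    """Compute simple descriptive statistics over the pair counts."""
    by_kind = Counter()
    for (_, _, kind), n in counts.items():
        by_kind[kind] += n
    return {
        "total_occurrences": sum(counts.values()),
        "distinct_pairs": len(counts),
        "by_kind": dict(by_kind),
        "most_common": counts.most_common(3),
    }


if __name__ == "__main__":
    sample = "সে ধীরে ধীরে হাঁটে। জল টল খাও। ধীরে ধীরে বলো।"
    print(summarize(find_reduplications(sample)))
```

Because edge punctuation is stripped before pairing, this sketch can match words across a sentence boundary. A real pipeline would likely segment sentences first and use a proper Bengali tokenizer. Comparing code points rather than phonemes or syllables is also a rough stand-in for a linguistically informed test of partial reduplication.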