MICAI 2010: Mexican International Conference on Artificial Intelligence

Lexicon Based Sentiment Analysis of Urdu Text Using SentiUnits

Abstract

As with other languages, Urdu websites are becoming more popular because people prefer to share opinions and express sentiments in their own language. Sentiment analyzers developed for well-studied languages such as English do not work for Urdu because of differences in script, morphology, and grammar. As a result, Urdu should be studied as an independent problem domain. Our approach to sentiment analysis is based on identifying and extracting SentiUnits from the given text using shallow parsing. SentiUnits are the expressions that carry the sentiment information in a sentence. We use a sentiment-annotated, lexicon-based approach. Unfortunately, no such lexicon exists for Urdu, so a major part of this research consists of developing one. This paper is therefore presented as a baseline for this large and complex task. Our goal is to highlight the linguistic (grammar and morphology) as well as the technical aspects of this multidimensional research problem. The performance of the system is evaluated on multiple texts, and the results achieved are quite satisfactory.
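
To make the idea of lexicon-based scoring over SentiUnits concrete, here is a minimal sketch in Python. It is not the system described in the paper. The lexicon entries (romanized Urdu words with hand-picked scores), the names `POLARITY`, `INTENSIFIERS`, `NEGATIONS`, `SentiUnit`, `extract_senti_units`, and `classify`, and the simple neighbor-window rule used in place of real shallow parsing are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Toy lexicon in romanized Urdu. Entries and scores are illustrative
# assumptions, not taken from the paper's lexicon.
POLARITY = {"acha": 1.0, "achi": 1.0, "bura": -1.0, "buri": -1.0}  # good / bad
INTENSIFIERS = {"bohat": 1.5}  # "very"
NEGATIONS = {"nahin"}          # "not"


@dataclass
class SentiUnit:
    words: List[str]  # tokens that make up the sentiment expression
    score: float      # signed polarity after modifiers are applied


def extract_senti_units(tokens: List[str]) -> List[SentiUnit]:
    """Find polar words and attach adjacent modifiers.

    A crude stand-in for shallow parsing: it only looks one token to the
    left for an intensifier and one token to the right for a negation
    (Urdu typically places the negation after the adjective).
    """
    units = []
    for i, tok in enumerate(tokens):
        if tok not in POLARITY:
            continue
        words = [tok]
        score = POLARITY[tok]
        if i > 0 and tokens[i - 1] in INTENSIFIERS:
            score *= INTENSIFIERS[tokens[i - 1]]
            words.insert(0, tokens[i - 1])
        if i + 1 < len(tokens) and tokens[i + 1] in NEGATIONS:
            score = -score
            words.append(tokens[i + 1])
        units.append(SentiUnit(words, score))
    return units


def classify(sentence: str) -> str:
    """Label a sentence by the sign of the summed SentiUnit scores."""
    total = sum(u.score for u in extract_senti_units(sentence.lower().split()))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify("yeh film bohat achi hai"))  # "this film is very good"
    print(classify("yeh film achi nahin hai"))  # "this film is not good"
```

The lexicon above lists inflected forms such as `acha` and `achi` as separate entries. A real system would need morphological handling and a far larger annotated lexicon, which the abstract identifies as the main missing resource for Urdu.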
