Journal: International Journal of Computers & Applications

Constructing automatic domain-specific sentiment lexicon using KNN search via terms discrimination vectors



Abstract

Textual content on the Web is a viable source of knowledge for decision-makers, and so are text analytics applications. Sentiment analysis (SA) is a text mining field in which text is analyzed to recognize the opinion implied by its writer. This paper presents a new approach for automatically constructing an Arabic sentiment lexicon. The popular KNN search algorithm is used for this purpose, with the cosine distance between seed terms and corpus terms employed in the KNN search query. Lexicon terms are generated starting from sentiment seeds, and the seed terms are augmented by an Arabic-specific NLP-based algorithm, which helps improve the seed selection process. The term discrimination vector (TDV) is the main input to the KNN query. TDV components are computed for each corpus term, and the vector is composed of four term-weighting techniques. According to the experimental results, TDV achieved better results than the traditional TF-IDF method at a lower computation cost. The constructed lexicons also outperformed premade lexicons in accuracy.
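The seed-expansion step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the abstract does not name the four term-weighting techniques, so the four-component vectors are made-up placeholder values, the English tokens stand in for Arabic corpus terms, and the majority-vote labelling rule and the function names `cosine_distance`, `knn_terms`, and `build_lexicon` are hypothetical.

```python
import math
from collections import Counter


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def knn_terms(query_vec, term_vectors, k, exclude=()):
    """Return the k terms whose vectors are closest to query_vec."""
    scored = [
        (cosine_distance(query_vec, vec), term)
        for term, vec in term_vectors.items()
        if term not in exclude
    ]
    scored.sort()  # ascending distance, ties broken alphabetically by term
    return [term for _, term in scored[:k]]


def build_lexicon(seeds, term_vectors, k):
    """Expand labelled seeds into a lexicon (assumed rule: majority vote)."""
    votes = {}
    for seed, polarity in seeds.items():
        if seed not in term_vectors:
            continue  # a seed that never occurs in the corpus has no vector
        neighbours = knn_terms(term_vectors[seed], term_vectors, k, exclude=seeds)
        for term in neighbours:
            votes.setdefault(term, Counter())[polarity] += 1

    lexicon = dict(seeds)
    for term, counter in votes.items():
        ranked = counter.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue  # tied between polarities: leave the term unlabelled
        lexicon[term] = ranked[0][0]
    return lexicon


# Placeholder vectors: four components per term, values invented for the demo.
TOY_TDV = {
    "good": [0.9, 0.8, 0.1, 0.2],
    "excellent": [0.8, 0.9, 0.2, 0.1],
    "bad": [0.1, 0.2, 0.9, 0.8],
    "awful": [0.2, 0.1, 0.8, 0.9],
    "table": [0.5, 0.5, 0.5, 0.5],
}
SEEDS = {"good": "positive", "bad": "negative"}

lexicon = build_lexicon(SEEDS, TOY_TDV, k=1)
```

With `k=1` on this toy data, "excellent" is labelled positive and "awful" negative, while "table" stays out of the lexicon because it is not the nearest neighbour of either seed. A real corpus would probably call for an indexed nearest-neighbour search rather than this brute-force scan.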
