Conference: International Conference of Artificial Intelligence and Information Technology

Word Embedding Comparison for Indonesian Language Sentiment Analysis



Abstract

The development of information technology has dramatically increased the production of data. Large amounts of data are available on the internet, including reviews of products and services. The more data is collected, the greater the need for a system to process it. Sentiment analysis is a text processing task in Natural Language Processing (NLP) that can help someone assess the quality of a service on offer, including hotel services. This paper carries out sentiment analysis on hotel review data obtained from the Traveloka website. The data are classified using the Long Short-Term Memory (LSTM) algorithm. To get better results, the authors use word embedding to convert words into vectors. This study aims to compare the performance of several word embedding methods: word2vec Continuous Bag of Words (CBOW), word2vec skip-gram, doc2vec, and GloVe. The experimental results show that GloVe has the highest accuracy, 95.52%, while the word2vec skip-gram model has the lowest, 91.81%. The authors therefore conclude that GloVe is the best word embedding method for the hotel review data.
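
The abstract gives no implementation details. As a rough illustration of how the two word2vec variants it names differ, the sketch below builds the training examples each one would see for a single tokenized review. CBOW predicts the center word from its surrounding context, while skip-gram predicts each context word from the center word. The function names, the window size, and the sample review are assumptions made for this illustration and are not taken from the paper.

```python
from typing import List, Tuple


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Return up to `window` tokens on each side of position i."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW: one example per position, (context words -> center word)."""
    examples = []
    for i, center in enumerate(tokens):
        context = context_window(tokens, i, window)
        if context:  # skip positions with no context (one-token input)
            examples.append((context, center))
    return examples


def skipgram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram: one example per (center word -> single context word) pair."""
    return [
        (center, ctx)
        for i, center in enumerate(tokens)
        for ctx in context_window(tokens, i, window)
    ]


# Hypothetical tokenized review ("room clean and comfortable"), not from the dataset.
review = ["kamar", "bersih", "dan", "nyaman"]

for context, center in cbow_examples(review, window=1):
    print("CBOW      ", context, "->", center)

for center, ctx in skipgram_examples(review, window=1):
    print("skip-gram ", center, "->", ctx)
```

For the same text, skip-gram yields more, finer-grained training pairs than CBOW, which is one reason the two variants can give different downstream accuracy.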
