Sundanese Twitter Dataset for Emotion Classification


Abstract

The Sundanese are the second-largest ethnic group in Indonesia, and their language has many dialects. This has drawn the attention of many researchers to emotion analysis, especially on social media. However, because Sundanese datasets are barely available, understanding emotion in Sundanese text is a challenging task. In this research, we propose a dataset for emotion classification of Sundanese text. The preprocessing includes case folding, stopword removal, stemming, tokenizing, and text representation. For feature generation prior to classification, we use term frequency-inverse document frequency (TF-IDF). We evaluated our dataset using k-fold cross-validation. Our experiments with the proposed method show effective results for machine learning classification. Furthermore, as far as we know, this is the first publicly available Sundanese emotion dataset.
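The pipeline named in the abstract (case folding, tokenizing, stopword removal, TF-IDF features, k-fold evaluation) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the stopword set and sample sentences are English placeholders, stemming is omitted, the TF-IDF variant (length-normalized term frequency times log(N/df)) is one common choice because the abstract does not say which weighting was used, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Placeholder stopword list (assumption); a real run would use a Sundanese list.
STOPWORDS = {"the", "is", "a"}


def preprocess(text):
    """Case-fold, tokenize on letters, and drop stopwords (no stemming here)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / df),
    which is one common variant among several.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


# Placeholder sentences (assumption), standing in for labeled tweets.
texts = ["The match is a joy", "The delay is a pain", "A joy a joy"]
docs = [preprocess(t) for t in texts]
vectors = tfidf(docs)
folds = list(k_fold_indices(len(docs), 3))
```

Each fold's training indices would feed a classifier and its test indices would score it; averaging the scores over the k folds gives the cross-validated estimate. In practice the data is usually shuffled before splitting, which this sketch leaves out to stay deterministic.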


Bibliographic details

Source
International Conference on Computer Engineering, Network, and Intelligent Multimedia | 2020 | pp. 391-395 | 5 pages
Authors
Oddy Virgantara Putra; Fathin Muhammad Wasmanson; Triana Harmini; Shoffin Nahwa Utama
Original format: PDF
Keywords
Blogs; Social networking (online); Support vector machines; Feature extraction; Radio frequency; Task analysis; Informatics;


Similar literature


1. Modified TF-Assoc Term Weighting Method for Text Classification on News Dataset from Twitter [J]. Imroatul Khuluqi Izzah, Abba Suganda Girsang. IAENG International Journal of Computer Science, 2021, No. 1Pta2.
2. Application of Support Vector Machine for Arabic Sentiment Classification Using Twitter-Based Dataset [J]. Journal of Information & Knowledge Management, 2020, No. 1.
3. Combining Linguistic, Semantic and Lexicon Feature for Emoji Classification in Twitter Dataset [J]. Rinda Wahyuni, Indra Budi. Procedia Computer Science, 2018, No. 1.
4. Emotion Classification on Indonesian Twitter Dataset [C]. Mei Silviana Saputri, Rahmad Mahendra, Mirna Adriani. International Conference on Asian Language Processing, 2018.
5. Investigation and Classification of Planetary Materials and Surfaces using Novel Methods to Analyze Large Compositional Datasets: Quantitative X-ray Compositional Mapping and Lunar Reconnaissance Orbiter Narrow Angle Camera Photometric Analysis [D]. Hahn, Timothy M., Jr. 2019.
6. Sentiment Contents and Retweets: A Study of Two Vaccine-Related Twitter Datasets [O]. Elizabeth B Blankenship, Mary Elizabeth Goff, Jinging Yin. 2018.
7. SWAT-CMW: Classification of Twitter Emotional Polarity using a Multiple-Classifier Decision Schema and Enhanced Emotion Tagging [O]. Riley Collins, Daniel May, Noah Weinthal. 2015.
