Anuj@IEEE BigData 2019: A Novel Code-Switching Behavior Analysis in Social Media Discussions Natural Language Processing

Abstract

With the internet and social media lowering barriers to communication, more and more people across the globe have started to use social media platforms such as Facebook, Twitter, and Instagram. Most people communicate multilingually: they discuss topics in common forums using multiple languages, spoken either by an individual speaker or by a group of speakers. This makes the context more complex to understand and makes Natural Language Processing (NLP) tasks even harder. The behavior of mixing multiple languages within a single discussion topic, often involving multiple communities, is referred to as code-switching. At the IEEE Big Data 2019 conference, a Shared Task (Understanding Multilingual Communities through Analysis of Code-Switching Behaviors in Social Media Discussions) was conducted as a track of the Big Data Cup. The first task is to detect the language of each post in a multilingual discussion forum. The second is to compute a relevance score for each post by determining how closely its content is connected to, or appropriate for, the discussion. This paper proposes a novel approach to detecting the language of each word in a post using NLP techniques involving linguistics, the Python package langdetect, and various other approaches. It also explains how machine learning is applied to determine the relevance of a post and the other metrics required for prediction. Code-mixing detection is an important first step for any NLP application on social media, since the language of a post must be determined before any other NLP task can be performed on it.
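To make the idea of word-level language detection concrete, the following is a minimal sketch and not the paper's implementation. It tags each word of a post with a language, takes the majority language as the post's language, and flags the post as code-switched when more than one language appears. The lexicon, the function names (`lexicon_detect`, `tag_words`, `post_language`, `is_code_switched`), and the `"unk"` label are all hypothetical, chosen only for illustration; the abstract names langdetect but does not describe how it is combined with the other techniques.

```python
import string
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical toy lexicon for illustration only; not taken from the paper.
TOY_LEXICON = {
    "the": "en", "movie": "en", "was": "en", "good": "en",
    "yeh": "hi", "bahut": "hi", "accha": "hi", "hai": "hi",
}


def lexicon_detect(word: str) -> str:
    """Stand-in word-level detector: dictionary lookup, 'unk' if unseen."""
    return TOY_LEXICON.get(word.lower(), "unk")


def tag_words(post: str,
              detect: Callable[[str], str] = lexicon_detect
              ) -> List[Tuple[str, str]]:
    """Split a post on whitespace, strip punctuation, tag each word."""
    words = [w.strip(string.punctuation) for w in post.split()]
    return [(w, detect(w)) for w in words if w]


def post_language(tags: List[Tuple[str, str]]) -> str:
    """Majority language over the tagged words, ignoring unknown words."""
    counts = Counter(lang for _, lang in tags if lang != "unk")
    return counts.most_common(1)[0][0] if counts else "unk"


def is_code_switched(tags: List[Tuple[str, str]]) -> bool:
    """A post counts as code-switched if it has more than one known language."""
    return len({lang for _, lang in tags if lang != "unk"}) > 1


if __name__ == "__main__":
    tags = tag_words("Yeh movie bahut accha hai!")
    print(tags)
    print(post_language(tags), is_code_switched(tags))
```

The detector is passed in as a parameter so that a statistical detector can replace the lexicon. One option would be the `detect` function from langdetect, wrapped to return `"unk"` when it raises an exception, although that library is designed for longer text and its results on single words may be unreliable.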

Bibliographic details

Source: IEEE International Conference on Big Data, 2019, pp. 5957-5961 (5 pages)
Author: Anuj Saini
Original format: PDF
Keywords: Social network services; Natural language processing; Task analysis; Python; Libraries; Big Data; Machine learning
