Part of Speech Tagging for French Social Media Data


Abstract

In the context of Social Media Analytics, Natural Language Processing tools face new challenges on online conversational text, such as microblogs, chat, or text messages, because of the specificity of the language used in these channels. This work addresses the problem of Part-Of-Speech tagging (initially for French but also for English) on noisy language from popular social media services such as Twitter, Facebook and forums. We employ a linear-chain conditional random field (CRF) model, enriched with several morphological, orthographic, lexical and large-scale word clustering features. Our experiments used different feature configurations to train the model. With these features, we achieved higher tagging performance than the baseline results on the French Social Media Bank. Moreover, experiments on English social media content show that our model improves over previous work on these data.
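
The abstract names the feature families (morphological, orthographic, lexical, word clustering) but not the individual features. As a rough illustration only, the sketch below builds one feature dictionary per token, the kind of input a linear-chain CRF tagger is typically trained on. Every feature name, the `CLUSTERS` table and its bit-string paths, and the functions `token_features` and `sentence_features` are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical word-cluster table: token -> bit-string path in a hierarchical
# word clustering. These values are invented for illustration.
CLUSTERS = {"lol": "11010", "mdr": "11011", "je": "0010", "suis": "0111"}

URL_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def token_features(tokens, i, clusters=CLUSTERS):
    """Return an illustrative feature dict for the token at position i."""
    word = tokens[i]
    lower = word.lower()
    feats = {
        # Lexical feature: the normalized word form itself.
        "lower": lower,
        # Morphological features: short prefix and suffix.
        "prefix2": lower[:2],
        "suffix3": lower[-3:],
        # Orthographic features, including social-media-specific shapes.
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_digit": any(c.isdigit() for c in word),
        "is_hashtag": word.startswith("#") and len(word) > 1,
        "is_mention": word.startswith("@") and len(word) > 1,
        "is_url": bool(URL_RE.match(word)),
        # Three or more identical characters in a row, as in "troooop".
        "has_repeated_char": bool(re.search(r"(.)\1\1", lower)),
    }
    # Word-cluster features: prefixes of the cluster path at two granularities.
    path = clusters.get(lower)
    if path is not None:
        for n in (2, 4):
            feats[f"cluster_prefix{n}"] = path[:n]
    # Neighboring words give the model local context.
    feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next_lower"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def sentence_features(tokens, clusters=CLUSTERS):
    """Return one feature dict per token of a tokenized message."""
    return [token_features(tokens, i, clusters) for i in range(len(tokens))]
```

Calling `sentence_features(["@marie", "je", "suis", "troooop", "content"])` yields one dictionary per token. Switching groups of keys on or off is one simple way to mimic the different feature configurations the abstract mentions.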

Bibliographic details

Source
International Conference on Computational Linguistics, 2014, pp. 1764-1772 (9 pages)
Authors
Farhad Nooralahzadeh; Caroline Brun; Claude Roux
Original format: PDF

