Conference: International conference on computational linguistics

Part of Speech Tagging for French Social Media Data


Abstract

In the context of Social Media Analytics, Natural Language Processing tools face new challenges on online conversational text, such as microblogs, chat, or text messages, because of the specific language used in these channels. This work addresses Part-Of-Speech tagging (initially for French, but also for English) of noisy language from popular social media services such as Twitter, Facebook and forums. We employ a linear-chain conditional random field (CRF) model, enriched with morphological, orthographic, lexical and large-scale word clustering features. We train the model under several feature configurations. With these features, the model achieves higher tagging performance than the baseline results on the French Social Media Bank. Moreover, experiments on English social media content show that our model improves over previous work on these data.
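The four feature families named in the abstract can be illustrated with a minimal sketch of per-token feature extraction for a linear-chain CRF. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names, the `TOY_LEXICON` and `TOY_CLUSTERS` tables, the tag labels, and the choice of prefix and suffix lengths are all hypothetical, and the authors' actual feature templates may differ.

```python
import re

# Hypothetical toy resources, invented for illustration only.
# A real system would load a full lexicon and word clusters induced
# from a large unlabeled corpus.
TOY_LEXICON = {"je": {"CL"}, "suis": {"V"}, "content": {"ADJ"}}
TOY_CLUSTERS = {"je": "0010", "suis": "0111", "content": "1101"}

URL_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)
REPEAT_RE = re.compile(r"(.)\1\1")  # same character three times in a row


def token_features(tokens, i):
    """Build a feature dict for tokens[i] (assumed feature set)."""
    word = tokens[i]
    lower = word.lower()
    feats = {
        "bias": 1.0,
        "lower": lower,
        # Orthographic features, useful on noisy social media text.
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_digit": any(c.isdigit() for c in word),
        "is_hashtag": word.startswith("#") and len(word) > 1,
        "is_mention": word.startswith("@") and len(word) > 1,
        "is_url": bool(URL_RE.match(word)),
        "has_repeated_char": bool(REPEAT_RE.search(lower)),
    }
    # Morphological features: short prefixes and suffixes.
    for n in (1, 2, 3):
        if len(lower) > n:
            feats[f"prefix{n}"] = lower[:n]
            feats[f"suffix{n}"] = lower[-n:]
    # Lexical features: tags the (toy) lexicon allows for this word.
    for tag in sorted(TOY_LEXICON.get(lower, ())):
        feats[f"lex_{tag}"] = True
    # Word-cluster features: prefixes of a hierarchical cluster path.
    path = TOY_CLUSTERS.get(lower)
    if path:
        for n in (2, 4):
            feats[f"cluster_{n}"] = path[:n]
    # Neighbouring words as context.
    feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next_lower"] = (
        tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    )
    return feats


def sentence_features(tokens):
    """One feature dict per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Calling `sentence_features(["Je", "suis", "troooop", "content", "@marie", "#lol"])` yields one dictionary per token. A list of such dictionaries per sentence is a shape that many CRF toolkits accept as training input, though the toolkit used in the paper is not stated in the abstract.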
