Conference: IEEE International Conference on Semantic Computing

Classification of Private Tweets Using Tweet Content


Abstract

Online social networks (OSNs) like Twitter provide an open platform for users to easily convey thoughts and ideas ranging from personal experiences to breaking news. With the increasing popularity of Twitter and the explosion of tweets, we have observed large amounts of potentially sensitive or private messages being published to OSNs, inadvertently or voluntarily. The owners of these messages may become vulnerable to online stalkers or adversaries, and they often regret posting such messages. Identifying tweets that reveal private or sensitive information is therefore critical for both users and service providers. However, the definition of sensitive information is subjective and varies from person to person. To develop a privacy protection mechanism that can be customized to fit the needs of diverse audiences, it is essential to classify potentially sensitive tweets accurately and automatically. In this paper, we make the first attempt to classify private tweets into 14 categories, such as alcohol & drugs, family information, etc. We model tweet semantics with term distribution features as well as users' topic preferences based on personal tweet history. Experiments show that our method can boost classification accuracy compared with the well-known Bag-of-Words and tf-idf methods.
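
For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of Bag-of-Words counts and one common tf-idf weighting. It is not the authors' method or feature set. The whitespace tokenization, the toy tweets, the function names `bag_of_words` and `tf_idf`, and the particular tf-idf variant (relative term frequency times log(N / df)) are all assumptions made for illustration.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Raw term counts for one tokenized tweet."""
    return Counter(tokens)


def tf_idf(corpus):
    """Return one {term: weight} dict per tokenized document.

    tf  = count of the term in the document / number of tokens in the document
    idf = log(N / df), where df is the number of documents containing the term
    (one common variant; assumed here, not taken from the paper)
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for tokens in corpus:
        doc_freq.update(set(tokens))  # count each term once per document

    vectors = []
    for tokens in corpus:
        counts = bag_of_words(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical toy tweets, split on whitespace for simplicity.
tweets = [
    "had too many beers last night".split(),
    "my mom is visiting this weekend".split(),
    "beers with my brother this weekend".split(),
]
vectors = tf_idf(tweets)
```

In this sketch, a term that appears in every document receives weight 0, while a term unique to one tweet (such as "had") is weighted more heavily than a term shared across tweets (such as "beers"). Either representation could then be fed to any standard classifier.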
