Privacy-Encoding Models for Preserving Utility of Machine Learning Algorithms in Social Media

Abstract

Social media has become a vital platform in daily life, where users can interact with friends and with other people throughout the world. The vast data generated by these platforms is unique in its variety and sensitivity, and although it potentially has significant utility, it also has the potential for misuse. Although social media providers apply some existing privacy techniques, such as encryption and anonymization, these techniques cannot achieve a solid level of data privacy while maintaining the highest level of data utility. This paper proposes new Privacy-Encoding (PE) models that contain two levels of data privacy: 1) data perturbation-based encoding techniques, and 2) data normalization-based scaling techniques. The data perturbation-based encoding techniques comprise label encoding and one-hot encoding, while the data normalization-based scaling techniques comprise min-max and z-score normalization. The aim of the two levels is to transform original data into perturbed data while preserving a high level of data utility for machine learning algorithms. To evaluate data utility, the proposed models are applied to the Adult dataset as well as a simulated social media dataset, and the accuracy of the results is compared across several machine learning algorithms. The experimental results reveal that the models could achieve high privacy and utility levels in terms of variance, accuracy and F-measure metrics.
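
The two levels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`label_encode`, `one_hot_encode`, `min_max_scale`, `z_score_scale`, `privacy_encode`) and the toy `occupations` column are hypothetical, and feeding label codes into a scaler is only an assumption about how the two levels combine, since the abstract does not give that detail.

```python
from statistics import mean, pstdev


def label_encode(values):
    """Level 1 (encoding): map each distinct category to an integer code."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def one_hot_encode(values):
    """Level 1 (encoding): one 0/1 indicator per category, in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def min_max_scale(xs):
    """Level 2 (scaling): rescale numbers linearly into [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


def z_score_scale(xs):
    """Level 2 (scaling): subtract the mean, divide by the population std."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:  # constant column: avoid division by zero
        return [0.0 for _ in xs]
    return [(x - mu) / sigma for x in xs]


def privacy_encode(column, scaler=min_max_scale):
    """Assumed pipeline: label-encode a categorical column, then scale it."""
    return scaler(label_encode(column))


# Hypothetical categorical attribute, used only for illustration.
occupations = ["teacher", "nurse", "engineer", "nurse"]

codes = label_encode(occupations)                        # [2, 1, 0, 1]
indicators = one_hot_encode(occupations)
perturbed = privacy_encode(occupations)                  # [1.0, 0.5, 0.0, 0.5]
standardized = privacy_encode(occupations, z_score_scale)
```

The published values in `perturbed` no longer show the original category strings, yet equal inputs still map to equal outputs, which is what lets a classifier trained on the transformed column retain its utility.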

Bibliographic details

Source: IEEE International Conference on Trust, Security and Privacy in Computing and Communications, 2020, pp. 856-863 (8 pages)
Authors: Sara Salim; Nour Moustafa; Benjamin Turnbull
Original format: PDF
Keywords: Data privacy; Privacy; Machine learning algorithms; Social networking (online); Data models; Encoding; Numerical models

Similar literature

1. Monitoring housing rental prices based on social media: An integrated approach of machine-learning algorithms and hedonic modeling to inform equitable housing policies [J]. Hu Lirong, He Shenjing, Han Zixuan. Land Use Policy, 2019.
2. Sentiment Analysis Using Machine Learning Algorithms and Text Mining to Detect Symptoms of Mental Difficulties Over Social Media [J]. Hadj Ahmed Bouarara. International Journal of Information Systems and Social Change, 2021, no. 2.
3. Machine learning algorithms for social media analysis: A survey [J]. Balaji T.K., Chandra Sekhara Rao Annavarapu, Annushree Bablani. Computer Science Review, 2021, May issue.
4. A Personalized Sensitive Label-Preserving Model and Algorithm Based on Utility in Social Network Data Publishing [C]. Yuqin Xie, Mingchun Zheng, Lin Liu. International Conference on Human Centered Computing, 2016.
5. Machine Learning Algorithms for the Analysis of Social Media and Detection of Malicious User Generated Content [D]. Heredia, Brian. 2018.
6. How to Improve Compliance with Protective Health Measures during the COVID-19 Outbreak: Testing a Moderated Mediation Model and Machine Learning Algorithms [O]. Paolo Roma, Merylin Monaro, Laura Muzi, 2020.
7. Models and algorithms of privacy-preserving machine learning [O]. Sergey V. Zapechnikov. 2020.
