IEEE International Conference on Data Science and Advanced Analytics

Learning better while sending less: Communication-efficient online semi-supervised learning in client-server settings



Abstract

We consider a novel distributed learning problem: A server receives potentially unlimited data from clients in a sequential manner, but only a small initial fraction of these data are labeled. Because communication bandwidth is expensive, each client is limited to sending the server only a small (high-priority) fraction of the unlabeled data it generates, and the server is limited in the amount of prioritization hints it sends back to the client. The goal is for the server to learn a good model of all the client data from the labeled and unlabeled data it receives. This setting is frequently encountered in real-world applications and has the characteristics of online, semi-supervised, and active learning. However, previous approaches are not designed for the client-server setting and do not hold the promise of reducing communication costs. We present a novel framework for solving this learning problem in an effective and communication-efficient manner. On the server side, our solution combines two diverse learners working collaboratively, yet in distinct roles, on the partially labeled data stream. A compact, online graph-based semi-supervised learner is used to predict labels for the unlabeled data arriving from the clients. Samples from this model are used as ongoing training for a linear classifier. On the client side, our solution prioritizes data based on an active-learning metric that favors instances that are close to the classifier's decision hyperplane and yet far from each other. To reduce communication, the server sends the classifier's weight-vector to the client only periodically. Experimental results on real-world data sets show that this particular combination of techniques outperforms other approaches, and in particular, often outperforms (communication expensive) approaches that send all the data to the server.
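The client-side prioritization described above (favoring instances close to the classifier's decision hyperplane yet far from each other) can be sketched as a greedy selection rule. This is a minimal illustration, not the paper's exact algorithm: the function name `prioritize`, the trade-off weight `lam`, and the margin-minus-diversity score are assumptions for the sketch.

```python
import math

def dot(u, v):
    # Inner product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def dist(u, v):
    # Euclidean distance between two points.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def prioritize(w, X, k, lam=1.0):
    """Greedily choose k indices of X whose instances lie near the
    hyperplane w (small |w . x|) while staying far from instances
    already selected (diversity). Lower score = higher priority."""
    selected = []
    remaining = list(range(len(X)))
    while remaining and len(selected) < k:
        def score(i):
            margin = abs(dot(w, X[i]))  # closeness to the hyperplane
            if not selected:
                return margin
            # Penalize closeness to already-selected instances.
            diversity = min(dist(X[i], X[j]) for j in selected)
            return margin - lam * diversity
        best = min(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Example: two near-hyperplane points on opposite sides of the space
# are preferred over a redundant neighbor of the first pick.
w = [1.0, -1.0]
X = [[0.1, 0.1], [2.0, 2.0], [0.05, 0.0], [-3.0, -3.0]]
print(prioritize(w, X, 2))  # → [0, 3]
```

In the abstract's protocol, the client would run this selection between the server's periodic weight-vector updates, sending only the chosen fraction of its unlabeled stream upstream.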
