Published in: Pacific Rim International Conference on Artificial Intelligence

Toward Reciprocity-Aware Distributed Learning in Referral Networks

Abstract

Distributed learning in expert referral networks is an emerging challenge at the intersection of Active Learning and Multi-Agent Reinforcement Learning, where experts (humans or automated agents) have varying skills across different topics and can redirect difficult problem instances to connected colleagues with more appropriate expertise. The learning-to-refer challenge involves estimating colleagues' topic-conditioned skills in order to make appropriate referrals. Prior research has investigated different reinforcement learning algorithms, both with uninformative priors and with partially available (potentially noisy) priors. However, most human experts expect mutually rewarding referrals, with return referrals in their areas of expertise, so that both (or all) parties benefit from networking rather than sustaining a one-sided referral flow. This paper analyzes the extent of referral reciprocity imbalance present in high-performance referral-learning algorithms, specifically multi-armed bandit (MAB) methods belonging to two broad categories, frequentist and Bayesian, and demonstrates that both suffer considerably from reciprocity imbalance. The paper proposes modifications that enable distributed learning methods to better balance referral reciprocity and thus make referral networks win-win for all parties. Extensive empirical evaluations demonstrate substantial improvement in mitigating reciprocity imbalance while maintaining reasonably high overall solution performance.
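To make the learning-to-refer setting concrete, here is a minimal sketch of how one expert might estimate colleagues' topic-conditioned skills with a Bayesian bandit. The abstract does not name the specific algorithms studied, so Thompson sampling with a Beta(1, 1) prior is used here only as a representative Bayesian MAB method. The class `ThompsonReferrer` and the function `reciprocity_imbalance` are hypothetical names, and the imbalance formula is an illustrative assumption, not the paper's definition.

```python
import random
from collections import defaultdict


class ThompsonReferrer:
    """One expert's per-topic Bayesian bandit over colleagues (illustrative only)."""

    def __init__(self, colleagues, seed=0):
        self.colleagues = list(colleagues)
        self.rng = random.Random(seed)
        # Beta(1, 1) acts as an uninformative prior on each colleague's success rate.
        self.wins = defaultdict(lambda: 1.0)
        self.losses = defaultdict(lambda: 1.0)

    def choose(self, topic):
        """Sample a plausible skill for each colleague; refer to the highest sample."""
        def draw(colleague):
            key = (topic, colleague)
            return self.rng.betavariate(self.wins[key], self.losses[key])
        return max(self.colleagues, key=draw)

    def update(self, topic, colleague, solved):
        """Record whether the referred colleague solved the instance."""
        key = (topic, colleague)
        if solved:
            self.wins[key] += 1.0
        else:
            self.losses[key] += 1.0

    def estimate(self, topic, colleague):
        """Posterior mean of the colleague's success rate on this topic."""
        key = (topic, colleague)
        return self.wins[key] / (self.wins[key] + self.losses[key])


def reciprocity_imbalance(sent, received):
    """Assumed measure: 0.0 for balanced referral flow, 1.0 for fully one-sided."""
    total = sent + received
    return abs(sent - received) / total if total else 0.0
```

In use, an expert would call `choose(topic)` for each instance it cannot solve, observe the outcome, and call `update`. Counting referrals sent to and received from each colleague then gives the inputs to `reciprocity_imbalance`. A reciprocity-aware variant could bias `choose` toward colleagues who have sent referrals back, but how the paper does this is not stated in the abstract.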
