Journal: ACM transactions on intelligent systems

DHPA: Dynamic Human Preference Analytics Framework: A Case Study on Taxi Drivers' Learning Curve Analysis


Abstract

Many real-world human behaviors can be modeled as sequential decision-making processes, such as a taxi driver's choices of working regions and times. Each driver has unique preferences over these sequential choices, adjusts them over time, and thereby improves working efficiency. Understanding the dynamics of such preferences helps accelerate the learning process of taxi drivers. Prior work on taxi operation management mostly focuses on finding optimal driving strategies or routes, with little in-depth analysis of what drivers learn along the way and how what they learn affects their performance. In this work, we make the first attempt to establish Dynamic Human Preference Analytics. We inversely learn taxi drivers' preferences from data and characterize how those preferences change over time. We extract two types of features (i.e., profile features and habit features) to model the decision space of drivers. Then, through inverse reinforcement learning, we learn the drivers' preferences with respect to these features. The results show that self-improving drivers tend to keep adjusting their preferences for habit features to increase their earning efficiency while keeping their preferences for profile features invariant. Experienced drivers, by contrast, have stable preferences over time, and exploring drivers tend to adjust their preferences randomly over time.
