A control system based on deep reinforcement learning determines optimal actions for the major components of a commercial building, minimizing operating costs while maximizing the comprehensive comfort levels of occupants. An unsupervised deep Q-network method is introduced to handle the energy management problem by evaluating how operating costs influence comfort levels, taking environmental factors into account at each time slot. The resulting control decision targets both immediate and long-term goals, with exploration and exploitation considered simultaneously.
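The three ingredients named in the abstract (a cost/comfort trade-off, a balance of immediate and long-term goals, and simultaneous exploration and exploitation) can be illustrated with a minimal Python sketch. This is not the claimed implementation. The reward shape, the temperature-based discomfort measure, the setpoint, the weights, and all function names are assumptions made for illustration. In a deep Q-network the Q-values would come from a neural network; here they are plain lists.

```python
import random


def reward(energy_kwh, price_per_kwh, indoor_temp_c,
           setpoint_c=22.0, comfort_weight=0.5):
    """Hypothetical reward: penalize operating cost and occupant discomfort.

    Discomfort is assumed to be the absolute deviation from a temperature
    setpoint; the source does not specify how comfort is measured.
    """
    cost = energy_kwh * price_per_kwh
    discomfort = abs(indoor_temp_c - setpoint_c)
    return -(cost + comfort_weight * discomfort)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(r, next_q_values, gamma=0.95, done=False):
    """Q-learning target: immediate reward plus discounted future value."""
    if done:
        return r
    return r + gamma * max(next_q_values)
```

The discount factor `gamma` sets how strongly long-term value counts against the immediate reward, and `epsilon` sets how often the controller tries a non-greedy action in a given time slot.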