A computer-implemented method for self-learning of a control system is disclosed. The method includes creating an initial knowledge base, learning first principles using the knowledge base, creating initial control commands derived from the knowledge base, and generating constraints for the control commands. The method performs constrained reinforcement learning by executing the control commands subject to the constraints and observing feedback to improve the control commands. The method then enriches the knowledge base based on the feedback.
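
The loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the toy one-dimensional plant, the dictionary used as a knowledge base, the names `plant`, `make_knowledge_base`, `apply_constraints`, `run_episode`, and `self_learn`, and the numeric values. A simple random search over a controller gain stands in for the reinforcement learning step, the constraint is a plain bound on command magnitude, and the first-principles learning step is not modeled.

```python
import random


def plant(state, command):
    # Hypothetical toy plant: the state moves by half the applied command.
    return state + 0.5 * command


def make_knowledge_base():
    # Initial knowledge base: a prior controller gain, a command bound,
    # and an empty log that will collect feedback from each trial.
    return {"gain": 0.2, "command_limit": 1.0, "feedback": []}


def apply_constraints(kb, command):
    # Constraint generated from the knowledge base: bound the command magnitude.
    limit = kb["command_limit"]
    return max(-limit, min(limit, command))


def run_episode(kb, gain, setpoint=1.0, steps=20):
    # Execute constrained commands and return the accumulated tracking error.
    state, cost = 0.0, 0.0
    for _ in range(steps):
        command = apply_constraints(kb, gain * (setpoint - state))
        state = plant(state, command)
        cost += abs(setpoint - state)
    return cost


def self_learn(episodes=50, seed=0):
    rng = random.Random(seed)
    kb = make_knowledge_base()
    initial_cost = run_episode(kb, kb["gain"])
    best_cost = initial_cost
    for _ in range(episodes):
        # Explore a nearby gain, observe feedback, keep it only if it improves.
        candidate = kb["gain"] + rng.uniform(-0.2, 0.2)
        cost = run_episode(kb, candidate)
        kb["feedback"].append((candidate, cost))  # enrich the knowledge base
        if cost < best_cost:
            kb["gain"], best_cost = candidate, cost
    return kb, initial_cost, best_cost


if __name__ == "__main__":
    kb, initial_cost, best_cost = self_learn()
    print(f"cost before learning: {initial_cost:.3f}, after: {best_cost:.3f}")
    print(f"learned gain: {kb['gain']:.3f}, feedback entries: {len(kb['feedback'])}")
```

In this sketch the constraint is enforced at every step, so exploration never issues a command outside the bound, and the feedback log grows with each trial whether or not the candidate gain is accepted.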