We learn a controller for a flat-footed bipedal robot that responds optimally to both (1) external disturbances caused by, for example, stepping on objects or being pushed, and (2) rapid acceleration, such as a reversal of the demanded walking direction. The reinforcement learning method learns an optimal policy that actuates the ankle joints to exert pressure at different points along the support foot and determines the next swing-foot placement. The controller is learnt in simulation using an inverted pendulum model, and the control policy is then transferred to and tested on two small physical humanoid robots.
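As a rough illustration of the kind of simulation model mentioned above, the following sketch steps a linear inverted pendulum in which the ankle chooses where along the support foot the pressure is applied. The abstract does not give the model equations, so the linearised form, the function name `lipm_step`, and the values of the centre-of-mass height `h` and time step `dt` are all assumptions made for illustration, not the authors' implementation.

```python
G = 9.81  # gravitational acceleration, m/s^2


def lipm_step(x, v, p, h=0.3, dt=0.01):
    """Advance an assumed linear inverted pendulum by one time step.

    x:  horizontal centre-of-mass position relative to the support foot (m)
    v:  horizontal centre-of-mass velocity (m/s)
    p:  point along the foot where pressure is applied via the ankle (m)
    h:  assumed constant centre-of-mass height (m), hypothetical value
    dt: integration step (s), hypothetical value
    """
    # Linearised pendulum: the centre of mass accelerates away from
    # the pressure point, in proportion to its horizontal offset.
    a = (G / h) * (x - p)
    # Semi-implicit Euler: update velocity first, then position.
    v_next = v + a * dt
    x_next = x + v_next * dt
    return x_next, v_next


# Moving forward at 0.2 m/s: pressure ahead of the centre of mass brakes,
# pressure behind it accelerates.
_, v_brake = lipm_step(0.0, 0.2, p=0.05)
_, v_push = lipm_step(0.0, 0.2, p=-0.05)
```

In a reinforcement learning setting, a function like this could serve as the environment transition, with the pressure point `p` (and, at step boundaries, the next foot placement) as the action. That pairing is a plausible reading of the abstract rather than a detail it states.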