The present invention relates to the field of data processing and discloses an adaptive game algorithm based on deep reinforcement learning, comprising the following steps: (A) acquiring policies for different degrees of cooperation; (B) generating policies for different degrees of cooperation; (C) detecting the cooperation policy of an opponent; and (D) formulating a corresponding response policy. The technical effects of the present invention are as follows: the trained detector and the policies for different degrees of cooperation are used to implement existing concepts, such as Tit for Tat, in sequential social dilemmas, which improves the extensibility of the agent and makes it more intuitive to obtain competitive policies superior to those previously obtained.
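Steps (C) and (D) can be illustrated with a minimal sketch. It is not the disclosed implementation: the trained detector is replaced by a simple stand-in (the fraction of cooperative moves in a sliding window), and the names `POLICY_DEGREES`, `detect_cooperation_degree`, `select_response_degree`, and `run_episode` are hypothetical. The sketch only shows how a Tit for Tat style response might pick, from a bank of policies indexed by cooperation degree, the one that matches the opponent's detected degree.

```python
from collections import deque

# Hypothetical policy bank: each entry stands for a policy trained to a
# given degree of cooperation (0.0 = fully competitive, 1.0 = fully cooperative).
POLICY_DEGREES = [0.0, 0.25, 0.5, 0.75, 1.0]


def detect_cooperation_degree(opponent_actions):
    """Stand-in for the trained detector (assumption, not the patented model).

    opponent_actions holds 1 for a cooperative move and 0 for a defecting one.
    """
    if not opponent_actions:
        return 1.0  # start by cooperating, as Tit for Tat does
    return sum(opponent_actions) / len(opponent_actions)


def select_response_degree(detected, degrees=POLICY_DEGREES):
    """Pick the available policy whose degree is closest to the detected one."""
    return min(degrees, key=lambda d: abs(d - detected))


def run_episode(opponent_moves, window=4):
    """Return the cooperation degree chosen before each opponent move."""
    history = deque(maxlen=window)  # only the most recent moves are kept
    chosen = []
    for move in opponent_moves:
        degree = detect_cooperation_degree(history)
        chosen.append(select_response_degree(degree))
        history.append(move)
    return chosen
```

For example, `run_episode([1, 1, 0, 0, 0, 0])` lowers the selected cooperation degree step by step as the opponent keeps defecting. In a sequential social dilemma the returned degree would select a learned policy rather than a single action.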