Disclosed herein are methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing counterfactual regret minimization (CFR) for strategy searching in strategic interaction between two or more parties. One of the methods includes: storing multiple regret samples in a first data store, wherein the multiple regret samples are obtained in two or more iterations of a CFR algorithm in strategy searching in strategic interaction between two or more parties; storing multiple strategy samples in a second data store; updating parameters of a first neural network for predicting a regret value of a possible action at a state of a party based on the multiple regret samples in the first data store; and updating parameters of a second neural network for predicting a strategy value of a possible action at a state of the party based on the multiple strategy samples in the second data store.
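
The four steps recited above (two data stores, two separately trained predictors) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the names `LinearModel`, `regret_store`, and `strategy_store`, the two-feature state encoding, and the sample values are hypothetical, and a single linear layer trained by stochastic gradient descent stands in for each neural network.

```python
import random


class LinearModel:
    """Hypothetical stand-in for a neural network: one linear layer, squared-error SGD."""

    def __init__(self, n_features, n_actions, seed=0):
        rng = random.Random(seed)
        # One weight row per action, one weight per state feature.
        self.w = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
                  for _ in range(n_actions)]

    def predict(self, state):
        """Return one predicted value per possible action at the given state."""
        return [sum(wi * xi for wi, xi in zip(row, state)) for row in self.w]

    def update(self, samples, lr=0.1, epochs=200):
        """Fit the weights to (state, per-action target) samples."""
        for _ in range(epochs):
            for state, target in samples:
                pred = self.predict(state)
                for a, row in enumerate(self.w):
                    err = pred[a] - target[a]
                    for i, xi in enumerate(state):
                        row[i] -= lr * err * xi


# First and second data stores (plain lists here; the sample values are made up).
regret_store = []    # (state features, regret value per action)
strategy_store = []  # (state features, strategy value per action)

# Pretend these samples were collected over two iterations of a CFR algorithm.
regret_store.append(((1.0, 0.0), (0.5, -0.5)))
regret_store.append(((0.0, 1.0), (-0.2, 0.2)))
strategy_store.append(((1.0, 0.0), (0.8, 0.2)))
strategy_store.append(((0.0, 1.0), (0.3, 0.7)))

# First network predicts regret values; second network predicts strategy values.
regret_net = LinearModel(n_features=2, n_actions=2)
strategy_net = LinearModel(n_features=2, n_actions=2)
regret_net.update(regret_store)
strategy_net.update(strategy_store)

print(regret_net.predict((1.0, 0.0)))
print(strategy_net.predict((0.0, 1.0)))
```

Keeping the two stores and the two models separate mirrors the structure of the recited method: the regret predictor is fit only to regret samples and the strategy predictor only to strategy samples.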