Provided are a training data generation device and the like that generate training data for training an acoustic model that simulates the robustness of human speech perception. The training data generation device includes a signal conversion unit that, where Q is an integer of 2 or greater, converts a first audio signal for training into (Q – 1) second audio signals for training that differ from one another in perception strength. Among the (Q – 1) second audio signals for training, at least the one with the lowest perception strength is capable of producing an auditory illusion.
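The conversion from one first audio signal to (Q – 1) second audio signals can be illustrated with a minimal sketch. The abstract does not state which conversion the signal conversion unit applies, so everything below is an assumption made for illustration: the function names `locally_reverse` and `generate_second_signals`, the parameter `base_segment_len`, and the choice of local time reversal (reversing samples within fixed-length segments) as the degradation. Longer segments are assumed here to correspond to lower perception strength.

```python
from typing import List, Sequence


def locally_reverse(signal: Sequence[float], segment_len: int) -> List[float]:
    """Reverse the samples inside each consecutive block of `segment_len` samples.

    Hypothetical degradation, not taken from the abstract.
    """
    if segment_len < 1:
        raise ValueError("segment_len must be at least 1")
    out: List[float] = []
    for start in range(0, len(signal), segment_len):
        # The last block may be shorter than segment_len; it is reversed as-is.
        out.extend(reversed(signal[start:start + segment_len]))
    return out


def generate_second_signals(
    first_signal: Sequence[float], q: int, base_segment_len: int = 2
) -> List[List[float]]:
    """Return (q - 1) degraded copies of `first_signal`.

    Assumption: level k uses segments of k * base_segment_len samples, so a
    higher level means a stronger degradation (lower perception strength).
    """
    if q < 2:
        raise ValueError("q must be an integer of 2 or greater")
    return [
        locally_reverse(first_signal, base_segment_len * level)
        for level in range(1, q)
    ]


if __name__ == "__main__":
    first = [0, 1, 2, 3, 4, 5, 6, 7]
    for level, second in enumerate(generate_second_signals(first, q=3), start=1):
        print(level, second)
```

With Q = 3 the sketch yields two second signals, one per degradation level. A real system would operate on sampled waveforms and would pick segment lengths in milliseconds rather than in samples.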