The application of time delay neural networks (TDNNs) to the text-dependent speaker verification problem is described and evaluated. Each person to be verified has a personalized neural network, which is trained to extract a representative feature vector of the speaker from a particular utterance. A novel model, the recurrent time delay neural network, is investigated. Training is carried out by backpropagation for sequence (BPS), a variant of the BP algorithm. The modified structure is shown to outperform both a multilayer perceptron classifier and the original TDNN for feature extraction.