The use of uttered Personal Identification Numbers (PINs) is a well-suited approach to voice-based person identification in real applications. This paper addresses speaker verification with short 4-digit strings from a pragmatic perspective, in which very few training utterances are available. The main difficulty is the small amount of speech contained in short PIN utterances. Furthermore, a peculiarity of Spanish must be taken into account: digit strings are not uttered digit by digit, but are mentally grouped without constraints and read as whole figures, with the grouping varying across utterances of the same PIN. This factor induces a strong dependency on the phonetic content of the PIN and considerably complicates the design of text-dependent systems. In view of this, a text-independent GMM speaker verification system, including 'nearest reference speaker' and 'universal background model' score normalization together with CMN channel compensation, has been evaluated on a specific PIN database, on which different training conditions (phonetically dependent/independent) are tested.
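As an illustration of two of the components named in the abstract, the following is a minimal sketch of CMN (subtracting the per-utterance mean of each cepstral coefficient) followed by universal-background-model score normalization, taken here as the average log-likelihood ratio between the claimed speaker's GMM and the background GMM. It is not the system evaluated in the paper: the diagonal-covariance assumption, the function names, and the representation of a GMM as a `(weights, means, variances)` tuple are all assumptions made for the example, and 'nearest reference speaker' normalization is not shown.

```python
import math

def cmn(frames):
    """Cepstral mean normalization: subtract the utterance-level mean
    of each coefficient from every frame."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]

def gmm_frame_loglik(x, weights, means, variances):
    """Log-likelihood of one frame under a diagonal-covariance GMM
    (assumed model form), computed with log-sum-exp for stability."""
    comp = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xd, md, vd in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
        comp.append(ll)
    top = max(comp)
    return top + math.log(sum(math.exp(c - top) for c in comp))

def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of an utterance under a GMM."""
    return sum(gmm_frame_loglik(x, *gmm) for x in frames) / len(frames)

def ubm_normalized_score(frames, speaker_gmm, ubm):
    """Hypothetical scoring helper: apply CMN, then return the average
    log-likelihood ratio between the speaker model and the UBM."""
    normalized = cmn(frames)
    return avg_loglik(normalized, speaker_gmm) - avg_loglik(normalized, ubm)
```

A claimed identity would be accepted when `ubm_normalized_score` exceeds a decision threshold; the threshold itself is application dependent and is not specified here.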