This paper presents a new method for automatically selecting the optimal context from a database for an unseen phoneme sequence. If the required context is not available for a test phoneme, a novel formulation assigns a score to each phoneme in the training database according to its context. Normally, a decision tree is used to handle unseen phonemes in context [1,2]. However, this requires building a decision tree for each new language encountered, which may be problematic when developing multi-lingual speech processing systems. In addition, the tree structure may differ considerably from one language to another. The proposed formulation incorporates a phoneme similarity matrix derived using an acoustic distance measure. The method is applied to the selection of the best units in a concatenative speech synthesis system, and encouraging results are obtained.
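The abstract does not state the scoring formulation itself, so the following is only a minimal sketch of how a context score based on a phoneme similarity matrix might look. The additive form of the score, the function names `context_score` and `best_unit`, the toy phoneme inventory, and all similarity values are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical phoneme inventory and similarity matrix (illustrative values only).
# In the paper the matrix is derived from an acoustic distance measure.
phones = ["p", "b", "a", "i"]
sim = np.array([
    [1.0, 0.8, 0.1, 0.1],   # p
    [0.8, 1.0, 0.1, 0.1],   # b
    [0.1, 0.1, 1.0, 0.5],   # a
    [0.1, 0.1, 0.5, 1.0],   # i
])
index = {p: i for i, p in enumerate(phones)}


def context_score(target, candidate):
    """Score a database unit against a target unit.

    Both arguments are (left, centre, right) phoneme triples.
    Assumption: the score is the sum of left-context and right-context
    similarities, and the centre phonemes must match exactly.
    """
    (t_left, t_centre, t_right) = target
    (c_left, c_centre, c_right) = candidate
    if t_centre != c_centre:
        return float("-inf")
    return (sim[index[t_left], index[c_left]]
            + sim[index[t_right], index[c_right]])


def best_unit(target, database):
    """Return the database unit whose context scores highest for the target."""
    return max(database, key=lambda cand: context_score(target, cand))


# The exact context p-a-i is not in the database, so the closest one is chosen.
database = [("b", "a", "i"), ("a", "a", "p"), ("p", "i", "a")]
chosen = best_unit(("p", "a", "i"), database)
```

In this toy setup the unit with left context `b` is preferred because `b` is the most similar phoneme to the missing left context `p`, and no language-specific decision tree is consulted.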