Speech recognition technology is now used in an increasing range of environments and situations, and the accuracy of speaker-independent recognition has improved remarkably through stochastic modeling of speech. However, very little research has targeted people with disabilities such as speech impairments. In this paper, we study speech recognition for a person with an articulation disorder resulting from athetoid cerebral palsy. For such speakers, muscle tension at the start of a movement tends to make the first utterance less stable than usual, which degrades recognition when MFCC (Mel Frequency Cepstral Coefficients) are used as speech features. We therefore previously proposed a feature extraction method based on PCA (Principal Component Analysis) that is robust to this utterance variation and is used in place of MFCC. As a further improvement, this paper integrates the acoustic model with a Metamodel, which incorporates a model of each speaker's confusion matrix, that is, the speaker's per-phoneme substitution and insertion tendencies, into the ASR process in order to increase recognition accuracy. The effectiveness of the approach is confirmed by word recognition experiments.
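The idea of folding a speaker's substitution and insertion tendencies into recognition can be illustrated with a small sketch. This is not the authors' Metamodel implementation, which the abstract does not specify; it assumes a simplified channel in which each intended phoneme is either recognized as some (possibly different) phoneme or preceded by spurious inserted phonemes, with deletions not modeled. The function names, the toy lexicon, and all probabilities are hypothetical.

```python
import math

def word_log_likelihood(observed, intended, sub_prob, ins_prob, floor=1e-6):
    """Best-alignment log P(observed | intended) under substitutions and insertions.

    Hypothetical simplification: deletions are not modeled, so an intended
    sequence longer than the observed one scores -inf.
    """
    n, m = len(intended), len(observed)
    neg_inf = float("-inf")
    # best[i][j]: best log-probability of emitting observed[:j] from intended[:i]
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == neg_inf:
                continue
            if i < n and j < m:
                # Substitution (or correct recognition) of intended[i] as observed[j]
                p = sub_prob.get((intended[i], observed[j]), floor)
                best[i + 1][j + 1] = max(best[i + 1][j + 1], best[i][j] + math.log(p))
            if j < m:
                # Insertion: observed[j] is a spurious extra phoneme
                p = ins_prob.get(observed[j], floor)
                best[i][j + 1] = max(best[i][j + 1], best[i][j] + math.log(p))
    return best[n][m]

def rescore(observed, lexicon, sub_prob, ins_prob):
    """Return the lexicon word whose phonemes best explain the observed sequence."""
    scores = {
        word: word_log_likelihood(observed, phonemes, sub_prob, ins_prob)
        for word, phonemes in lexicon.items()
    }
    return max(scores, key=scores.get)

# Toy, made-up speaker statistics: this speaker tends to insert an extra "k".
sub_prob = {("k", "k"): 0.9, ("a", "a"): 0.9, ("s", "s"): 0.8}
ins_prob = {"k": 0.2}
lexicon = {"kasa": ["k", "a", "s", "a"], "asa": ["a", "s", "a"]}
observed = ["k", "k", "a", "s", "a"]  # phoneme string with a repeated onset

print(rescore(observed, lexicon, sub_prob, ins_prob))
```

In this toy case the repeated onset is explained more cheaply as one insertion before "kasa" than as two insertions before "asa", so the speaker-specific statistics steer the decision toward the word that was probably intended.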