Document overview: Speech Recognition, by Terry

Speech Recognition
- Speech recognition is a technology in which a machine recognizes and understands a voice signal and converts it into the corresponding text or commands.
- It draws on signal processing, pattern recognition, probability theory and information theory, the mechanisms of speech production and hearing, and artificial intelligence.
- A speech recognition system mainly consists of three modules: feature extraction, pattern matching, and model training.

History of Speech Recognition Development
- 1950s: Audrey system, Bell Labs
- 1959: ten-phoneme recognition system
- Late 1960s to early 1970s: LPC, DTW
- 1980s: VQ, HMM; Sphinx system, Carnegie Mellon University; ANN, HMM
- 1990s: IBM, Apple, AT&T and NTT
- Nowadays: a hot area in AI, with more processing methods

Categories of Methods
- Isolated word recognition
- Connected word recognition
- Continuous speech recognition
- Specific person (speaker-dependent) recognition
- Non-specific person (speaker-independent) recognition
- Small vocabulary
- Medium vocabulary
- Large vocabulary
- Infinite vocabulary

Main Methods
- Template Matching
  - DTW (Dynamic Time Warping)
  - VQ (Vector Quantization)
- HMM
  - DHMM (Discrete Hidden Markov Model)
  - CHMM (Continuous Hidden Markov Model)
  - SCHMM (Semi-Continuous Hidden Markov Model)
- ANN (Artificial Neural Network)

Signal Pre-processing
- Framing: 5 ms to 50 ms per frame
- Endpoint detection: detect the start point and end point of the speech
- Speech enhancement: suppress noise and improve speech quality
- ICA: Independent Component Analysis

Feature Extraction
- LPC: Linear Prediction Coefficient
- LPCC: Linear Prediction Cepstrum Coefficient
- MFCC: Mel Frequency Cepstrum Coefficient
- Cepstrum: the inverse Fourier transform of the log magnitude spectrum of the signal x(n),

  $$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n}$$

  $$c(m) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln\left|X(e^{j\omega})\right| e^{j\omega m}\, d\omega$$

Speech Recognition
- Template Matching
  - DTW (Dynamic Time Warping)
  - VQ(Ve