For real-time implementation of large-vocabulary speech recognition systems based on hidden Markov models (HMMs), a very large number of Viterbi scoring operations must be performed to select the vocabulary model that best matches each incoming speech signal. Although general-purpose digital signal processing (DSP) chips are widely used for feature extraction from the speech signal, their architectures are not well suited to Viterbi scoring. A specialized architecture is therefore needed to perform Viterbi scoring efficiently for real-time speech recognition, especially when the vocabulary is large.
In this thesis, we implement a dedicated Viterbi scoring board for real-time large-vocabulary speech recognition using HMMs. The board is built from field-programmable gate array (FPGA) chips and performs Viterbi scoring in the logarithmic domain for efficient computation. To evaluate the performance of the board, we construct a prototype speech recognition system consisting of a host computer (PC/AT), a DSP board, and the dedicated Viterbi scoring board. The DSP board computes a 16th-order mel-scaled cepstrum and its vector-quantized index for each frame of the speech signal. The Viterbi scoring board then receives the vector-quantized index from the DSP board and updates all the state metrics at every frame interval. As a result, at a clock rate of 10 MHz, the speech recognition system can update 100,000 state metrics within a single 10 ms frame, which corresponds to approximately 3,300 words when the average number of states per word is 30.
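
As an illustration only, the per-frame state metric update and the throughput figures above can be sketched in software as follows. This is a minimal sketch, not the board's actual design: the function names, the dense transition matrix, and the data layout are assumptions made for the example. It assumes a discrete HMM whose observation at each frame is the vector-quantized index, and it assumes an average of one state metric update per clock cycle, which is what the stated figures imply (10 MHz over 10 ms gives 100,000 cycles).

```python
import math

NEG_INF = float("-inf")  # log of probability zero


def viterbi_update(prev_metrics, log_trans, log_emit, vq_index):
    """Update all state metrics of one word model for a single frame.

    prev_metrics[i] : best log score of any path ending in state i at the previous frame
    log_trans[i][j] : log transition probability from state i to state j
    log_emit[j][k]  : log probability of observing VQ index k in state j
    vq_index        : vector-quantized index of the current frame

    In the log domain the products of probabilities become additions,
    so each update needs only additions and a maximum.
    """
    n = len(prev_metrics)
    new_metrics = []
    for j in range(n):
        best = max(prev_metrics[i] + log_trans[i][j] for i in range(n))
        new_metrics.append(best + log_emit[j][vq_index])
    return new_metrics


def state_updates_per_frame(clock_hz, frame_ms, cycles_per_update=1):
    """Number of state metric updates that fit in one frame (integer arithmetic)."""
    cycles_per_frame = clock_hz * frame_ms // 1000
    return cycles_per_frame // cycles_per_update


def words_per_frame(updates, states_per_word):
    """Number of whole word models that can be scored within one frame."""
    return updates // states_per_word
```

With a 10 MHz clock and a 10 ms frame, `state_updates_per_frame(10_000_000, 10)` gives 100,000 updates, and `words_per_frame(100_000, 30)` gives 3,333 words, consistent with the rounded figure of 3,300 quoted above.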