This thesis describes a novel method for creating visual speech animation with emotional expressions. To obtain convincing facial animation, we adopt an example-based approach using two types of examples: viseme models for lip synchronization and expression models for facial expressions. The viseme models represent the lip shapes associated with phonemes such as vowels and consonants. We use an existing text-to-speech (TTS) system to obtain a phonetic transcript, select the sequence of viseme models corresponding to that transcript, and interpolate between them to produce the facial animation. The expression models represent key facial expressions, each corresponding to an emotion such as happiness, sadness, surprise, fear, or anger. Given emotional parameters, we blend these expression models to express time-varying emotions, and we propose a user interface for producing continuously changing emotional parameter vectors. While speaking, people communicate emotions along with audible words, so realistic visual speech animation requires both lip synchronization and facial expressions. We therefore present an importance-based approach that combines the two without conflict.
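As a rough illustration of the three steps summarized above (viseme interpolation driven by a phonetic transcript, expression blending driven by emotional parameters, and importance-based combination), the sketch below treats each viseme and expression model as a vector of blend-shape weights for a shared face rig. All model data, channel layouts, and function names here are hypothetical assumptions for illustration; the thesis does not prescribe this particular representation.

```python
import numpy as np

# Hypothetical example models: each is a vector of blend-shape weights
# for the same face rig (four illustrative channels).
VISEMES = {
    "AA": np.array([0.9, 0.1, 0.0, 0.0]),
    "M":  np.array([0.0, 0.0, 0.8, 0.2]),
    "S":  np.array([0.1, 0.6, 0.0, 0.3]),
}
EXPRESSIONS = {
    "happiness": np.array([0.2, 0.0, 0.0, 0.9]),
    "sadness":   np.array([0.0, 0.7, 0.1, 0.0]),
    "surprise":  np.array([0.6, 0.0, 0.2, 0.3]),
    "fear":      np.array([0.4, 0.3, 0.3, 0.0]),
    "anger":     np.array([0.1, 0.1, 0.8, 0.0]),
}

def interpolate_visemes(phones, times, t):
    """Linearly interpolate between consecutive viseme models.

    phones: phoneme labels taken from the TTS phonetic transcript.
    times:  time (seconds) at which each corresponding viseme peaks.
    t:      current animation time.
    """
    if t <= times[0]:
        return VISEMES[phones[0]]
    for (p0, t0), (p1, t1) in zip(zip(phones, times),
                                  zip(phones[1:], times[1:])):
        if t0 <= t <= t1:
            a = (t - t0) / (t1 - t0)  # interpolation parameter in [0, 1]
            return (1 - a) * VISEMES[p0] + a * VISEMES[p1]
    return VISEMES[phones[-1]]

def blend_expressions(weights):
    """Blend expression models by a time-varying emotional parameter vector.

    weights: dict mapping emotion name -> nonnegative blending weight.
    """
    face = np.zeros(4)
    for emotion, w in weights.items():
        face += w * EXPRESSIONS[emotion]
    return face

def combine(viseme_pose, expression_pose, importance):
    """Importance-based combination: a per-channel importance in [0, 1]
    decides whether speech or emotion dominates that channel."""
    importance = np.asarray(importance)
    return importance * viseme_pose + (1 - importance) * expression_pose

# Usage: a happy utterance at t = 0.2 s; mouth channels favor speech.
pose = combine(
    interpolate_visemes(["M", "AA", "S"], [0.0, 0.15, 0.3], t=0.2),
    blend_expressions({"happiness": 0.7, "surprise": 0.2}),
    importance=[1.0, 1.0, 0.9, 0.2],
)
```

In this sketch the importance vector simply gates each blend-shape channel linearly; the importance-based combination in the thesis resolves conflicts between lip synchronization and facial expression, and could equally operate on facial regions rather than individual channels.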