Summary: 2006 PhD research focusing on a technique to modify input neutral audio and synthesize visual speech incorporating effects of emotion and fluency.

Interdisciplinary Collaborations; Motion Capture

ACCAD Students: Arunachalam Somasundaram

OSU Faculty/Staff: Rich Parent, CSE; Vita Berezina-Blackburn, ACCAD

Summary: Expressive facial speech animation is a challenging topic of great interest to the computer graphics community, and adding emotion to audio-visual speech is essential for realistic facial animation. The complexity of neutral visual speech synthesis is mainly attributed to co-articulation, the phenomenon in which the facial pose for the current segment of speech is affected by the neighboring segments. Including emotion and fluency effects adds to that complexity because they modify both the shape and the timing of speech. Speech is also often accompanied by supportive visual prosodic elements, such as motion of the head, eyes, and eyebrows, which improve its intelligibility and therefore need to be synthesized as well.

In this dissertation, we present a technique to modify input neutral audio and synthesize visual speech incorporating effects of emotion and fluency. Visemes, the visual counterparts of phonemes, are used to animate speech. We record 3-D facial motion with motion capture and extract the facial muscle positions of expressive visemes, which capture the pose of the entire face. The expressive visemes are blended using a novel constraint-based co-articulation technique that can easily accommodate the effects of emotion. We also present a visual prosody model for emotional speech, based on motion capture data, that exhibits non-verbal behaviors such as eyebrow motion and overall head motion.

Completed in 2006.
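
As a rough illustration of co-articulation, where the facial pose at any moment is influenced by neighboring visemes, the sketch below blends viseme poses with a generic dominance-weighted average. This is not the dissertation's constraint-based technique, whose details are not given here; it is a simpler stand-in chosen only to show the idea. The names `dominance` and `blend_pose`, the `rate` parameter, and the two-value pose vectors are all hypothetical.

```python
import math


def dominance(t, center, strength=1.0, rate=8.0):
    """Influence at time t of a viseme centered at `center`.

    Assumption: an exponential falloff with hypothetical `strength`
    and `rate` parameters; not taken from the dissertation.
    """
    return strength * math.exp(-rate * abs(t - center))


def blend_pose(t, visemes):
    """Return the dominance-weighted average pose at time t.

    visemes: list of (center_time, pose) pairs, where pose is a list of
    floats standing in for facial parameters (hypothetical layout).
    """
    weights = [dominance(t, center) for center, _ in visemes]
    total = sum(weights)
    size = len(visemes[0][1])
    return [
        sum(w * pose[i] for w, (_, pose) in zip(weights, visemes)) / total
        for i in range(size)
    ]


# Two visemes 0.2 s apart with illustrative two-parameter poses.
visemes = [(0.0, [0.0, 1.0]), (0.2, [1.0, 0.0])]
midpoint_pose = blend_pose(0.1, visemes)
```

Halfway between the two visemes both contribute equally, so the blended pose sits between them rather than snapping from one target to the next. An emotion effect could be modeled in such a scheme by changing the target poses or the timing of the centers, but that is an assumption about this sketch, not a description of the author's method.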