Punchline-Driven Hierarchical Facial Animation via Multimodal Large Language Models
Abstract
Speech-driven 3D facial animation has achieved high phonetic realism, but current models often fail to convey the expressive peaks, such as punchlines, that are critical for engaging communication. This paper introduces a framework that addresses this gap by leveraging a Multimodal Large Language Model (MLLM) for a deep semantic understanding of speech. Our core innovation is a system that explicitly models and animates the climax of an utterance. The framework first employs a multimodal punchline detection module to identify moments of high expressive intent from both acoustic and textual cues. This signal guides our Punchline-Driven Hierarchical Animator (PDHA), which functionally decomposes the face into distinct regions and generates motion in a coordinated cascade, allowing the punchline to dynamically amplify expression in the upper face while preserving articulatory precision in the mouth. A final cross-modal fusion decoder refines the output for precise temporal alignment. Comprehensive experiments on the VOCASET dataset show that our model not only achieves state-of-the-art geometric fidelity, reducing Vertex Error by 7.8% compared to the FaceFormer baseline, but is also rated as significantly more expressive and natural in user studies (p < 0.01), confirming its ability to capture the emotional impact of a punchline.
DOI: https://doi.org/10.31449/inf.v49i25.11394
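The region-wise behaviour described in the abstract (stronger upper-face expression at a punchline, unchanged mouth articulation) can be illustrated with a small sketch. This is not the paper's PDHA: the function name `amplify_upper_face`, the linear gain rule, the `gain` parameter, and the boolean `upper_mask` are all assumptions made here for illustration, and the per-frame punchline score is taken as a given input in [0, 1] rather than produced by a detection module.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def amplify_upper_face(
    displacements: Sequence[Sequence[Vec3]],
    upper_mask: Sequence[bool],
    punchline_score: Sequence[float],
    gain: float = 0.5,
) -> List[List[Vec3]]:
    """Scale upper-face vertex displacements by a punchline-dependent factor.

    Hypothetical illustration only (not the published method).
    displacements[t][v] is the offset of vertex v from the neutral face at frame t.
    upper_mask[v] is True for vertices assumed to belong to the upper face.
    punchline_score[t] is an assumed per-frame score, clamped to [0, 1].
    """
    if len(displacements) != len(punchline_score):
        raise ValueError("one punchline score is required per frame")

    result: List[List[Vec3]] = []
    for frame, score in zip(displacements, punchline_score):
        if len(frame) != len(upper_mask):
            raise ValueError("mask length must match the vertex count")
        score = min(1.0, max(0.0, score))  # clamp to [0, 1]
        factor = 1.0 + gain * score        # assumed linear amplification rule
        new_frame: List[Vec3] = []
        for (dx, dy, dz), is_upper in zip(frame, upper_mask):
            if is_upper:
                # Upper-face vertices are amplified at expressive peaks.
                new_frame.append((dx * factor, dy * factor, dz * factor))
            else:
                # Mouth and other vertices are left as-is to keep articulation intact.
                new_frame.append((dx, dy, dz))
        result.append(new_frame)
    return result
```

With a score of 0 the output equals the input, so frames outside a punchline are unaffected; a learned model would presumably replace the fixed linear rule with something conditioned on audio and text.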
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







