(1) Li, X. Cross-Modal Transformer With Dynamic Attention Fusion for Emotion Recognition in Music via Audio-Lyrics Alignment. IJCAI 2025, 49.