[1]
Su, T. and Hu, C. 2025. ESVA: Enhancing Multimodal Emotion Recognition via Multi-Scale Audio Feature Extraction and Cross-Modal Temporal Alignment. Informatica. 49, 31 (Dec. 2025). DOI:https://doi.org/10.31449/inf.v46i31.12043.