“ESVA: Enhancing Multimodal Emotion Recognition via Multi-Scale Audio Feature Extraction and Cross-Modal Temporal Alignment”. Informatica 49, no. 31 (December 23, 2025). Accessed June 21, 2026. https://www.informatica.si/index.php/informatica/article/view/12043.