Attention-Guided Multimodal Signal Fusion with Transformer-Based Deep Transfer Learning for Real-Time Emotional Crisis Prediction in Students

Abstract

Early warning of students' emotional crises currently relies mostly on single data sources and traditional models, making high-precision, real-time monitoring with cross-individual generalization difficult. To address this, we propose a pipeline integrating multimodal physiological signals with deep transfer learning: 1) signal preprocessing (adaptive denoising via an attention mechanism plus z-score normalization) to improve the quality of ECG, EEG, GSR, and EMG signals; 2) attention-guided cross-modal feature fusion using a physiological-behavior mapping matrix to unify feature spaces; and 3) Transformer-based deep transfer learning to adapt the model to small-sample, cross-subject data. Evaluation metrics include accuracy, true positive rate (TPR), false positive rate (FPR), and single-sample processing latency; baselines are a traditional CNN-LSTM and a standard BERT-BASE model. System tests show feature-extraction accuracies of 78.2% for the ECG (heart rate) signal and 34.5% for the GSR (skin conductance) signal. After deep transfer learning optimization, emotional crisis early-warning accuracy on the small-sample cross-subject dataset (n=80) increased from 45.3% (baseline) to 88.1%, the FPR dropped to 12.75%, the TPR reached 87.7%, and the false negative rate (FNR) was 12.3%. Single-sample processing latency was 23.45 ms.
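To make steps 1 and 2 of the pipeline concrete, the snippet below is a minimal NumPy sketch of the two ideas the abstract names: per-channel z-score normalization and softmax-attention-weighted cross-modal fusion. Everything in it is an assumption for illustration (the function names, the norm-based relevance score standing in for a learned attention, and the toy feature extractor); the paper's attention mechanism and physiological-behavior mapping matrix are learned components and are not reproduced here.

```python
import numpy as np

def zscore(signal, eps=1e-8):
    """Per-channel z-score normalization, as in the preprocessing step."""
    mean = signal.mean(axis=-1, keepdims=True)
    std = signal.std(axis=-1, keepdims=True)
    return (signal - mean) / (std + eps)

def attention_fuse(features):
    """Attention-guided fusion of per-modality feature vectors.

    features: dict mapping modality name -> 1-D feature vector (same length).
    Returns the fused vector and the per-modality attention weights. The
    relevance score here is a simple norm-based proxy (hypothetical); in the
    paper it would be produced by a trained attention module.
    """
    names = list(features)
    F = np.stack([features[n] for n in names])   # (modalities, feature_dim)
    scores = np.linalg.norm(F, axis=1)           # relevance proxy per modality
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax attention weights
    return weights @ F, dict(zip(names, weights))

# Toy usage with the four modalities named in the abstract.
rng = np.random.default_rng(0)
raw = {m: rng.normal(size=(1, 256)) for m in ("ecg", "eeg", "gsr", "emg")}
# Stand-in feature extractor: normalize, then truncate to a 64-d vector.
feats = {m: zscore(x).mean(axis=0)[:64] for m, x in raw.items()}
fused, w = attention_fuse(feats)
print(fused.shape, w)
```

In the paper's design the attention weights and the mapping matrix are trained jointly with the Transformer backbone; the fixed norm-based score above only stands in to make the fusion arithmetic explicit.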

Authors

  • Wanhao Gao, Student Affairs Department, Tianjin Chengjian University, Tianjin 300384, China
  • Yaxin Wang, School of Urban Arts, Tianjin Chengjian University, Tianjin 300384, China

DOI:

https://doi.org/10.31449/inf.v50i11.12049

Published

04/23/2026

How to Cite

Gao, W., & Wang, Y. (2026). Attention-Guided Multimodal Signal Fusion with Transformer-Based Deep Transfer Learning for Real-Time Emotional Crisis Prediction in Students. Informatica, 50(11). https://doi.org/10.31449/inf.v50i11.12049