BFS-CNN-ECA-GMP-GRU-MSP: An Enhanced Cross-Perspective Gait Recognition Model with Efficient Channel Attention and Cosine-Consistent Metric Learning

Lijuan Gao; Jieran Liu

doi:10.31449/inf.v50i13.14181

Abstract

Cross-view gait recognition remains vulnerable to viewpoint shifts and appearance changes, especially under carrying and clothing covariates. We propose BFS-CNN-ECA-GMP-GRU-MSP, an enhanced version of our previous BFS-CNN-GMP-GRU-MSP framework, by introducing two upgrades: multi-stage lightweight channel recalibration with Efficient Channel Attention (ECA) and cosine-consistent metric learning through cosine batch-hard triplet loss, a cosine classifier, and L2-normalized embeddings. Experiments are first conducted on CASIA-B under the same legacy closed-set protocol used in our earlier Informatica study (gallery: NM#01–04; probe: NM#05–06, BG#01–02, CL#01–02; same-view matches excluded). This protocol is retained to isolate architecture-level improvements and is interpreted as a within-protocol comparison rather than a subject-disjoint generalization benchmark. Under this setting, the proposed model reaches mean Rank-1 accuracies of 99.96% (NM), 99.74% (BG), and 98.38% (CL), improving the baseline by +2.96, +5.74, and +7.38 percentage points, respectively. To probe unseen-subject behavior more directly, we further report a supplementary subject-disjoint split (subjects 001–074 for training and 075–124 for testing), where the full model attains 97.73% (NM), 87.98% (BG), and 64.00% (CL). Under this stricter split, the clearest effect of ECA appears under clothing variation, where the full model exceeds w/o ECA by 1.82 percentage points on CL, while the ECA branch still introduces only 13 learnable parameters (k=3/5/5 for 64/128/256 channels). These results support the proposed modifications as a lightweight and effective enhancement for protocol-matched cross-view gait recognition, while broader multi-split subject-disjoint and open-set validation remains future work.
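
The 13-parameter figure quoted for the ECA branch can be checked with a short calculation. The sketch below assumes that the kernel sizes follow the adaptive rule of ECA-Net [13] with its default settings (gamma = 2, b = 1) and that each stage uses a single bias-free 1D convolution over the channel descriptor; the abstract states only the resulting k values and the total, so the function name and both assumptions are illustrative rather than taken from the paper's implementation.

```python
import math


def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """Adaptive kernel size k = |(log2(C) + b) / gamma|, rounded up to the next odd integer.

    gamma and b are the ECA-Net defaults; whether this model uses them is an assumption.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


# Channel widths of the three stages named in the abstract.
stage_channels = [64, 128, 256]
kernel_sizes = [eca_kernel_size(c) for c in stage_channels]

# Assumption: one bias-free 1D convolution per stage, so each stage adds exactly k weights.
eca_params = sum(kernel_sizes)

print(kernel_sizes, eca_params)
```

Under these assumptions the rule reproduces k=3/5/5 for 64/128/256 channels, and the three kernels together account for the 13 learnable parameters reported above.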

References

[1] Liu J, Wang W. A Cross-Perspective Gait Recognition Framework Integrating Breadth-First Search and Multi-Scale Feature Map Interaction. Informatica, 2022.

[2] Sepas-Moghaddam A, Etemad A. Deep Gait Recognition: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 264-284.

[3] Fan C, Liang J, Shen C, et al. OpenGait: Revisiting Gait Recognition Towards Better Practicality. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[4] Munusamy V, Senthilkumar S. Emerging Trends in Gait Recognition Based on Deep Learning: A Survey. PeerJ Computer Science, 2024, 10: e2158.

[5] Parashar A, Parashar A, Ding W. Deep Learning Pipelines for Recognition of Gait Biometrics with Covariates: A Comprehensive Review. Artificial Intelligence Review, 2023, 56(18): 8889-8953.

[6] Khaliluzzaman M, Uddin A, Deb K, et al. Person Recognition Based on Deep Gait: A Survey. Sensors, 2023, 23(10): 4875.

[7] Yan S, Hu L, Xueling F. GaitASMS: Gait Recognition by Adaptive Structured Spatial Representation and Multi-Scale Temporal Aggregation. Neural Computing & Applications, 2024, 36(13): 7057-7069.

[8] Chao H, He Y, Zhang J, et al. GaitSet: Regarding Gait as a Set for Cross-View Gait Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(01): 8126-8133.

[9] Fan C, Peng Y, Cao C, et al. GaitPart: Temporal Part-Based Model for Gait Recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 14225-14233.

[10] Lin B, Zhang S, Yu X. Gait Recognition via Effective Global-Local Feature Representation and Local Temporal Aggregation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 14648-14656.

[11] Wang M, Guo X, Lin B, et al. DyGait: Exploiting Dynamic Representations for High-performance Gait Recognition. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023: 13424-13433.

[12] Wang Z, Hou S, Zhang M, et al. QAGait: Revisit Gait Recognition from a Quality Perspective. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(6): 5785-5793.

[13] Wang Q, Wu B, Zhu P, et al. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 11534-11542.

[14] Hu J, Shen L, Sun G. Squeeze-and-Excitation Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[15] Yu S, Tan D, Tan T. A Framework for Evaluating the Effect of View Angle, Clothing and Carrying Condition on Gait Recognition. Proceedings of the 18th International Conference on Pattern Recognition (ICPR), 2006: 441-444.

[16] Fan C, Ma J, Jin D, et al. SkeletonGait: Gait Recognition Using Skeleton Maps. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(2): 1662-1669.

[17] Guo M, Xu T, Liu J, et al. Attention Mechanisms in Computer Vision: A Survey. Computational Visual Media, 2022, 8(3): 331-368.

[18] Hermans A, Beyer L, Leibe B. In Defense of the Triplet Loss for Person Re-Identification. arXiv preprint arXiv:1703.07737, 2017.

[19] Wojke N, Bewley A. Deep Cosine Metric Learning for Person Re-identification. IEEE Winter Conference on Applications of Computer Vision (WACV), 2018: 748-756.

[20] Luo H, Gu Y, Liao X, et al. Bag of Tricks and a Strong Baseline for Deep Person Re-Identification. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019.

[21] Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the Inception Architecture for Computer Vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 2818-2826.

[22] Yu F, Koltun V. Multi-Scale Context Aggregation by Dilated Convolutions. International Conference on Learning Representations (ICLR), 2016.

[23] Chung J, Gulcehre C, Cho K, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. NIPS Workshop on Deep Learning, 2014.

[24] Van der Maaten L, Hinton G. Visualizing Data Using t-SNE. Journal of Machine Learning Research, 2008, 9(86): 2579-2605.
