Multi-Teacher Knowledge Distillation for Lightweight Speech Interaction in Embedded Educational Robots

Abstract

Educational robots have significant potential to improve learning experience and efficiency through natural, real-time voice interaction. However, existing mainstream end-to-end voice interaction models have large parameter counts and high computational costs, making them difficult to deploy efficiently on resource-limited embedded educational robot platforms; their average inference latency of 810 ms seriously degrades real-time interaction. Moreover, traditional compression methods sacrifice understanding accuracy in complex scenarios, and the representational capacity of small-scale models is limited. To this end, this study proposes a method for constructing a lightweight speech interaction system based on knowledge distillation. A deep neural network pre-trained on a large-scale general corpus serves as the teacher model, and a multi-level knowledge transfer mechanism is established: differential masking guides the learning of key features, a relational information extraction module captures global correlations, and a hierarchical loss function balances the distillation weights. The core knowledge of the teacher model is thereby distilled into a lightweight student model tailored to educational scenarios. The final student model contains only 20% of the teacher model's parameters while maintaining high accuracy on a benchmark test set simulating real educational environments: the speech recognition error rate is as low as 15.8% (12.6 percentage points lower than directly training a small model of the same scale), and inference latency is reduced from 810 ms to 500 ms, a 38% reduction that meets the real-time threshold for educational human-computer interaction. Model storage is compressed by over 80% (<350 MB), enabling efficient operation on low-power hardware platforms. The method effectively balances accuracy and efficiency in educational robot voice interaction, improving real-time performance, robustness, and practicality, and provides reliable technical support for wide deployment across educational scenarios.
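The abstract names three transfer mechanisms: differential masking for key-feature learning, a relational module for global correlations, and a hierarchical loss that balances the distillation terms. The sketch below is one plausible PyTorch instantiation of such an objective, assuming matching student/teacher feature dimensions; the loss weights (alpha, beta, gamma), the mask ratio, and all function names are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch of a multi-level distillation objective of the kind the
# abstract describes. All hyperparameters and names here are assumptions.
import torch
import torch.nn.functional as F

def logit_distillation(student_logits, teacher_logits, T=4.0):
    """Soft-label KD: KL divergence between temperature-softened distributions."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

def masked_feature_distillation(student_feat, teacher_feat, keep_ratio=0.5):
    """One reading of 'differential masking': match only the teacher feature
    dimensions with the largest average magnitude, so the student focuses on
    salient features. Shapes: (batch, time, dim); project the student first
    if its hidden size differs from the teacher's."""
    importance = teacher_feat.abs().mean(dim=(0, 1))      # (dim,)
    k = max(1, int(keep_ratio * importance.numel()))
    mask = torch.zeros_like(importance)
    mask[importance.topk(k).indices] = 1.0
    return F.mse_loss(student_feat * mask, teacher_feat * mask)

def relational_distillation(student_feat, teacher_feat):
    """Relation-level KD: match pairwise frame-similarity (Gram) matrices so
    global correlations across the utterance are preserved."""
    def gram(x):                        # (batch, time, dim) -> (batch, time, time)
        x = F.normalize(x, dim=-1)
        return x @ x.transpose(1, 2)
    return F.mse_loss(gram(student_feat), gram(teacher_feat))

def hierarchical_kd_loss(student_out, teacher_out, task_loss,
                         alpha=0.5, beta=0.3, gamma=0.2):
    """Hierarchical loss: the task loss (e.g. CTC/CE for ASR) plus weighted
    logit-, feature-, and relation-level distillation terms."""
    s_logits, s_feat = student_out
    t_logits, t_feat = teacher_out
    return (task_loss
            + alpha * logit_distillation(s_logits, t_logits)
            + beta * masked_feature_distillation(s_feat, t_feat)
            + gamma * relational_distillation(s_feat, t_feat))
```

In a training loop, the teacher runs in eval mode under torch.no_grad(), and only the student's parameters receive gradients from hierarchical_kd_loss.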

Authors

  • Yu Hao

DOI:

https://doi.org/10.31449/inf.v50i5.12129

Published

02/02/2026

How to Cite

Hao, Y. (2026). Multi-Teacher Knowledge Distillation for Lightweight Speech Interaction in Embedded Educational Robots. Informatica, 50(5). https://doi.org/10.31449/inf.v50i5.12129