Fine-Tuning OpenAI Whisper-Small for Domain-Specific Medical Speech Recognition within a Microservice Architecture

Abstract

We fine-tune Whisper-small (244M parameters) on 8.5 hours of in-domain medical audio and evaluate with word error rate (WER). Compared to an unadapted Whisper-small baseline, our fine-tuned model reduces WER from ∼63% to ∼32%. While the relative gain is substantial, this accuracy is not suitable for unsupervised clinical use; we position the system as a clinician-in-the-loop assistant. We also describe deployment as an on-premise microservice and report latency/throughput considerations.
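The WER figures above are the standard word-level edit distance normalized by reference length. A minimal, self-contained sketch of the computation (the example transcript strings below are hypothetical, not drawn from the paper's data):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table: d[i][j] = edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical clinical-style example: one substitution, one deletion
# over a 6-word reference gives WER = 2/6 ≈ 0.33.
print(wer("the patient shows signs of pneumonia",
          "the patient show signs pneumonia"))
```

In practice, libraries such as jiwer implement the same metric with additional text normalization (casing, punctuation), which matters when comparing against published baselines.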

Authors

  • Alaeddine Moussa Université de la Manouba, École Nationale des Sciences de l’Informatique, Tunisia
  • Noursene Drine Université de Tunis El Manar, École Nationale des Sciences de l’Informatique, Tunisia

DOI:

https://doi.org/10.31449/inf.v50i6.12075

Published

02/21/2026

How to Cite

Moussa, A., & Drine, N. (2026). Fine-Tuning OpenAI Whisper-Small for Domain-Specific Medical Speech Recognition within a Microservice Architecture. Informatica, 50(6). https://doi.org/10.31449/inf.v50i6.12075