Intellema

Case Study

Medical ASR Saudi Arabic


Category:

Speech & Healthcare AI

Impact:

6 months | $425k Seed Funding

Background

Healthcare professionals often spend significant time documenting patient visits, which increases administrative burden and reduces focus on patient care. In Arabic-speaking regions, particularly dialect-rich contexts like Saudi Arabia, existing speech recognition systems fail to capture medical terminology and dialectal nuances accurately. The client, Sahl.ai, required a custom automatic speech recognition (ASR) system to convert spoken medical consultations into precise, structured clinical notes.

Project Goals

  • Develop an ASR system tailored for Saudi Arabic dialects in medical contexts
  • Reduce documentation burden for healthcare professionals
  • Build scalable pipelines for data collection, training, and evaluation
  • Apply advanced fine-tuning techniques (LoRA, QLoRA) for performance optimization
  • Generate and integrate synthetic speech data to enhance low-resource datasets
  • Improve model robustness with audio cleaning and self-supervised learning

Our Approach

Model Development & Training

Built end-to-end training and evaluation pipelines for ASR models. Conducted 200+ experiments across datasets to refine accuracy and robustness.
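
An evaluation pipeline for ASR typically reports word error rate (WER). The sketch below is a minimal, standard-library illustration of that metric, not the project's actual evaluation code; the function name and the English sample sentences are assumptions, and real Arabic transcripts would likely need text normalization before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative usage: one dropped word out of five reference words.
wer = word_error_rate("the patient has a fever", "the patient has fever")
```

Tracking a single metric like this across runs is one way to compare a large number of experiments on equal footing.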

Fine-Tuning Techniques

Applied LoRA and QLoRA fine-tuning, optimizing large ASR models for medical speech.
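
The core idea of LoRA is to freeze a pretrained weight matrix W and train only a low-rank update, so the layer computes y = Wx + (alpha / r) * B(Ax). The sketch below shows that arithmetic with plain Python lists; it is an illustration only, all names and dimensions are assumptions, and a real project would use PyTorch tensors. QLoRA additionally keeps the frozen base weights in quantized form, which is not shown here.

```python
from typing import List

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(x: List[float], W: Matrix, A: Matrix, B: Matrix,
                 alpha: float) -> List[float]:
    """Frozen base output plus the scaled low-rank update.

    W is d x k (frozen), A is r x k and B is d x r (both trainable).
    """
    r = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def trainable_params(d: int, k: int, r: int) -> int:
    """Parameters in A and B, versus d * k for full fine-tuning."""
    return r * (d + k)


# Toy example with d=2, k=3, r=1 (hypothetical values).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[1.0, 1.0, 1.0]]
x = [1.0, 2.0, 3.0]

# B starts at zero, so the adapted layer initially matches the base layer.
y_init = lora_forward(x, W, A, [[0.0], [0.0]], alpha=2.0)
# After B moves away from zero, the output shifts by the low-rank update.
y_tuned = lora_forward(x, W, A, [[0.5], [0.0]], alpha=2.0)
```

For a 1024 x 1024 layer with rank 8, `trainable_params(1024, 1024, 8)` gives 16,384 trainable values against 1,048,576 in the full matrix, which is why this approach suits large ASR models.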

Data Augmentation & Expansion

Generated synthetic speech data to expand training resources. Developed self-supervised ASR methods, doubling productivity in data annotation and collection.
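
The case study does not describe how the model was used to speed up annotation. One common pattern, shown here purely as an assumption, is model-assisted pre-labeling: draft transcripts with high model confidence go to a quick review queue, and the rest go to full manual annotation. The class, field names, and the 0.85 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DraftTranscript:
    clip_id: str
    text: str
    confidence: float  # hypothetical average model confidence in [0, 1]


def route_for_annotation(
    drafts: List[DraftTranscript], threshold: float = 0.85
) -> Tuple[List[DraftTranscript], List[DraftTranscript]]:
    """Split drafts into (quick_review, full_annotation) by confidence."""
    quick_review: List[DraftTranscript] = []
    full_annotation: List[DraftTranscript] = []
    for draft in drafts:
        if draft.confidence >= threshold:
            quick_review.append(draft)
        else:
            full_annotation.append(draft)
    return quick_review, full_annotation


# Illustrative usage with made-up clips.
drafts = [
    DraftTranscript("clip-001", "patient reports chest pain", 0.93),
    DraftTranscript("clip-002", "prescribed amoxicillin", 0.61),
]
quick, full = route_for_annotation(drafts)
```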

Data Collection & Annotation

Supervised teams for speech dataset collection and medical annotation. Designed audio cleaning pipelines to improve transcription quality.
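
The specific cleaning steps are not listed in the case study. As a rough illustration, two steps that often appear in such pipelines are trimming leading and trailing silence and normalizing peak amplitude. The sketch below applies both to a list of float samples in [-1, 1]; the function names and thresholds are assumptions, and a production pipeline would operate on audio arrays loaded with an audio library.

```python
from typing import List


def trim_silence(samples: List[float], threshold: float = 0.01) -> List[float]:
    """Drop samples before the first and after the last one above threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]: loud[-1] + 1]


def peak_normalize(samples: List[float], target_peak: float = 0.9) -> List[float]:
    """Scale the signal so its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def clean(samples: List[float]) -> List[float]:
    """Trim silence first, then normalize what remains."""
    return peak_normalize(trim_silence(samples))


# Illustrative usage on a short made-up clip.
cleaned = clean([0.0, 0.001, 0.2, -0.45, 0.1, 0.0])
```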

Deployment & MVP Delivery

Delivered a functional MVP, enabling the startup to validate the product with investors.

Key Results

  • MVP delivered, securing $425k seed funding for Sahl.ai
  • Dialect-specific ASR system tailored for Saudi Arabic medical speech
  • 200+ experiments run to optimize accuracy and robustness
  • Significant dataset expansion through synthetic data generation
  • Productivity in annotation doubled via self-supervised techniques
  • Improved input quality with advanced audio cleaning pipelines

Technologies Used

ASR Models (wav2vec 2.0, Whisper, HuBERT)
LoRA & QLoRA
Python & PyTorch
Self-Supervised Learning
Synthetic Data Generation
Audio Processing Libraries
