Spoke Right

Analysis Pipeline

How Spoke Right processes recordings and generates pronunciation scores.

Recording

Audio is captured in M4A format using the device microphone. Unlike the streaming approach in Spoke Work and Spoke Class, Spoke Right records the full audio clip and uploads it for batch analysis.

Processing Steps

  1. Upload — M4A file is uploaded to Supabase Storage
  2. Attempt creation — A speaking attempt record is created in the database, linked to the exercise and reference text
  3. Analysis trigger — A Supabase Edge Function sends the audio to Deepgram's pronunciation assessment API
  4. Result polling — The app polls for results every 2 seconds, up to 90 polls (a 3-minute timeout)
  5. Score delivery — Results include phoneme-level pronunciation data
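
The polling step can be sketched as a small loop. This is a minimal illustration, not the app's actual code: the `Attempt` shape, its `status` values, and the injected `fetchAttempt` function are assumptions made for the example. Only the 2-second interval and the 90-poll limit come from the steps above.

```typescript
// Hypothetical attempt record; the real database schema is not documented here.
type AttemptStatus = "pending" | "completed" | "failed";

interface Attempt {
  id: string;
  status: AttemptStatus;
}

const POLL_INTERVAL_MS = 2000; // poll every 2 seconds
const MAX_POLLS = 90; // 90 polls x 2 s = 180 s, i.e. the 3-minute timeout

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// fetchAttempt is injected so the loop does not assume any particular client.
async function pollForResult(
  attemptId: string,
  fetchAttempt: (id: string) => Promise<Attempt>,
  intervalMs: number = POLL_INTERVAL_MS,
  maxPolls: number = MAX_POLLS,
): Promise<Attempt> {
  for (let poll = 1; poll <= maxPolls; poll++) {
    const attempt = await fetchAttempt(attemptId);
    if (attempt.status !== "pending") {
      return attempt; // completed or failed: stop polling
    }
    if (poll < maxPolls) {
      await sleep(intervalMs); // wait before the next poll
    }
  }
  throw new Error(`Analysis timed out after ${maxPolls} polls`);
}
```

Passing the fetch function in as a parameter keeps the loop easy to exercise with a fake data source, and the interval and limit can be overridden without touching the defaults.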

Result Structure

The analysis produces:

Component        Description
Overall score    0–100 composite rating
Accuracy score   How close to correct pronunciation
Word scores      Per-word accuracy breakdown
Phoneme data     Individual sound analysis
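
For readers who work with these results in code, the four components might map onto a type like the one below. All field names are hypothetical, and the 0–100 scale for word and phoneme scores is an assumption; this page only states that scale for the overall score.

```typescript
// Hypothetical field names; the actual response format is not documented here.
interface PhonemeScore {
  phoneme: string; // the individual sound, e.g. "th"
  score: number; // assumed 0-100
}

interface WordScore {
  word: string;
  accuracy: number; // assumed 0-100
  phonemes: PhonemeScore[];
}

interface AnalysisResult {
  overallScore: number; // 0-100 composite rating
  accuracyScore: number; // how close to correct pronunciation
  words: WordScore[]; // per-word accuracy breakdown
}

// Returns the words whose accuracy falls below the given threshold.
function wordsBelow(result: AnalysisResult, threshold: number): string[] {
  return result.words
    .filter((w) => w.accuracy < threshold)
    .map((w) => w.word);
}
```

A helper like `wordsBelow` shows how the per-word breakdown could be used to pick out the words worth practising again.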

Issue Detection

When the analysis identifies recurring problems:

  • Mispronounced phonemes are logged as pronunciation issues
  • Issues are sorted by occurrence count across all attempts
  • The Issues tab shows your most frequent problem areas
  • Tap an issue to see all related attempts
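
The ranking described above can be sketched as a count-and-sort over attempts. This is an illustrative sketch under assumptions: the `AttemptPhonemes` and `PronunciationIssue` shapes are hypothetical, and it counts every mispronounced occurrence, which is one plausible reading of "occurrence count".

```typescript
// Hypothetical input: the mispronounced phonemes found in one attempt.
interface AttemptPhonemes {
  attemptId: string;
  mispronounced: string[];
}

// Hypothetical issue record: one phoneme with its total count and related attempts.
interface PronunciationIssue {
  phoneme: string;
  count: number;
  attemptIds: string[];
}

function rankIssues(attempts: AttemptPhonemes[]): PronunciationIssue[] {
  const byPhoneme = new Map<string, PronunciationIssue>();
  for (const attempt of attempts) {
    for (const phoneme of attempt.mispronounced) {
      const issue: PronunciationIssue =
        byPhoneme.get(phoneme) ?? { phoneme, count: 0, attemptIds: [] };
      issue.count += 1;
      // Record each attempt once so an issue can list its related attempts.
      if (!issue.attemptIds.includes(attempt.attemptId)) {
        issue.attemptIds.push(attempt.attemptId);
      }
      byPhoneme.set(phoneme, issue);
    }
  }
  // Most frequent issues first.
  return Array.from(byPhoneme.values()).sort((a, b) => b.count - a.count);
}
```

Keeping `attemptIds` on each issue is what would let a tap on an issue show all related attempts.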

Text-to-Speech Reference

Spoke Right includes text-to-speech playback (via Expo Speech) so you can hear the correct pronunciation before or after recording.
