Analysis Pipeline
How Spoke Right processes recordings and generates pronunciation scores.
Recording
Audio is captured in M4A format using the device microphone. Unlike the streaming approach in Spoke Work and Spoke Class, Spoke Right records the full audio clip and uploads it for batch analysis.
Processing Steps
- Upload — M4A file is uploaded to Supabase Storage
- Attempt creation — A speaking attempt record is created in the database, linked to the exercise and reference text
- Analysis trigger — A Supabase Edge Function sends the audio to Deepgram's pronunciation assessment API
- Result polling — The app checks for results every 2 seconds, up to 90 times (90 × 2 s = 180 s, a 3-minute timeout)
- Score delivery — Results include phoneme-level pronunciation data
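The result polling step can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `pollForResult`, `fetchResult`, and the `PollOutcome` type are hypothetical names, and the only values taken from the description above are the 2-second interval and the 90-check limit.

```typescript
// Hypothetical outcome type; the real app may report results differently.
type PollOutcome<T> =
  | { status: "done"; result: T; polls: number }
  | { status: "timeout"; polls: number };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// fetchResult is an assumed callback that resolves to null while the
// analysis is still pending and to the result once it is ready.
async function pollForResult<T>(
  fetchResult: () => Promise<T | null>,
  intervalMs = 2000, // check every 2 seconds
  maxPolls = 90, // 90 checks x 2 s = 180 s, about 3 minutes
  wait: (ms: number) => Promise<void> = sleep,
): Promise<PollOutcome<T>> {
  for (let poll = 1; poll <= maxPolls; poll++) {
    const result = await fetchResult();
    if (result !== null) {
      return { status: "done", result, polls: poll };
    }
    // Pause before the next check, but not after the final one.
    if (poll < maxPolls) {
      await wait(intervalMs);
    }
  }
  return { status: "timeout", polls: maxPolls };
}
```

The `wait` parameter exists only so the delay can be swapped out (for example in tests); a caller would normally rely on the defaults and handle the `"timeout"` case by showing an error or offering a retry.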
Result Structure
The analysis produces:
| Component | Description |
|---|---|
| Overall score | 0–100 composite rating |
| Accuracy score | How closely the spoken audio matches the correct pronunciation |
| Word scores | Per-word accuracy breakdown |
| Phoneme data | Individual sound analysis |
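One way to picture these components is as a nested record. The sketch below is an assumption for illustration only: the field names (`overallScore`, `accuracyScore`, `words`, `phonemes`) and the helper `weakestWords` are hypothetical, and only the 0–100 range of the overall score comes from the table above.

```typescript
// Hypothetical shapes; the real payload's field names are not documented here.
interface PhonemeScore {
  phoneme: string; // an individual sound, e.g. "th"
  score: number; // scale is an assumption
}

interface WordScore {
  word: string;
  accuracy: number; // per-word accuracy; scale is an assumption
  phonemes: PhonemeScore[];
}

interface AnalysisResult {
  overallScore: number; // 0-100 composite rating
  accuracyScore: number;
  words: WordScore[];
}

// Returns the `count` lowest-scoring words without mutating the input.
function weakestWords(result: AnalysisResult, count: number): WordScore[] {
  return [...result.words]
    .sort((a, b) => a.accuracy - b.accuracy)
    .slice(0, count);
}
```

A helper like `weakestWords` shows how the per-word breakdown could be used to highlight the words that most need practice.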
Issue Detection
When the analysis identifies recurring problems:
- Mispronounced phonemes are logged as pronunciation issues
- Issues are sorted by occurrence count across all attempts
- The Issues tab shows your most frequent problem areas
- Tap an issue to see all related attempts
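The counting and sorting described above could look something like this. It is a sketch under stated assumptions: `AttemptPhonemes`, `PronunciationIssue`, and `rankIssues` are hypothetical names, and the alphabetical tie-break is an illustrative choice rather than documented behavior.

```typescript
// Hypothetical input: the phonemes flagged as mispronounced in one attempt.
interface AttemptPhonemes {
  attemptId: string;
  mispronounced: string[];
}

// Hypothetical output: one issue per phoneme, with the attempts it came from.
interface PronunciationIssue {
  phoneme: string;
  count: number;
  attemptIds: string[];
}

function rankIssues(attempts: AttemptPhonemes[]): PronunciationIssue[] {
  const byPhoneme = new Map<string, PronunciationIssue>();

  for (const attempt of attempts) {
    for (const phoneme of attempt.mispronounced) {
      let issue = byPhoneme.get(phoneme);
      if (!issue) {
        issue = { phoneme, count: 0, attemptIds: [] };
        byPhoneme.set(phoneme, issue);
      }
      issue.count += 1;
      // Record each attempt once, so an issue can list its related attempts.
      if (issue.attemptIds.indexOf(attempt.attemptId) === -1) {
        issue.attemptIds.push(attempt.attemptId);
      }
    }
  }

  // Most frequent first; ties broken alphabetically (an assumption).
  return Array.from(byPhoneme.values()).sort(
    (a, b) => b.count - a.count || a.phoneme.localeCompare(b.phoneme),
  );
}
```

Keeping `attemptIds` on each issue is what would let a tap on an issue open the list of attempts where that sound was mispronounced.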
Text-to-Speech Reference
Spoke Right includes text-to-speech playback (via Expo Speech) so you can hear the correct pronunciation before or after recording.