Quail - ai-coustics Docs

The Quail models are purpose-built for Voice AI Agents and human-to-machine interactions. Unlike standard noise suppression, Quail is tuned to improve the performance of downstream Speech-to-Text (STT) engines. Quail is designed for far-field and multi-speaker environments. It does not suppress distant-sounding speech, making it better suited for speakerphone setups, meeting rooms, or situations with multiple participants spread across a space. For near-field, single-speaker isolation, see Quail Voice Focus.

The Quail models are designed to enhance the performance of Voice AI Agents and STT systems, and may not always produce the most natural-sounding audio for human listeners. Some noise and reverberation may remain in the output; this is expected, as they can help improve STT accuracy by providing additional acoustic context. If your primary goal is to improve the listening experience for humans, we recommend using the Rook models instead.

Take a look at our ASR optimization guide and the model reference for available sizes and specs.

