See Improving ASR with Voice Focus and the model reference for available sizes and specs.
Quail Voice Focus
Near-field primary-speaker isolation for Voice AI.
Quail Voice Focus is optimized for near-field voice interactions.
It prioritizes speech that sounds close to the microphone and suppresses speech that sounds distant, along with background noise.
This makes it ideal for single-user, close-talk use cases (e.g., headsets or handheld devices).
Quail Voice Focus 2.1 listens for a moment at the start of each session before applying suppression. During this warm-up, the output may sound closer to the original audio. Once a clear primary speaker is detected (typically within a few seconds), full suppression kicks in. The sooner the primary speaker talks, the shorter the warm-up period. On very short clips where the primary speaker doesn’t get a chance to speak, suppression may not fully activate. This is by design: Voice Focus 2.1 prioritizes accuracy over speed and will not suppress a speaker it hasn’t confidently identified yet.
When used with an enhancement level of 100%, Quail Voice Focus 2.1 can also serve as a pre-processing step for third-party VADs that do not perform well in noisy conditions.
At this level it suppresses background noise and background speech more aggressively, which can improve VAD accuracy but may also harm the ASR’s performance.
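One way to get the VAD benefit without exposing the ASR to the more aggressive suppression is to use the enhanced audio only for the speech/non-speech decision. The Python sketch below illustrates that routing under stated assumptions. It does not use the actual SDK API: `enhance` is a hypothetical callable standing in for Quail Voice Focus 2.1 at 100% enhancement, `energy_vad` is a toy stand-in for a third-party VAD, and forwarding the unprocessed frames to the ASR is a design choice of this sketch, not a documented recommendation.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # one frame of mono PCM samples in the range [-1.0, 1.0]


def energy_vad(frame: Sequence[float], threshold: float = 0.01) -> bool:
    """Toy stand-in for a third-party VAD: mean-square energy above a threshold."""
    if not frame:
        return False
    energy = sum(sample * sample for sample in frame) / len(frame)
    return energy > threshold


def gate_frames_for_asr(
    frames: Sequence[Frame],
    enhance: Callable[[Frame], Frame],
    vad: Callable[[Sequence[float]], bool] = energy_vad,
) -> List[Frame]:
    """Decide speech/non-speech on enhanced audio, but keep the original frames.

    `enhance` is a hypothetical placeholder for the enhancement step
    (assumed here to be Voice Focus at 100%); it is not a real SDK call.
    """
    speech_frames: List[Frame] = []
    for frame in frames:
        enhanced = enhance(frame)        # aggressive suppression, used for VAD only
        if vad(enhanced):                # VAD sees the cleaned-up signal
            speech_frames.append(frame)  # ASR receives the untouched frame
    return speech_frames
```

A real enhancer is stateful and streaming (including the warm-up described above), so a frame-by-frame pure function like this is a simplification. Whether the ASR should receive the original audio, or audio enhanced at a lower level, is best decided by measuring recognition accuracy on representative recordings.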