The Rook models are optimized for human-to-human interaction in systems with real-time constraints (e.g., voice calls). They reduce background noise and reverberation while preserving speech naturalness and intelligibility for human listeners. In contrast to the Quail models, Rook suppresses any sound that it does not recognize as speech. This also makes Rook suitable as a pre-processing step for third-party VADs that do not perform well in noisy conditions. Note, however, that Rook preserves both foreground and background speech.
See the model reference for available Rook sizes and sample rates.
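As a rough illustration of the pre-processing arrangement described above, the sketch below places an enhancement step in front of a voice activity detector. It does not use the actual Rook API: `enhance` is a hypothetical callable standing in for whatever function runs a Rook model on one audio frame, and `energy_vad` is a deliberately naive stand-in for a third-party VAD. All names and the threshold value are assumptions made for this example.

```python
import math
from typing import Callable, List, Sequence

Frame = Sequence[float]  # one frame of mono PCM samples in [-1.0, 1.0]


def frame_rms(frame: Frame) -> float:
    """Root-mean-square level of a frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def energy_vad(frame: Frame, threshold: float = 0.01) -> bool:
    """Naive stand-in for a third-party VAD: flags any frame above a level.

    A detector like this cannot tell speech from loud noise, which is why
    suppressing non-speech sound beforehand can help.
    """
    return frame_rms(frame) >= threshold


def detect_speech(
    frames: Sequence[Frame],
    enhance: Callable[[Frame], Frame],
    vad: Callable[[Frame], bool] = energy_vad,
) -> List[bool]:
    """Run the (hypothetical) enhancement step, then the VAD, frame by frame."""
    return [vad(enhance(frame)) for frame in frames]
```

Called with an identity function as `enhance`, the pipeline behaves like the bare VAD, which gives a simple baseline for comparing detections with and without enhancement. Because Rook preserves background speech as well as foreground speech, a VAD fed this way can still be expected to trigger on background talkers.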