Core Behavior
The SDK accepts multi-channel input, but speech enhancement is applied on a mono mixdown internally.
- Input channels are mixed down to mono for enhancement.
- The mono signal is processed by the model.
- The enhanced mono signal is mixed back into each output channel with a mix ratio dependent on the configured enhancement level.
num_channels = 1.
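The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SDK's implementation: the text does not specify the mixdown weighting or how the mix ratio is derived from the enhancement level, so the sketch assumes an equal-weight average and a linear blend controlled by a hypothetical `mix_ratio` in the range 0 to 1. `enhance_mono` is a hypothetical stand-in for the model.

```python
def mix_down_to_mono(channels):
    """Average equal-length channel buffers into one mono buffer.

    Equal weighting is an assumption; the SDK's actual mixdown is not documented here.
    """
    num_channels = len(channels)
    return [sum(frame) / num_channels for frame in zip(*channels)]


def enhance_multichannel(channels, enhance_mono, mix_ratio):
    """Mix down, enhance the mono signal, then blend it back into every channel.

    enhance_mono: hypothetical callable standing in for the model.
    mix_ratio:    hypothetical blend factor in [0, 1]; 0 keeps the input unchanged,
                  1 replaces every channel with the enhanced mono signal.
    """
    mono = mix_down_to_mono(channels)
    enhanced = enhance_mono(mono)
    return [
        [(1.0 - mix_ratio) * x + mix_ratio * e for x, e in zip(channel, enhanced)]
        for channel in channels
    ]


if __name__ == "__main__":
    stereo = [[1.0, 0.0], [0.0, 1.0]]
    # A toy "model" that doubles the mono signal.
    print(enhance_multichannel(stereo, lambda m: [2.0 * v for v in m], 0.5))
```

Note that in this sketch every output channel receives the same enhanced signal, which is why stereo channels cannot be treated as independently enhanced.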
Configuration Rules
Use one processor configuration per stream and keep it stable while processing.
- Set `num_channels` to the stream channel count during initialization.
- Each process call must use the same channel count.
- For fixed-frame mode (`allow_variable_frames = false`), each call must also use the initialized `num_frames`.
- For variable-frame mode (`allow_variable_frames = true`), the frame count per call must be less than or equal to the initialized `num_frames`.
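The rules above amount to a small pre-flight check that an application can run before each process call. The sketch below is illustrative only: `StreamConfig` and `check_call` are hypothetical names, not part of the SDK, although the field names mirror the `num_channels`, `num_frames`, and `allow_variable_frames` settings described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamConfig:
    """Hypothetical mirror of the values chosen at initialization."""
    num_channels: int
    num_frames: int
    allow_variable_frames: bool = False


def check_call(config, call_channels, call_frames):
    """Return None if the call matches the configuration, else a reason string."""
    if call_channels != config.num_channels:
        return "channel count differs from initialization"
    if config.allow_variable_frames:
        # Variable-frame mode: any frame count up to the initialized maximum.
        if call_frames > config.num_frames:
            return "frame count exceeds initialized num_frames"
    elif call_frames != config.num_frames:
        # Fixed-frame mode: the frame count must match exactly.
        return "frame count differs from initialized num_frames"
    return None


if __name__ == "__main__":
    fixed = StreamConfig(num_channels=2, num_frames=480)
    print(check_call(fixed, 2, 480))
    print(check_call(fixed, 2, 240))
```

Freezing the dataclass reflects the advice to keep one configuration stable for the lifetime of the stream.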
Buffer Layouts
The SDK supports three memory layouts for multi-channel buffers:

| Layout | Shape | Example (2 channels, 4 frames) |
|---|---|---|
| Interleaved | Single buffer | [ch0_f0, ch1_f0, ch0_f1, ch1_f1, ...] |
| Sequential | Single buffer | [ch0_f0, ch0_f1, ch0_f2, ch0_f3, ch1_f0, ...] |
| Planar | One buffer per channel | audio[0] = [ch0_f0, ...], audio[1] = [ch1_f0, ...] |
The planar layout supports up to 16 channels.
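To make the index arithmetic behind the three layouts concrete, the following sketch converts between them using plain Python lists. The helper names are hypothetical and not part of the SDK; they only show where each sample lives in each layout.

```python
def interleaved_to_planar(buf, num_channels):
    """[ch0_f0, ch1_f0, ch0_f1, ...] -> one list per channel."""
    return [buf[ch::num_channels] for ch in range(num_channels)]


def sequential_to_planar(buf, num_channels):
    """[ch0_f0, ch0_f1, ..., ch1_f0, ...] -> one list per channel."""
    num_frames = len(buf) // num_channels
    return [buf[ch * num_frames:(ch + 1) * num_frames] for ch in range(num_channels)]


def planar_to_interleaved(planes):
    """One list per channel -> samples grouped frame by frame."""
    return [sample for frame in zip(*planes) for sample in frame]


def planar_to_sequential(planes):
    """One list per channel -> channels concatenated back to back."""
    return [sample for plane in planes for sample in plane]


if __name__ == "__main__":
    # 2 channels, 4 frames; channel 1 samples are offset by 10 for readability.
    planes = [[0, 1, 2, 3], [10, 11, 12, 13]]
    print(planar_to_interleaved(planes))
    print(planar_to_sequential(planes))
```

In both single-buffer layouts the buffer holds `num_channels * num_frames` samples in total, which is a useful check when a call is rejected for carrying the wrong number of samples.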
Common Pitfalls
- Initializing with one channel count and processing with another (`AIC_ERROR_CODE_AUDIO_CONFIG_MISMATCH`).
- Sending the wrong number of samples for the configured frame size.
- Assuming stereo channels are enhanced independently (they are not; they are mixed to mono first).
- Re-initializing on a real-time audio thread (initialization allocates memory).