Core Behavior

The SDK accepts multi-channel input, but speech enhancement is applied on a mono mixdown internally.
  1. Input channels are mixed down to mono for enhancement.
  2. The mono signal is processed by the model.
  3. The enhanced mono signal is mixed back into each output channel with a mix ratio dependent on the configured enhancement level.
If you need to enhance each channel independently — for example, when each channel carries a separate audio stream — create a separate processor instance per stream and configure each with num_channels = 1.
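The three steps above can be sketched as plain buffer arithmetic. This is an illustrative sketch, not the SDK's implementation: the `mix_ratio` parameter stands in for whatever blend the configured enhancement level implies, and the enhancement model itself is elided.

```c
#include <stddef.h>

/* Step 1: average all channels of an interleaved buffer into mono. */
static void mixdown_mono(const float *in, float *mono,
                         size_t num_channels, size_t num_frames) {
    for (size_t f = 0; f < num_frames; ++f) {
        float sum = 0.0f;
        for (size_t c = 0; c < num_channels; ++c)
            sum += in[f * num_channels + c];
        mono[f] = sum / (float)num_channels;
    }
}

/* Step 3: blend the enhanced mono signal back into every output
   channel; mix_ratio = 1.0 would replace the dry signal entirely. */
static void mixback(const float *enhanced_mono, float *out,
                    size_t num_channels, size_t num_frames,
                    float mix_ratio) {
    for (size_t f = 0; f < num_frames; ++f) {
        for (size_t c = 0; c < num_channels; ++c) {
            float dry = out[f * num_channels + c];
            out[f * num_channels + c] =
                mix_ratio * enhanced_mono[f] + (1.0f - mix_ratio) * dry;
        }
    }
}
```

Step 2 (the model) runs between these two stages on the mono buffer only, which is why stereo separation is not preserved through enhancement.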

Configuration Rules

Use one processor configuration per stream and keep it stable while processing.
  • Set num_channels to the stream channel count during initialization.
  • Each process call must use the same channel count.
  • For fixed-frame mode (allow_variable_frames = false), each call must also use the initialized num_frames.
  • For variable-frame mode (allow_variable_frames = true), frame count per call must be less than or equal to the initialized num_frames.
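The rules above amount to a simple per-call check. The struct and function names below are hypothetical illustrations, not SDK API:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the configuration fixed at initialization. */
typedef struct {
    size_t num_channels;        /* must match every process call */
    size_t num_frames;          /* exact (fixed) or maximum (variable) */
    bool allow_variable_frames;
} stream_config;

/* Returns true if a process call with these dimensions is valid
   under the configuration rules. */
static bool call_is_valid(const stream_config *cfg,
                          size_t channels, size_t frames) {
    if (channels != cfg->num_channels)
        return false;   /* channel count may never change mid-stream */
    if (cfg->allow_variable_frames)
        return frames > 0 && frames <= cfg->num_frames;
    return frames == cfg->num_frames;  /* fixed mode: exact match */
}
```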

Buffer Layouts

The SDK supports three memory layouts for multi-channel buffers:
| Layout      | Shape                  | Example (2 channels, 4 frames)                      |
|-------------|------------------------|-----------------------------------------------------|
| Interleaved | Single buffer          | `[ch0_f0, ch1_f0, ch0_f1, ch1_f1, ...]`             |
| Sequential  | Single buffer          | `[ch0_f0, ch0_f1, ch0_f2, ch0_f3, ch1_f0, ...]`     |
| Planar      | One buffer per channel | `audio[0] = [ch0_f0, ...], audio[1] = [ch1_f0, ...]` |
The planar layout supports at most 16 channels.
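The index arithmetic for the three layouts can be stated directly; for channel `c` and frame `f`:

```c
#include <stddef.h>

/* Interleaved: each frame is a contiguous group holding one sample
   per channel, so frames stride by num_channels. */
static size_t interleaved_index(size_t c, size_t f, size_t num_channels) {
    return f * num_channels + c;
}

/* Sequential: one channel's entire run of frames, then the next
   channel's, so channels stride by num_frames. */
static size_t sequential_index(size_t c, size_t f, size_t num_frames) {
    return c * num_frames + f;
}

/* Planar: audio[c][f] — each channel owns its buffer, so the
   per-buffer index is just the frame index. */
static size_t planar_index(size_t f) {
    return f;
}
```

Mixing these up is a silent corruption bug rather than a hard error: the samples are still valid floats, just attributed to the wrong channel and time.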

Common Pitfalls

  • Initializing with one channel count and processing with another (AIC_ERROR_CODE_AUDIO_CONFIG_MISMATCH).
  • Sending the wrong number of samples for the configured frame size.
  • Assuming stereo channels are enhanced independently (they are not; the signal is mixed to mono first).
  • Re-initializing on a real-time audio thread (initialization allocates memory).
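The last pitfall suggests a standard real-time pattern: do all allocation once, up front, and keep the processing path allocation-free. The sketch below is a hypothetical processor shape illustrating that split, not the SDK's API:

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical processor: all allocation happens in create(), so
   process() is safe to call from a real-time audio thread. */
typedef struct {
    float *scratch;     /* allocated once at initialization */
    size_t max_frames;
} processor;

static processor *processor_create(size_t max_frames) {
    processor *p = malloc(sizeof *p);
    if (!p) return NULL;
    p->scratch = calloc(max_frames, sizeof *p->scratch);
    if (!p->scratch) { free(p); return NULL; }
    p->max_frames = max_frames;
    return p;
}

/* Real-time safe: no malloc/free, no locks, bounded work. Rejects
   oversized calls instead of reallocating on the audio thread. */
static int processor_process(processor *p, float *audio, size_t frames) {
    if (frames > p->max_frames)
        return -1;
    for (size_t f = 0; f < frames; ++f)
        p->scratch[f] = audio[f];   /* stand-in for the model */
    for (size_t f = 0; f < frames; ++f)
        audio[f] = p->scratch[f];
    return 0;
}

static void processor_destroy(processor *p) {
    if (p) { free(p->scratch); free(p); }
}
```

Create and destroy belong on a control thread; only the process call belongs in the audio callback.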