Pipecat Quickstart - ai-coustics Docs

This guide provides a quickstart for integrating the ai-coustics filter (AICFilter) into your Pipecat applications.

Prerequisites

Before you start, make sure you have a valid SDK key from the developer platform.

Installation

To use AICFilter, install pipecat-ai with the aic extra. You can skip this step when running the example with uv, which resolves the dependencies declared at the top of the script:

pip install "pipecat-ai[aic,local,webrtc]" loguru pyaudio fastapi uvicorn dotenv
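As a quick sanity check after installing, the sketch below tests whether the module that the example imports can be located. The helper name `has_module` is hypothetical and only the standard library is used; the module path is the one imported in the example in this guide.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the module `name` can be located."""
    try:
        # find_spec imports parent packages but not the target module itself.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError (an ImportError).
        return False


if __name__ == "__main__":
    target = "pipecat.audio.filters.aic_filter"
    status = "found" if has_module(target) else "missing"
    print(f"{target}: {status}")
```

If the module is reported as missing, the aic extra was most likely not installed into the active environment.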

Usage

The AICFilter attaches to the input transport of a Pipecat application through the audio_in_filter parameter, so incoming audio (e.g., from a microphone) is enhanced before it reaches the pipeline and the audio output transport (e.g., a speaker). Here’s a complete example of a simple Pipecat application that loops the enhanced input audio back to the output.

# /// script
# requires-python = ">=3.10,<3.14"
# dependencies = [
#     "pipecat-ai[aic,local,webrtc]",
#     "loguru",
#     "llvmlite",
#     "pyaudio",
#     "fastapi",
#     "uvicorn",
#     "dotenv",
#     "pipecat-ai-small-webrtc-prebuilt",
# ]
# ///
import os

from loguru import logger

from pipecat.audio.filters.aic_filter import AICFilter
from pipecat.frames.frames import Frame, InputAudioRawFrame, OutputAudioRawFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.transports.base_transport import BaseTransport, TransportParams


# Loopback Processor
class AudioFrameConverter(FrameProcessor):
    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

        if isinstance(frame, InputAudioRawFrame):
            output_frame = OutputAudioRawFrame(
                audio=frame.audio,
                sample_rate=frame.sample_rate,
                num_channels=frame.num_channels,
            )
            await self.push_frame(output_frame, direction)
        else:
            await self.push_frame(frame, direction)


# Bot Logic
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
    logger.info("Bot starting: Direct Audio Loopback with AIC Filter")

    converter = AudioFrameConverter()
    pipeline = Pipeline(
        [
            transport.input(),
            converter,
            transport.output(),
        ]
    )
    task = PipelineTask(
        pipeline,
        params=PipelineParams(),
    )

    @transport.event_handler("on_client_connected")
    async def on_client_connected(transport, client):
        logger.info("WebRTC Client Connected")

    @transport.event_handler("on_client_disconnected")
    async def on_client_disconnected(transport, client):
        logger.info("WebRTC Client Disconnected")
        await task.cancel()

    runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
    await runner.run(task)


async def bot(runner_args: RunnerArguments):
    # Initialize ai-coustics filter
    aic_filter = AICFilter(
        license_key=os.environ["AIC_SDK_LICENSE"],
        model_id="quail-vf-2.0-l-16khz",  # or "quail-l-16khz", "quail-s-8khz", etc.
    )
    transport_params = {
        "webrtc": lambda: TransportParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            audio_in_filter=aic_filter,
        )
    }
    transport = await create_transport(runner_args, transport_params)
    await run_bot(transport, runner_args)


if __name__ == "__main__":
    from pipecat.runner.run import main

    main()

Running the Example

Save the code

Save the code above as bot.py.

Set Environment Variables

Set the necessary environment variable in your terminal:

export AIC_SDK_LICENSE="YOUR_AIC_LICENSE_KEY"

Replace the placeholder value with your actual SDK key.
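The example reads the key with `os.environ["AIC_SDK_LICENSE"]`, which raises a bare `KeyError` when the variable is missing. A minimal sketch of a friendlier check follows; the helper name `load_license_key` and the error wording are assumptions, not part of the SDK.

```python
import os


def load_license_key(env_var: str = "AIC_SDK_LICENSE") -> str:
    """Return the SDK key from the environment or raise a descriptive error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set. Export it before starting the bot.")
    return key
```

The returned value could then be passed as `license_key` when constructing the filter.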

Run the Application

Execute the script from your terminal:

python bot.py

Or use uv:

uv run bot.py

You can now navigate to http://localhost:7860 and click the green ‘Connect’ button in the top right corner.

The Quail models are designed to enhance the performance of Voice AI Agents and STT systems, and may not always produce the most natural-sounding audio for human listeners. Some noise and reverberation may remain in the output by design, as these can help improve STT accuracy by providing additional acoustic context.

Architecture Overview

In Pipecat, audio filters run inside the input transport. They process raw input audio before it reaches any downstream processors. The AICFilter plugs into this mechanism via the audio_in_filter parameter on the transport.

Key Points

AICFilter runs first. It processes the raw input audio, before anything else in the pipeline sees it.
VAD is a separate, standalone component. The standalone Quail VAD analyzer performs its own noise-robust voice activity detection and is wired into the pipeline independently of the filter.
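The ordering described above can be illustrated with a toy model. This is a sketch under stated assumptions: `ToyInputTransport` and its methods are hypothetical stand-ins, not Pipecat classes, and the "filter" is just a function from bytes to bytes.

```python
from typing import Callable, List, Optional


class ToyInputTransport:
    """Toy stand-in for an input transport; not a Pipecat class."""

    def __init__(self, audio_in_filter: Optional[Callable[[bytes], bytes]] = None):
        self._filter = audio_in_filter
        self._processors: List[Callable[[bytes], object]] = []

    def add_processor(self, processor: Callable[[bytes], object]) -> None:
        """Register a downstream consumer of input audio."""
        self._processors.append(processor)

    def receive(self, raw_audio: bytes) -> None:
        # The filter runs first, inside the transport.
        audio = self._filter(raw_audio) if self._filter else raw_audio
        # Downstream processors only ever observe the filtered audio.
        for processor in self._processors:
            processor(audio)
```

In this model no processor can observe the raw audio once a filter is set, which mirrors why the filter is configured on the transport rather than added as a pipeline stage.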

AICFilter Integration

The AICFilter class inherits from Pipecat’s BaseAudioFilter. When the transport starts, it calls AICFilter.start(sample_rate), which:

Loads the model: Either from a local file (model_path) or by downloading it from the CDN (model_id). Models are cached and shared across filter instances via a singleton AICModelManager.
Creates the processor: An async processor (ProcessorAsync) is initialized with the model, license key, and optimal configuration for the given sample rate.
Initializes VAD and enhancement contexts: The processor exposes a ProcessorContext for controlling parameters (bypass, enhancement level) and a VadContext for Voice Activity Detection parameters.
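The model sharing mentioned in the first step can be pictured as a cache keyed by model identifier. The sketch below is a generic illustration of that pattern, not the actual AICModelManager; `ToyModelManager`, its `get` method, and the `loader` callback are all hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class ToyModelManager:
    """Hypothetical cache: one load per model id, shared by all callers."""

    _tasks: Dict[str, "asyncio.Task"] = {}

    @classmethod
    async def get(cls, model_id: str, loader: Callable[[str], Awaitable[object]]):
        task = cls._tasks.get(model_id)
        if task is None:
            # First request for this id: start loading and remember the task.
            task = asyncio.create_task(loader(model_id))
            cls._tasks[model_id] = task
        # Concurrent and later callers await the same task and get the same object.
        return await task
```

Storing the in-flight task rather than the finished result means two filters that start at the same time trigger a single load instead of two.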

Standalone Quail VAD

For voice activity detection, ai-coustics provides a standalone, noise-robust Quail VAD analyzer. The AICQuailVADAnalyzer is an independent component that performs its own audio analysis, so it works whether or not the AICFilter is present in the pipeline. The analyzer ships with the same aic extra as the filter:

pip install "pipecat-ai[aic]"

Construct it with your license key and attach it through the user aggregator with LLMUserAggregatorParams.vad_analyzer. By default it uses the quail-vad-2.0-xxs-16khz model.

import os
from pipecat.audio.filters.aic_filter import AICFilter
from pipecat.audio.vad.aic_quail_vad import AICQuailVADAnalyzer
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.transports.services.daily import DailyTransport, DailyParams

# Create the AIC filter for audio enhancement
aic_filter = AICFilter(
    license_key=os.environ["AIC_SDK_LICENSE"],
    model_id="quail-vf-2.0-l-16khz",
)

# Create standalone Quail VAD 2.0 analyzer
aic_vad = AICQuailVADAnalyzer(
    license_key=os.environ["AIC_SDK_LICENSE"],
)

# room_url, token, and context are assumed to be defined elsewhere in your application.
transport = DailyTransport(
    room_url,
    token,
    "Bot",
    DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        audio_in_filter=aic_filter,
    ),
)

user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=aic_vad,
    ),
)

The analyzer also accepts optional tuning parameters such as sensitivity (speech probability threshold, 0.0–1.0), speech_hold_duration, and minimum_speech_duration.
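To build intuition for how a threshold, a hold time, and a minimum speech time typically interact, here is a toy gate over per-frame speech probabilities. It is a conceptual sketch, not the Quail VAD implementation: the function `gate_speech` is hypothetical, it counts frames rather than durations, and the assumed semantics are that speech must persist for `min_speech_frames` before it is reported and that the speaking state is held for `hold_frames` of silence before it is released.

```python
from typing import Iterable, List


def gate_speech(
    probabilities: Iterable[float],
    sensitivity: float = 0.5,
    hold_frames: int = 2,
    min_speech_frames: int = 2,
) -> List[bool]:
    """Turn per-frame speech probabilities into speaking / not-speaking flags."""
    flags: List[bool] = []
    speaking = False
    run = 0      # consecutive frames at or above the threshold
    silence = 0  # consecutive frames below the threshold while speaking
    for p in probabilities:
        if p >= sensitivity:
            run += 1
            silence = 0
            # Only report speech once it has lasted long enough.
            if not speaking and run >= min_speech_frames:
                speaking = True
        else:
            run = 0
            if speaking:
                silence += 1
                # Keep reporting speech until the hold period is exceeded.
                if silence > hold_frames:
                    speaking = False
                    silence = 0
        flags.append(speaking)
    return flags
```

With the defaults, a single loud frame is ignored and a short pause inside an utterance does not end it, which is the kind of behavior these parameters are usually tuned for.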

The standalone Quail VAD is a separate component from the AICFilter. The audio filter attaches to the transport via audio_in_filter, while the VAD analyzer attaches to the user aggregator via LLMUserAggregatorParams.vad_analyzer (TransportParams.vad_analyzer was removed in Pipecat 1.0).

Further Reading

AICFilter

Pipecat’s documentation on AICFilter.

Quail VAD Analyzer

Pipecat’s documentation on AICQuailVADAnalyzer.
