# LiveKit Quickstart

> Learn how to use the ai-coustics plugin in your LiveKit voice agents for real-time speech enhancement.

Integrate ai-coustics speech enhancement into your LiveKit voice agents in minutes. The LiveKit ai-coustics plugins (`livekit-plugins-ai-coustics` for Python, `@livekit/plugins-ai-coustics` for Node.js) provide real-time noise cancellation optimized for human-to-machine audio, improving transcription accuracy for your AI agents.

<Info>
  The plugins support two authentication modes:

  * **LiveKit Cloud** - authenticate via `lk cloud auth`, no ai-coustics SDK key needed.
  * **ai-coustics SDK key** - pass your license key directly, suitable for self-hosted environments.
</Info>

## Setup Guide

Follow these steps to create a new LiveKit agent project with ai-coustics speech enhancement. Use the tabs in each code block to switch between Python and Node.js.

<Steps>
  <Step title="Create a LiveKit Cloud account">
    Sign up at [LiveKit Cloud](https://cloud.livekit.io/) if you don't already have an account.

    <Note>
      A LiveKit Cloud account is required when using the default LiveKit Cloud authentication mode. Run `lk cloud auth` once to authenticate. If you are running in a **self-hosted environment**, you can skip this step and pass an ai-coustics SDK key directly instead (see the self-hosted authentication step below).
    </Note>
  </Step>

  <Step title="Install the LiveKit CLI">
    Install the [LiveKit CLI tool](https://github.com/livekit/livekit-cli) for your platform.
  </Step>

  <Step title="Authenticate the CLI">
    ```sh theme={null}
    lk cloud auth
    ```

    <Note>
      **Cloud only.** This step links the CLI to your LiveKit Cloud account. If you are running in a **self-hosted environment**, skip this step.
    </Note>
  </Step>

  <Step title="Create a new agent project">
    <CodeGroup>
      ```sh Python theme={null}
      lk agent init my-agent --template agent-starter-python
      cd my-agent
      ```

      ```sh Node.js theme={null}
      lk agent init my-agent --template agent-starter-node
      cd my-agent
      ```
    </CodeGroup>

    <Note>
      **Node.js only.** LiveKit Agents for Node.js requires Node.js 20 or later. The starter project uses `pnpm`.
    </Note>
  </Step>

  <Step title="Add the ai-coustics plugin">
    <CodeGroup>
      ```sh Python theme={null}
      uv add livekit-plugins-ai-coustics
      ```

      ```sh Node.js theme={null}
      pnpm add @livekit/plugins-ai-coustics
      ```
    </CodeGroup>
  </Step>

  <Step title="Install dependencies">
    <CodeGroup>
      ```sh Python theme={null}
      uv sync
      ```

      ```sh Node.js theme={null}
      pnpm install
      ```
    </CodeGroup>
  </Step>

  <Step title="Download model files">
    <CodeGroup>
      ```sh Python theme={null}
      uv run src/agent.py download-files
      ```

      ```sh Node.js theme={null}
      npx livekit-agents download-files
      ```
    </CodeGroup>
  </Step>

  <Step title="Enable speech enhancement">
    Open your agent entry file (`src/agent.py` for Python, `src/index.ts` for Node.js) and add ai-coustics audio enhancement to your `session.start()` call:

    <CodeGroup>
      ```python Python theme={null}
      from livekit.agents import AgentSession, room_io
      from livekit.plugins import ai_coustics

      # Runs inside your agent entrypoint; Assistant is the Agent subclass from the starter project.
      session = AgentSession(
          vad=ai_coustics.VAD(),  # Add ai-coustics VAD to session setup
          # ...
      )

      await session.start(
          agent=Assistant(),
          room=ctx.room,
          room_options=room_io.RoomOptions(
              audio_input=room_io.AudioInputOptions(
                  # Add ai-coustics audio enhancement to audio input options
                  noise_cancellation=ai_coustics.audio_enhancement(
                      # Available models:
                      # - EnhancerModel.QUAIL_VF_L  (best for isolating the foreground speaker)
                      # - EnhancerModel.QUAIL_VF_S  (smaller, more efficient version of Quail VF)
                      # - EnhancerModel.QUAIL_L     (best for multiple/far-field speakers)
                      model=ai_coustics.EnhancerModel.QUAIL_VF_L,
                      # auth defaults to Auth.livekit_cloud() - omit it when using LiveKit Cloud.
                      # For self-hosted environments, pass your license key explicitly:
                      #   auth=ai_coustics.Auth.ai_coustics_api(license_key="YOUR_LICENSE_KEY"),
                      # Enhancement levels:
                      # - enhancement_level = 0.5 (conservative, foreground speech is always preserved)
                      # - enhancement_level = 0.8 (balanced, optimal word error rate on challenging data)
                      # - enhancement_level = 1.0 (aggressive, maximum suppression of interfering speech)
                      # More info: https://docs.ai-coustics.com/models/speech-enhancement/speech-enhancement-for-voice-ai-systems
                      model_parameters=ai_coustics.ModelParameters(enhancement_level=0.8),
                      # VAD Parameters Info: https://docs.ai-coustics.com/models/voice-activity-detection/quail-vad
                      vad_settings=ai_coustics.VadSettings(
                          # 0.0 to 1.0 seconds
                          speech_hold_duration=0.03,
                          # 1.0 to 15.0
                          sensitivity=6.0,
                          # 0.0 to 1.0 seconds
                          minimum_speech_duration=0.0,
                      ),
                  ),
              )
          ),
      )
      ```

      ```ts Node.js theme={null}
      import { ServerOptions, cli, defineAgent, inference, voice } from '@livekit/agents';
      import {
        EnhancerModel,
        VadSettings,
        audioEnhancement,
      } from '@livekit/plugins-ai-coustics';
      import dotenv from 'dotenv';
      import { fileURLToPath } from 'node:url';
      import { Agent } from './agent';

      // Load environment variables from a local file.
      // Make sure to set LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET
      // when running locally or self-hosting your agent server.
      dotenv.config({ path: '.env.local' });

      export default defineAgent({
        entry: async (ctx) => {
          // Set up a voice AI pipeline using Deepgram, Cartesia, and the LiveKit turn detector.
          // The LLM is omitted here; keep the one configured in your starter project.
          const session = new voice.AgentSession({
            // Speech-to-text (STT) is your agent's ears, turning the user's speech into text that the LLM can understand
            // See all available models at https://docs.livekit.io/agents/models/stt/
            stt: new inference.STT({
              model: 'deepgram/nova-3',
              language: 'multi',
            }),

            // Text-to-speech (TTS) is your agent's voice, turning the LLM's text into speech that the user can hear
            // See all available models as well as voice selections at https://docs.livekit.io/agents/models/tts/
            tts: new inference.TTS({
              model: 'cartesia/sonic-3',
              voice: '9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
            }),

            // Turn detection determines when the user is speaking and when the agent should respond.
            // The LiveKit audio turn detector is a multimodal model that encodes the user's audio
            // directly to predict end of turn. It's built into the SDK (no extra plugin) and
            // AgentSession supplies the required VAD automatically.
            // See more at https://docs.livekit.io/agents/logic/turns/turn-detector/
            turnHandling: {
              turnDetection: new inference.TurnDetector(),
              // Allow the LLM to generate a response while waiting for the end of turn
              preemptiveGeneration: { enabled: true },
            },
          });

          // Start the session, which initializes the voice pipeline and warms up the models
          await session.start({
            agent: new Agent(),
            room: ctx.room,
            inputOptions: {
              // ai-coustics QUAIL audio enhancement for noise cancellation
              // Works for both WebRTC and telephony (SIP) participants
              noiseCancellation: audioEnhancement({
                // - EnhancerModel.QuailVfL (best for isolating the foreground speaker)
                // - EnhancerModel.QuailVfS (smaller, more efficient version of Quail VF)
                // - EnhancerModel.QuailL   (best for multiple/far-field speakers)
                model: EnhancerModel.QuailVfS,
                // auth defaults to LiveKit Cloud - omit it when using LiveKit Cloud.
                // For self-hosted environments, pass your license key explicitly.
                // See the self-hosted authentication step below.
                // - enhancementLevel = 0.5 (conservative, foreground speech is always preserved)
                // - enhancementLevel = 0.8 (balanced, optimal word error rate on challenging data)
                // - enhancementLevel = 1.0 (aggressive, maximum suppression of interfering speech)
                // More info: https://docs.ai-coustics.com/models/speech-enhancement/speech-enhancement-for-voice-ai-systems
                modelParameters: { enhancementLevel: 0.8 },
                // VAD Parameters Info: https://docs.ai-coustics.com/models/voice-activity-detection/quail-vad
                vadSettings: VadSettings.new({
                  // 0.0 to 1.0 seconds
                  speechHoldDuration: 0.03,
                  // 1.0 to 15.0
                  sensitivity: 6.0,
                  // 0.0 to 1.0 seconds
                  minimumSpeechDuration: 0.0,
                }),
              }),
            },
          });

          // Join the room and connect to the user
          await ctx.connect();

          // Greet the user on joining
          session.generateReply({
            instructions: 'Greet the user in a helpful and friendly manner.',
          });
        },
      });

      // Run the agent server
      cli.runApp(
        new ServerOptions({
          agent: fileURLToPath(import.meta.url),
          agentName: 'my-agent',
        }),
      );
      ```
    </CodeGroup>

    <Note>
      **Node.js only.** If your starter project already has an LLM configured in `AgentSession`, keep it in place and only add the `inputOptions.noiseCancellation` block to `session.start()`.
    </Note>
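
The ranges noted in the code comments above can be checked before the values reach the plugin. The following is a minimal, standalone sketch and not part of the plugin API: `QuailSettings`, `BOUNDS`, and `validate` are hypothetical names, and the 0.0 to 1.0 bound on `enhancement_level` is an assumption inferred from the three documented levels.

```python
from dataclasses import dataclass

# Enhancement-level presets as described in the code comments above.
ENHANCEMENT_PRESETS = {
    "conservative": 0.5,  # foreground speech is always preserved
    "balanced": 0.8,      # optimal word error rate on challenging data
    "aggressive": 1.0,    # maximum suppression of interfering speech
}


@dataclass(frozen=True)
class QuailSettings:
    """Hypothetical container mirroring the parameters shown above."""

    enhancement_level: float = 0.8
    speech_hold_duration: float = 0.03    # seconds
    sensitivity: float = 6.0
    minimum_speech_duration: float = 0.0  # seconds


# (low, high) bounds per field; the enhancement_level bound is an assumption.
BOUNDS = {
    "enhancement_level": (0.0, 1.0),
    "speech_hold_duration": (0.0, 1.0),
    "sensitivity": (1.0, 15.0),
    "minimum_speech_duration": (0.0, 1.0),
}


def validate(settings: QuailSettings) -> list[str]:
    """Return one message per out-of-range field (empty list if all are valid)."""
    errors = []
    for name, (low, high) in BOUNDS.items():
        value = getattr(settings, name)
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside [{low}, {high}]")
    return errors
```

Values that pass the check can then be handed to `ai_coustics.ModelParameters(enhancement_level=...)` and `ai_coustics.VadSettings(...)` as in the Python tab.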
  </Step>

  <Step title="Run the agent">
    <CodeGroup>
      ```sh Python theme={null}
      uv run python src/agent.py console
      ```

      ```sh Node.js theme={null}
      pnpm dev
      ```
    </CodeGroup>

    <Check>
      Your agent is now running with ai-coustics Quail Voice Focus. For Python, you can start talking to it directly in the console. For Node.js, open the LiveKit Agent Console for your project, start a session, and speak to your agent. The Voice Focus models will elevate the foreground speaker while suppressing both interfering speech and background noise.
    </Check>
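
If the agent fails to connect, checking the connection settings is a reasonable first step. The Node.js example notes that `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` must be set when running locally or self-hosting. The sketch below is a hypothetical pre-flight helper (`missing_livekit_env` is not part of any LiveKit package) that reports which of them are absent. It reads the process environment only and does not load `.env.local`.

```python
import os
from typing import Mapping, Optional

# Variables named in the comments of the Node.js example.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_livekit_env(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_livekit_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All LiveKit connection variables are set.")
```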
  </Step>

  <Step title="Voice Focus and Multi-Speaker Support">
    This integration supports both Quail Voice Focus and Quail for multi-speaker scenarios; see [Quail and Voice Focus](/models/speech-enhancement/quail) for details. Support for [Voice Activity Detection](/models/voice-activity-detection/quail-vad) will be added in the future.
  </Step>

  <Step title="Self-hosted authentication (optional)">
    If you are not using LiveKit Cloud, pass your ai-coustics license key directly via the `auth` parameter:

    <CodeGroup>
      ```python Python theme={null}
      noise_cancellation=ai_coustics.audio_enhancement(
          model=ai_coustics.EnhancerModel.QUAIL_VF_L,
          auth=ai_coustics.Auth.ai_coustics_api(license_key="YOUR_LICENSE_KEY"),
      )
      ```

      ```ts Node.js theme={null}
      import { Auth, EnhancerModel, audioEnhancement } from '@livekit/plugins-ai-coustics';

      noiseCancellation: audioEnhancement({
        model: EnhancerModel.QuailVfS,
        auth: Auth.aiCousticsApi('YOUR_LICENSE_KEY'),
      }),
      ```
    </CodeGroup>

    Your license key can be generated on the [ai-coustics developer platform](https://developers.ai-coustics.com).
  </Step>
</Steps>
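
To keep the license key out of source control, the choice between the two authentication modes can be driven by the environment. The following is a minimal sketch, assuming a hypothetical environment variable `AIC_LICENSE_KEY` and a hypothetical helper `resolve_auth_mode`; neither is defined by the plugin. It only decides which mode applies, so it runs without the plugin installed.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical variable name, chosen for this example only.
LICENSE_ENV_VAR = "AIC_LICENSE_KEY"


def resolve_auth_mode(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return ("ai_coustics_api", key) if a license key is set, else ("livekit_cloud", None)."""
    env = os.environ if env is None else env
    key = env.get(LICENSE_ENV_VAR, "").strip()
    if key:
        return ("ai_coustics_api", key)
    return ("livekit_cloud", None)


mode, license_key = resolve_auth_mode()
# With the plugin installed, the result maps onto the calls from the self-hosted step:
#   "ai_coustics_api" -> auth=ai_coustics.Auth.ai_coustics_api(license_key=license_key)
#   "livekit_cloud"   -> omit auth (it defaults to Auth.livekit_cloud())
print(f"Selected authentication mode: {mode}")
```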

## Available Models

The LiveKit plugins do not currently support loading external model files. Instead, a limited selection of models is embedded in the plugins themselves.

The models currently available in the plugins are:

* Quail L (16 kHz): `EnhancerModel.QUAIL_L` (Python), `EnhancerModel.QuailL` (Node.js)
* Quail Voice Focus 2.1 L (16 kHz): `EnhancerModel.QUAIL_VF_L` (Python), `EnhancerModel.QuailVfL` (Node.js)
* Quail Voice Focus 2.1 S (16 kHz): `EnhancerModel.QUAIL_VF_S` (Python), `EnhancerModel.QuailVfS` (Node.js)
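
Because the enum member names differ between the two plugins, a small lookup can help when porting configuration between Python and Node.js. The sketch below only restates the list above as data; `MODELS`, `enum_name`, and `to_node` are hypothetical names, not plugin APIs.

```python
# Display name -> enum member name per plugin, copied from the list above.
MODELS = {
    "Quail L": {"python": "QUAIL_L", "node": "QuailL"},
    "Quail Voice Focus 2.1 L": {"python": "QUAIL_VF_L", "node": "QuailVfL"},
    "Quail Voice Focus 2.1 S": {"python": "QUAIL_VF_S", "node": "QuailVfS"},
}


def enum_name(model: str, runtime: str) -> str:
    """Return the qualified enum member for a model and runtime ("python" or "node")."""
    try:
        member = MODELS[model][runtime]
    except KeyError:
        raise ValueError(f"unknown model or runtime: {model!r}, {runtime!r}") from None
    return f"EnhancerModel.{member}"


def to_node(python_member: str) -> str:
    """Translate a Python member name such as "QUAIL_VF_L" to its Node.js spelling."""
    for names in MODELS.values():
        if names["python"] == python_member:
            return names["node"]
    raise ValueError(f"unknown Python member: {python_member!r}")
```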

## Next Steps

<CardGroup cols={2}>
  <Card title="Plugin on PyPI" icon="readme" href="https://pypi.org/project/livekit-plugins-ai-coustics/">
    Discover the LiveKit Python plugin package.
  </Card>

  <Card title="Plugin on npm" icon="readme" href="https://www.npmjs.com/package/@livekit/plugins-ai-coustics">
    Discover the LiveKit Node.js plugin package.
  </Card>

  <Card title="Quail & Voice Focus" icon="microphone" href="/models/speech-enhancement/quail">
    Learn about Quail and Voice Focus models for LiveKit.
  </Card>

  <Card title="LiveKit Agents" icon="robot" href="https://docs.livekit.io/agents/start/voice-ai/">
    Learn more about building voice agents with LiveKit.
  </Card>
</CardGroup>
