Introduction to Active Call

Active Call is a standalone project. It acts as a dedicated User Agent that handles all telephony protocols and media processing, and it exposes a WebSocket API for external control.

Decoupled Architecture

Active Call's decoupled architecture separates telephony and media processing from business logic:

  • Server: Handles SIP, RTP, audio processing, and voice service (ASR/TTS) integration.
  • Client: Hosts the LLM integration and controls Active Call via WebSocket.
  • WebSocket Protocol: Enables real-time interaction through a Command/Event pattern.

This architecture enables developers to:

  • Focus on business logic: No need to understand low-level details such as audio processing, SIP, RTP, or the rest of the telephony stack.
  • Language Independence: Build your AI agent using any programming language (Python, Go, Node.js, Rust, etc.) that supports WebSockets.
  • Tech Stack Freedom: Use any AI framework or LLM (OpenAI, LangChain, etc.) without restriction.
  • Independent Scaling: Deploy and scale your AI logic and media processing components independently.
  • Simple Debugging: Isolated domains ensure that AI logic issues don’t interfere with stable media transmission.

Protocol Support

Active Call supports industry-standard communication protocols:

  • SIP/RTP: Full compatibility with standard SIP trunking and VoIP hardware.
  • WebRTC: Direct voice interaction for browsers and mobile applications.
  • WebSocket: Native audio stream transmission for low-latency integrations.

Command/Event Pattern

Active Call uses an intuitive Command/Event pattern for its WebSocket control interface:

  • Command: Clients send instructions to Active Call to control behaviors (e.g., dial, play, record, transfer).
  • Event: Active Call pushes real-time status updates and processing results (e.g., ASR results, call status changes).
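
As a rough illustration of this pattern, the sketch below encodes commands as JSON text frames and routes incoming events to handlers by name. The field names (`command`, `event`, `text`) and the message names (`play`, `asrFinal`) are assumptions made for the example, not Active Call's documented wire format; check the API reference for the real schema.

```python
import json
from typing import Callable, Dict

# NOTE: the "command" / "event" field names and all message names used here
# are hypothetical. They illustrate the pattern, not the actual protocol.


def encode_command(name: str, **params) -> str:
    """Serialize a command and its parameters as one JSON text frame."""
    return json.dumps({"command": name, **params})


class EventDispatcher:
    """Routes decoded events to handlers registered by event name."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], None]] = {}

    def on(self, event_name: str, handler: Callable[[dict], None]) -> None:
        """Register the handler to call for a given event name."""
        self._handlers[event_name] = handler

    def dispatch(self, raw_frame: str) -> bool:
        """Decode one frame; return True if a registered handler consumed it."""
        event = json.loads(raw_frame)
        handler = self._handlers.get(event.get("event"))
        if handler is None:
            return False
        handler(event)
        return True


# Usage: collect transcripts from events and build a command frame to send.
transcripts = []
dispatcher = EventDispatcher()
dispatcher.on("asrFinal", lambda e: transcripts.append(e["text"]))
dispatcher.dispatch('{"event": "asrFinal", "text": "hello"}')
frame = encode_command("play", url="https://example.com/greeting.wav")
```

In a real client, `frame` would be written to the WebSocket connection and `dispatch` would be called for every text frame received from it.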

Plugin-based Architecture

Active Call features a plugin-based architecture that supports multiple mainstream service providers, giving you the flexibility to:

  • Freely switch providers: Select service providers based on cost, performance, and feature requirements.
  • Customize plugins: Implement your own ASR/TTS plugins to integrate self-deployed services.
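
The idea of swapping providers behind a common interface can be sketched conceptually as follows. This is not Active Call's plugin API: the `TtsProvider` interface, the `register` and `create` helpers, and the `echo` provider are all hypothetical names, and the sketch is written in Python purely for illustration.

```python
from typing import Callable, Dict, Protocol


class TtsProvider(Protocol):
    """Hypothetical interface that every TTS plugin would implement."""

    def synthesize(self, text: str) -> bytes: ...


# Registry mapping a provider name (e.g. read from config) to a factory.
_REGISTRY: Dict[str, Callable[[], TtsProvider]] = {}


def register(name: str, factory: Callable[[], TtsProvider]) -> None:
    """Make a provider selectable under the given name."""
    _REGISTRY[name] = factory


def create(name: str) -> TtsProvider:
    """Instantiate the provider selected by name; fail loudly if unknown."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown TTS provider: {name!r}") from None


class EchoTts:
    """Stand-in for a self-hosted engine: returns the text as UTF-8 bytes."""

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


register("echo", EchoTts)
```

With this shape, switching providers is a one-line configuration change (`create("echo")` versus another registered name), and the calling code never depends on a specific vendor.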

Audio Processing Capabilities

Active Call includes a complete audio processing pipeline, delivering enterprise-level voice quality:

  • Voice Activity Detection: Intelligently detects voice activity and notifies clients.
  • Intelligent Noise Reduction: Removes background noise in real-time, improving ASR recognition accuracy.
  • Gain Control: Automatically adjusts volume to ensure stable and clear speech.
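
To give a feel for what two of these stages do, here is a deliberately simple sketch of energy-based voice activity detection and gain normalization on 16-bit PCM frames. It is a textbook approach chosen for brevity, not Active Call's implementation, which this document does not describe; the function names and the threshold and target values are assumptions.

```python
import math
from array import array


def frame_rms(pcm: bytes) -> float:
    """Root-mean-square amplitude of 16-bit signed PCM (native byte order)."""
    samples = array("h")
    samples.frombytes(pcm)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(pcm: bytes, threshold: float = 500.0) -> bool:
    """Classify one frame as speech when its energy reaches the threshold.

    The threshold is an arbitrary illustrative value.
    """
    return frame_rms(pcm) >= threshold


def apply_gain(pcm: bytes, target_rms: float = 3000.0) -> bytes:
    """Scale a frame toward target_rms, clipping to the 16-bit range."""
    rms = frame_rms(pcm)
    if rms == 0.0:
        return pcm  # pure silence: nothing to amplify
    factor = target_rms / rms
    samples = array("h")
    samples.frombytes(pcm)
    scaled = array(
        "h", (max(-32768, min(32767, int(s * factor))) for s in samples)
    )
    return scaled.tobytes()
```

Production systems typically use more robust detectors and smooth the gain over time instead of adjusting each frame independently.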

Comparison with Pipecat

Feature | Active Call | Pipecat/Monolithic Framework
Architecture Pattern | Decoupled architecture | Monolithic architecture
Deployment | Distributed deployment | Single process deployment
Learning Curve | Low, only requires understanding the API | High, requires understanding audio processing details
Debugging Difficulty | Low, problem domains are isolated | High, AI and media issues are tightly coupled
Performance | High-performance Rust implementation with multi-threaded parallelism | Single process with GIL limitations (Python)
Maintainability | Modular design with easy maintenance | High coupling, upgrades affect the entire stack

Use Cases

  • Enterprise applications and production environments.
  • Systems requiring high concurrency and high availability.
  • Large-scale projects with multi-team collaboration.
  • Complex voice interaction scenarios (IVR, intelligent customer service, voice assistants).
  • Applications with high performance and stability requirements.
  • Systems requiring integration with existing telephone infrastructure.

Summary

  1. Low learning curve: No specialized knowledge of audio processing, SIP protocols, or complex telephony hardware is required.
  2. High development efficiency: Focus on business logic and rapidly iterate AI features without worrying about media processing.
  3. Easy testing: The WebSocket interface makes unit and integration testing straightforward.
  4. Highly scalable: Supports distributed deployment and horizontal scaling.
  5. Tech stack freedom: No restrictions on framework choices.
  6. Production ready: Built-in monitoring, logging, and error handling mechanisms.
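
To illustrate the testing point, the sketch below shows one way business logic could be unit tested without any telephony stack: the agent receives a plain `send` callable, so a test can substitute a list for the WebSocket connection. The `EchoAgent` class and the `asrFinal` and `tts` message names are hypothetical, used only to make the example concrete.

```python
import json
from typing import Callable


class EchoAgent:
    """Toy business logic: speak back whatever the caller said."""

    def __init__(self, send: Callable[[str], None]) -> None:
        # In production this callable would wrap a WebSocket send.
        self._send = send

    def handle(self, raw_event: str) -> None:
        """React to one incoming event frame."""
        event = json.loads(raw_event)
        # "asrFinal", "tts", and the field names are assumed for illustration.
        if event.get("event") == "asrFinal":
            self._send(json.dumps({"command": "tts", "text": event["text"]}))


def test_agent_replies_to_transcript() -> None:
    sent = []  # stands in for the WebSocket connection
    agent = EchoAgent(sent.append)
    agent.handle(json.dumps({"event": "asrFinal", "text": "hello"}))
    agent.handle(json.dumps({"event": "ringing"}))  # not handled, so ignored
    assert [json.loads(m) for m in sent] == [{"command": "tts", "text": "hello"}]
```

Because the agent only sees strings in and strings out, the same test runs with no SIP trunk, audio device, or network connection.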