Introduction to Active Call
Active Call is a standalone project that acts as a dedicated User Agent: it handles all telephony protocols and media processing, and exposes an easy-to-use WebSocket API for external control.
Decoupled Architecture
Active Call features an innovative decoupled architecture that completely separates telephony & media processing from business logic:
- Server: Handles SIP, RTP, audio processing, and voice service (ASR/TTS) integration.
- Client: Hosts the LLM integration and controls Active Call via WebSocket.
- WebSocket Protocol: Enables real-time interaction through a Command/Event pattern.
This architecture enables developers to:
- Focus on business logic: No need to understand low-level details such as audio processing, SIP, RTP, or the rest of the telephony stack.
- Language Independence: Build your AI agent using any programming language (Python, Go, Node.js, Rust, etc.) that supports WebSockets.
- Tech Stack Freedom: Use any AI framework or LLM (OpenAI, LangChain, etc.) without restriction.
- Independent Scaling: Deploy and scale your AI logic and media processing components independently.
- Simple Debugging: Isolated domains ensure that AI logic issues don’t interfere with stable media transmission.
Protocol Support
Active Call supports industry-standard communication protocols:
- SIP/RTP: Full compatibility with standard SIP trunking and VoIP hardware.
- WebRTC: Direct voice interaction for browsers and mobile applications.
- WebSocket: Native audio stream transmission for low-latency integrations.
Command/Event Pattern
Active Call uses an intuitive Command/Event pattern for its WebSocket control interface:
- Command: Clients send instructions to Active Call to control call behavior (e.g., dial, play, record, transfer).
- Event: Active Call pushes real-time status updates and processing results (e.g., ASR results, call status changes).
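As a rough illustration of the pattern, the sketch below serializes commands as JSON text frames and routes incoming events to handlers. The field names ("command", "event", "text") and the names "tts" and "asrFinal" are assumptions made for this example, not the documented Active Call schema; the WebSocket transport itself is left out so the snippet runs with the standard library alone.

```python
import json
from typing import Any, Callable, Dict, List

# NOTE: every field and command/event name below is hypothetical.
# Consult the Active Call API reference for the real message schema.

Handler = Callable[[Dict[str, Any]], List[str]]


def make_command(name: str, **params: Any) -> str:
    """Serialize one command as a JSON text frame."""
    return json.dumps({"command": name, **params})


def dispatch(raw_event: str, handlers: Dict[str, Handler]) -> List[str]:
    """Decode one event frame and return the command frames to send back."""
    event = json.loads(raw_event)
    handler = handlers.get(event.get("event"))
    return handler(event) if handler else []


def on_asr_final(event: Dict[str, Any]) -> List[str]:
    # Business logic lives here; a real agent would call an LLM instead.
    reply = f"You said: {event['text']}"
    return [make_command("tts", text=reply)]


HANDLERS: Dict[str, Handler] = {"asrFinal": on_asr_final}
```

In a real client, each frame read from the WebSocket would be passed to `dispatch`, and every returned string would be written back on the same connection.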
Plugin-based Architecture
Active Call features a plugin-based architecture that supports multiple mainstream service providers, giving you the flexibility to:
- Freely switch providers: Select service providers based on cost, performance, and feature requirements.
- Customize plugins: Implement your own ASR/TTS plugins to integrate self-deployed services.
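The idea behind switchable providers can be sketched as a small registry keyed by provider name. This is only a conceptual illustration: Active Call itself is implemented in Rust, so its real plugin interface will look different, and `TtsProvider`, `register_tts`, `create_tts`, and the toy `SilenceTts` provider are all hypothetical names introduced here.

```python
from typing import Callable, Dict, Protocol


class TtsProvider(Protocol):
    """Hypothetical interface every TTS plugin would implement."""

    def synthesize(self, text: str) -> bytes: ...


# Maps a provider name (e.g. taken from configuration) to a factory.
_REGISTRY: Dict[str, Callable[[], TtsProvider]] = {}


def register_tts(name: str):
    """Class decorator that registers a provider under the given name."""
    def decorator(factory):
        _REGISTRY[name] = factory
        return factory
    return decorator


def create_tts(name: str) -> TtsProvider:
    """Instantiate the provider selected by name."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown TTS provider: {name!r}") from None


@register_tts("silence")
class SilenceTts:
    """Toy provider: 20 ms of 16-bit mono silence at 8 kHz per character."""

    def synthesize(self, text: str) -> bytes:
        return b"\x00\x00" * 160 * len(text)
```

Switching providers then amounts to changing the name passed to `create_tts`, while the calling code stays the same.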
Audio Processing Capabilities
Active Call includes a complete audio processing pipeline, delivering enterprise-level voice quality:
- Voice Activity Detection: Intelligently detects voice activity and notifies clients.
- Intelligent Noise Reduction: Removes background noise in real-time, improving ASR recognition accuracy.
- Gain Control: Automatically adjusts volume to ensure stable and clear speech.
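To give a feel for what two of these stages do, here is a deliberately simple sketch of energy-based voice activity detection and RMS-based gain adjustment on 16-bit PCM frames. It is not Active Call's implementation, which this document does not describe; the thresholds, the target level, and the function names are assumptions chosen for illustration.

```python
import array
import math


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit signed PCM frame (native byte order)."""
    samples = array.array("h", frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Naive VAD: treat a frame as speech when its energy reaches the threshold."""
    return rms(frame) >= threshold


def apply_gain(frame: bytes, target_rms: float = 3000.0, max_gain: float = 8.0) -> bytes:
    """Scale a frame toward target_rms, capping the gain and clipping to int16."""
    level = rms(frame)
    if level == 0.0:
        return frame  # pure silence: nothing to scale
    gain = min(target_rms / level, max_gain)
    samples = array.array("h", frame)
    scaled = array.array(
        "h", (max(-32768, min(32767, round(s * gain))) for s in samples)
    )
    return scaled.tobytes()
```

Production pipelines typically use model-based VAD and smoothed gain curves rather than per-frame scaling; the point here is only that the client never has to write this kind of code, since it receives the resulting events instead.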
Comparison with Pipecat
| Feature | Active Call | Pipecat/Monolithic Framework |
|---|---|---|
| Architecture Pattern | Decoupled architecture | Monolithic architecture |
| Deployment | Distributed deployment | Single process deployment |
| Learning Curve | Low, only requires understanding the API | High, requires understanding audio processing details |
| Debugging Difficulty | Low, problem domains are isolated | High, AI and media issues are tightly coupled |
| Performance | High-performance Rust implementation with multi-threaded parallelism | Single process with GIL limitations (Python) |
| Maintainability | Modular design with easy maintenance | High coupling, upgrades affect the entire stack |
Use Cases
- Enterprise applications and production environments.
- Systems requiring high concurrency and high availability.
- Large-scale projects with multi-team collaboration.
- Complex voice interaction scenarios (IVR, intelligent customer service, voice assistants).
- Applications with high performance and stability requirements.
- Systems requiring integration with existing telephone infrastructure.
Summary
- Low learning curve: No specialized knowledge of audio processing, SIP, or telephony hardware is required.
- High development efficiency: Focus on business logic and rapidly iterate AI features without worrying about media processing.
- Easy testing: The WebSocket interface makes unit testing and integration testing straightforward.
- Highly scalable: Supports distributed deployment and horizontal scaling.
- Tech stack freedom: No restrictions on framework choices.
- Production ready: Built-in monitoring, logging, and error handling mechanisms.
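The testing point can be made concrete with a small sketch. Because the agent only exchanges text frames, a unit test can replace the WebSocket with a plain list and assert on what the agent would have sent. `GreeterAgent` and the message fields ("event", "command", "text", "answer", "tts") are hypothetical and used only for this example.

```python
import json
import unittest


class GreeterAgent:
    """Hypothetical agent: speaks a greeting when the call is answered."""

    def __init__(self, send):
        self._send = send  # any callable that accepts one JSON text frame

    def handle(self, raw_event: str) -> None:
        event = json.loads(raw_event)
        if event.get("event") == "answer":
            self._send(json.dumps({"command": "tts", "text": "Hello!"}))


class GreeterAgentTest(unittest.TestCase):
    def test_greets_on_answer(self):
        sent = []  # stands in for the WebSocket connection
        GreeterAgent(sent.append).handle(json.dumps({"event": "answer"}))
        self.assertEqual(
            [json.loads(m) for m in sent],
            [{"command": "tts", "text": "Hello!"}],
        )

    def test_ignores_other_events(self):
        sent = []
        GreeterAgent(sent.append).handle(json.dumps({"event": "ringing"}))
        self.assertEqual(sent, [])
```

Saved as a module, these tests run with `python -m unittest` and need neither a telephony stack nor a network connection.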