Introduction to RustPBX

RustPBX is a high-performance, memory-safe PBX (Private Branch Exchange) platform developed in Rust.

Decoupled Architecture

[Figure: architecture diagram. Labels: RustPBX (UserAgent, Media Engine, TTS, ASR, VAD); Phone and Web endpoints; SIP and WebRTC; AI Agent (SDK, LLM); WebSocket carrying Command and Event messages.]

RustPBX features an innovative decoupled architecture that completely separates media processing from business logic:

  • RustPBX: Handles media stream transmission and audio processing
  • AI Agent: Manages business logic and LLM integration
  • WebSocket Connection: Enables interaction with clients through a Command/Event pattern

This architecture offers developers several benefits:

  • Focus on business logic: No need to understand low-level details such as audio processing or SIP signaling
  • Multi-language support: Build your AI agent using any programming language (Python, Go, Java, JavaScript, Rust, etc.)
  • Tech stack freedom: Choose your preferred AI frameworks (LangChain, OpenAI SDK, etc.) without restrictions
  • Independent deployment: Deploy and scale AI logic and media processing components separately
  • Simplified debugging: Isolated problem domains ensure AI and media issues don't interfere with each other

Protocol Support

RustPBX supports multiple industry-standard communication protocols:

  • SIP/RTP: Full compatibility with standard SIP signaling and RTP media transmission protocols
  • WebRTC: Direct call support for browsers and mobile devices
  • WebSocket: Native WebSocket audio stream transmission

Command/Event Pattern

RustPBX uses an intuitive Command/Event pattern for client-server interaction:

  • Command: Send instructions to RustPBX to control call behavior and operations
  • Event: RustPBX pushes status changes and command processing results in real time

[Figure: Command/Event flow. Labels: Phone, Audio, RustPBX (UserAgent, Media Engine), ASR Event, TTS Command, Agent (SDK, LLM).]
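
To make the Command/Event pattern concrete, here is a minimal sketch of an agent-side handler that receives one JSON event and answers with JSON commands. The message shapes are assumptions made for illustration: the field names `event`, `command`, and `text`, and the values `asr_final`, `hangup`, and `tts`, are hypothetical placeholders and may not match the actual RustPBX schema. The WebSocket transport is left out so the logic stays runnable on its own.

```python
import json
from typing import Callable, List


def handle_event(raw_event: str, generate_reply: Callable[[str], str]) -> List[str]:
    """Turn one incoming JSON event into zero or more JSON commands.

    All field names and values here are hypothetical, not the RustPBX schema.
    """
    event = json.loads(raw_event)
    commands = []
    if event.get("event") == "asr_final":
        # Business logic lives here: ask an LLM (or any callable) for a reply
        # and request that it be spoken to the caller.
        reply = generate_reply(event["text"])
        commands.append({"command": "tts", "text": reply})
    # Any other event (for example a hypothetical "hangup") needs no reply.
    return [json.dumps(c) for c in commands]


def echo_llm(text: str) -> str:
    """Stand-in for a real LLM call."""
    return f"You said: {text}"


if __name__ == "__main__":
    incoming = json.dumps({"event": "asr_final", "text": "hello"})
    for cmd in handle_event(incoming, echo_llm):
        print(cmd)
```

Because the handler is a pure function from an event string to command strings, the same logic could sit behind any WebSocket client library, in any language.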

Plugin-based Architecture

RustPBX features a plugin-based architecture that supports multiple mainstream service providers, giving you the flexibility to:

  • Freely switch providers: Select service providers based on cost, performance, and feature requirements
  • Customize plugins: Implement your own ASR/TTS plugins to integrate self-deployed services

[Figure: plugin architecture. Labels: RustPBX (UserAgent, Media Engine with TTS, ASR, VAD); Cloud Service (Aliyun, Tencent, Deepgram).]
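
The idea of switching providers behind one interface can be illustrated with a small conceptual sketch. This is not the RustPBX plugin API, which this page does not describe. The `AsrProvider` interface, both provider classes, the registry, and `make_asr` are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Protocol


class AsrProvider(Protocol):
    """Hypothetical interface: anything that turns audio bytes into text."""

    def transcribe(self, audio: bytes) -> str: ...


class FakeCloudAsr:
    """Stand-in for a hosted provider; returns a canned transcript."""

    def transcribe(self, audio: bytes) -> str:
        return f"cloud transcript ({len(audio)} bytes)"


class SelfHostedAsr:
    """Stand-in for a self-deployed service behind the same interface."""

    def transcribe(self, audio: bytes) -> str:
        return f"local transcript ({len(audio)} bytes)"


# Registry mapping a provider name (for example from configuration)
# to a factory that builds the provider.
ASR_PROVIDERS: Dict[str, Callable[[], AsrProvider]] = {
    "fake_cloud": FakeCloudAsr,
    "self_hosted": SelfHostedAsr,
}


def make_asr(name: str) -> AsrProvider:
    """Look up a provider by name; unknown names raise ValueError."""
    try:
        return ASR_PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"unknown ASR provider: {name!r}") from None
```

With this shape, switching providers means changing one name, for example `make_asr("self_hosted")` instead of `make_asr("fake_cloud")`, while the calling code stays the same.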

Audio Processing Capabilities

RustPBX includes a complete audio processing pipeline, delivering enterprise-level voice quality:

[Figure: Audio Processing Pipeline. Stages: Audio, Noise Reduction, Voice Activity Detection, Automatic Gain Control.]

  • Voice Activity Detection: Intelligently detects voice activity and notifies clients
  • Intelligent Noise Reduction: Removes background noise in real time, improving ASR recognition accuracy
  • Gain Control: Automatically adjusts volume to ensure stable and clear speech
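
The three stages can be sketched with deliberately simple stand-ins: a noise gate, an energy-threshold voice detector, and a gain stage that scales toward a target level. This only illustrates how such stages chain together; the algorithms RustPBX actually uses are not described here, and every function name, threshold, and default below is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(frame: Sequence[float], floor: float = 0.01) -> List[float]:
    """Crude noise reduction: zero out samples quieter than `floor`."""
    return [s if abs(s) >= floor else 0.0 for s in frame]


def is_speech(frame: Sequence[float], threshold: float = 0.05) -> bool:
    """Energy-based voice activity detection on a single frame."""
    return rms(frame) > threshold


def apply_gain(frame: Sequence[float], target_rms: float = 0.2,
               max_gain: float = 10.0) -> List[float]:
    """Scale the frame toward `target_rms`, capping gain and clipping to [-1, 1]."""
    level = rms(frame)
    if level == 0.0:
        return list(frame)
    gain = min(target_rms / level, max_gain)
    return [max(-1.0, min(1.0, s * gain)) for s in frame]


def process(frame: Sequence[float]) -> Tuple[bool, List[float]]:
    """Run noise gate, then VAD, then gain control (speech frames only)."""
    cleaned = noise_gate(frame)
    speech = is_speech(cleaned)
    out = apply_gain(cleaned) if speech else cleaned
    return speech, out
```

Calling `process` on each incoming frame returns a speech flag, which could drive a client notification, together with the processed samples.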

Comparison with Pipecat

| Feature | RustPBX | Pipecat/Monolithic Framework |
|---|---|---|
| Architecture Pattern | Decoupled architecture | Monolithic architecture |
| Deployment | Distributed deployment | Single process deployment |
| Learning Curve | Low, only requires understanding the API | High, requires understanding audio processing details |
| Debugging Difficulty | Low, problem domains are isolated | High, AI and media issues are tightly coupled |
| Performance | High-performance Rust implementation with multi-threaded parallelism | Single process with GIL limitations (Python) |
| Maintainability | Modular design with easy maintenance | High coupling, upgrades affect the entire stack |

Use Cases

  • Enterprise applications and production environments
  • Systems requiring high concurrency and high availability
  • Large-scale projects with multi-team collaboration
  • Complex voice interaction scenarios (IVR, intelligent customer service, voice assistants)
  • Applications with high performance and stability requirements
  • Systems requiring integration with existing telephone infrastructure

Summary

  1. Low learning curve: No specialized knowledge of audio processing, SIP, or related protocols is required
  2. High development efficiency: Focus on business logic and rapidly iterate AI features without worrying about media processing
  3. Easy testing: WebSocket interface makes unit testing and integration testing straightforward
  4. Highly scalable: Supports distributed deployment and horizontal scaling for enterprise needs
  5. Tech stack freedom: No restrictions on framework choices
  6. Production ready: Built-in monitoring, logging, and error handling mechanisms
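
As an illustration of the testing point, the sketch below unit-tests a small agent without any network by swapping the WebSocket for an in-memory fake. The agent, the fake connection, and the message fields (`event`, `command`, `text`, and the values `answer` and `tts`) are hypothetical and are not taken from the RustPBX API.

```python
import json
import unittest


class FakeConnection:
    """In-memory stand-in for a WebSocket: records what the agent sends."""

    def __init__(self):
        self.sent = []

    def send(self, message: str) -> None:
        self.sent.append(json.loads(message))


class GreeterAgent:
    """Hypothetical agent that greets the caller once the call is answered."""

    def __init__(self, conn):
        self.conn = conn

    def on_message(self, raw: str) -> None:
        event = json.loads(raw)
        if event.get("event") == "answer":
            self.conn.send(json.dumps({"command": "tts", "text": "Hello!"}))


class GreeterAgentTest(unittest.TestCase):
    def test_greets_on_answer(self):
        conn = FakeConnection()
        GreeterAgent(conn).on_message(json.dumps({"event": "answer"}))
        self.assertEqual(conn.sent, [{"command": "tts", "text": "Hello!"}])

    def test_ignores_unknown_events(self):
        conn = FakeConnection()
        GreeterAgent(conn).on_message(json.dumps({"event": "other"}))
        self.assertEqual(conn.sent, [])
```

Saved to a file, these tests can be run with `python -m unittest`. No media server or telephone line is involved, because the agent only ever sees messages.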