Circuit Breaker & Monitoring
1. Tri-State Circuit Breaker
Each carrier trunk maintains an independent circuit breaker state:
       N consecutive failures
    Closed ─────────────► Open
      ▲                    │
      │                    │ Wait cooldown
      │ Probe succeeds     ▼
      │                HalfOpen
      └────────────────────┘
      Probe fails → back to Open
| State | Behavior |
|---|---|
| Closed | Normal routing, counts failures |
| Open | Rejects routing; no calls are sent to this trunk |
| HalfOpen | Allows a small number of probe calls; closes on success, reopens on failure |
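
The transitions in the diagram and table above can be sketched as a small state machine. This is a minimal illustration, not the product's actual implementation: the class `TrunkBreaker`, its method names, and the injectable `clock` are hypothetical, and it assumes that a single successful probe closes the breaker and that results arriving while the breaker is Open are ignored. The constructor defaults mirror the configuration parameters documented in section 1.1.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "Closed"
    OPEN = "Open"
    HALF_OPEN = "HalfOpen"


class TrunkBreaker:
    """Illustrative per-trunk breaker (hypothetical names throughout)."""

    def __init__(self, failure_threshold=5, open_duration_secs=30,
                 half_open_probes=1, failure_codes=(503, 408, 504),
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.open_duration_secs = open_duration_secs
        self.half_open_probes = half_open_probes
        self.failure_codes = set(failure_codes)
        self.clock = clock
        self.state = State.CLOSED
        self.consecutive_failures = 0
        self.opened_at = 0.0
        self.probes_sent = 0

    def allow_call(self):
        """Return True if a call may be routed to this trunk right now."""
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.open_duration_secs:
                return False
            # Cooldown elapsed: move to HalfOpen and start probing.
            self.state = State.HALF_OPEN
            self.probes_sent = 0
        if self.state is State.HALF_OPEN:
            if self.probes_sent >= self.half_open_probes:
                return False
            self.probes_sent += 1
        return True

    def record_result(self, sip_code):
        """Feed back the final SIP response code of a call sent to this trunk."""
        failed = sip_code in self.failure_codes
        if self.state is State.HALF_OPEN:
            if failed:
                self._open()   # probe failed: back to Open
            else:
                self._close()  # probe succeeded: back to Closed
        elif self.state is State.CLOSED:
            if failed:
                self.consecutive_failures += 1
                if self.consecutive_failures >= self.failure_threshold:
                    self._open()
            else:
                # Any non-failure response resets the consecutive count.
                self.consecutive_failures = 0

    def _open(self):
        self.state = State.OPEN
        self.opened_at = self.clock()

    def _close(self):
        self.state = State.CLOSED
        self.consecutive_failures = 0
```

A router would call `allow_call()` before selecting the trunk and `record_result()` once the call attempt completes. Passing a fake `clock` makes the cooldown testable without sleeping.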
1.1 Configuration Parameters
| Parameter | Description | Default |
|---|---|---|
| failure_threshold | Consecutive failures that trip the breaker from Closed to Open | 5 |
| open_duration_secs | Seconds the circuit stays Open before moving to HalfOpen | 30 |
| half_open_probes | Number of probe calls allowed in the HalfOpen state | 1 |
| failure_codes | SIP codes counted as failures | [503, 408, 504] |
Enable or disable the circuit breaker per trunk in the console under Wholesale → Trunk Configuration.
1.2 Operations
The Wholesale → Monitoring & Throttling page supports the following operations:
- View current state of all trunks (Closed/Open/HalfOpen)
- Manual reset: force an Open breaker back to Closed
- Enable/disable: temporarily turn a trunk’s circuit breaker off or back on
2. Sliding Window ASR/ACD Monitoring
2.1 Metric Definitions
| Metric | Formula | Description |
|---|---|---|
| ASR | Answered calls / Total calls × 100% | Answer Seizure Ratio |
| ACD | Total talk time / Answered calls | Average Call Duration |
| Ring ASR | Answered calls / (Total calls - User hangups) × 100% | ASR that excludes voluntary user hangups |
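
As a worked example of the three formulas, the sketch below computes each metric from raw counts. The function names and the sample figures are hypothetical; all ratios are expressed as percentages, and a zero denominator yields `None` rather than an error.

```python
def asr(answered, total):
    """Answer Seizure Ratio as a percentage."""
    if total <= 0:
        return None
    return 100.0 * answered / total


def acd(total_talk_secs, answered):
    """Average Call Duration in seconds per answered call."""
    if answered <= 0:
        return None
    return total_talk_secs / answered


def ring_asr(answered, total, user_hangups):
    """ASR with voluntary user hangups removed from the denominator."""
    attempts = total - user_hangups
    if attempts <= 0:
        return None
    return 100.0 * answered / attempts


# Hypothetical window: 200 calls, 90 answered, 20 abandoned by the caller,
# 16,200 seconds of total talk time.
print(asr(90, 200))           # 45.0
print(acd(16200, 90))         # 180.0
print(ring_asr(90, 200, 20))  # 50.0
```

Ring ASR is higher than ASR here because the 20 calls that the caller abandoned are not counted against the trunk.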
2.2 Configuration
[proxy.wholesale]
sliding_window_enabled = true
sliding_window_secs = 300 # 5-minute window
sliding_window_max_events = 10000
asr_alert_threshold = 30.0 # Alert when ASR drops below 30%
min_calls_for_stats = 10 # Don't calculate with fewer than 10 calls
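
To show how these settings could interact, here is a minimal sliding-window sketch. It is an assumption-laden illustration rather than the actual implementation: the class `SlidingWindowAsr` and its methods are hypothetical, events are evicted by age on read, and the event cap simply drops the oldest entry when the window is full.

```python
import time
from collections import deque


class SlidingWindowAsr:
    """Illustrative sliding-window ASR tracker (hypothetical names)."""

    def __init__(self, window_secs=300, max_events=10000,
                 alert_threshold=30.0, min_calls=10, clock=time.monotonic):
        self.window_secs = window_secs
        self.alert_threshold = alert_threshold
        self.min_calls = min_calls
        self.clock = clock
        # deque(maxlen=...) discards the oldest event once the cap is reached.
        self.events = deque(maxlen=max_events)

    def record(self, answered):
        """Record one finished call attempt and whether it was answered."""
        self.events.append((self.clock(), bool(answered)))

    def _evict_expired(self):
        cutoff = self.clock() - self.window_secs
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()

    def asr(self):
        """Return ASR in percent, or None when there are too few calls."""
        self._evict_expired()
        total = len(self.events)
        if total < self.min_calls:
            return None
        answered = sum(1 for _, ok in self.events if ok)
        return 100.0 * answered / total

    def should_alert(self):
        value = self.asr()
        return value is not None and value < self.alert_threshold
```

Returning `None` below `min_calls` keeps a trunk with only a handful of calls from raising an alert on statistically meaningless data.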
2.3 Alerts
When a trunk’s ASR drops below the threshold:
- The Monitoring & Throttling page highlights it in red
- The circuit breaker may trigger automatically (if enabled)
- Prometheus metrics can trigger external alerts
3. Prometheus Metrics
Wholesale exposes the following metrics at the /metrics endpoint:
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| wholesale_calls_total | Counter | tenant, carrier, status | Total calls |
| wholesale_call_duration_seconds | Histogram | tenant, carrier | Call duration distribution |
| wholesale_revenue_microcurrency_total | Counter | tenant | Cumulative revenue |
| wholesale_cost_microcurrency_total | Counter | carrier | Cumulative cost |
| wholesale_profit_microcurrency_total | Counter | tenant, carrier | Cumulative profit |
| wholesale_routing_attempts_total | Counter | tenant | Routing attempts |
| wholesale_routing_no_routes_total | Counter | tenant | No-route occurrences |
| wholesale_concurrent_limit_rejected_total | Counter | tenant | Concurrency limit rejections |
| wholesale_cps_limit_rejected_total | Counter | tenant | CPS limit rejections |
| wholesale_circuit_breaker_state | Gauge | trunk | Circuit breaker state |
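
For a quick check outside a full Prometheus setup, the gauge can be read straight from the text returned by /metrics. The sketch below is a hypothetical helper that assumes the standard Prometheus text exposition format and a single `trunk` label, as listed in the table. How the gauge value maps to Closed, Open, and HalfOpen is not specified here, so the raw numbers are returned uninterpreted.

```python
import re

# Matches lines such as: wholesale_circuit_breaker_state{trunk="carrier-a"} 0
_SAMPLE = re.compile(
    r'^wholesale_circuit_breaker_state\{trunk="(?P<trunk>[^"]*)"\}\s+(?P<value>\S+)'
)


def breaker_states(metrics_text):
    """Map each trunk label to its raw gauge value."""
    states = {}
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if match:
            states[match.group("trunk")] = float(match.group("value"))
    return states
```

Comment lines (`# HELP`, `# TYPE`) and other metrics are skipped because they do not match the pattern.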
4. Route Result Cache
For performance, Wholesale can cache routing resolution results:
[proxy.wholesale]
route_cache_capacity = 10000 # LRU max entries
route_cache_ttl_secs = 30 # Cache for 30 seconds
Cache Key: (tenant_id, caller, callee)
- New calls check the cache first; on hit, the cached result is used directly
- Cache is automatically cleared when rates/routes are reloaded
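
The caching behavior described above (LRU eviction, a per-entry TTL, and a full clear on reload) can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the class `RouteCache` and its methods are hypothetical, and expired entries are removed lazily on lookup.

```python
import time
from collections import OrderedDict


class RouteCache:
    """Illustrative LRU + TTL cache keyed by (tenant_id, caller, callee)."""

    def __init__(self, capacity=10000, ttl_secs=30, clock=time.monotonic):
        self.capacity = capacity
        self.ttl_secs = ttl_secs
        self.clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, result)

    def get(self, tenant_id, caller, callee):
        """Return the cached routing result, or None on a miss or expiry."""
        key = (tenant_id, caller, callee)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, result = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return result

    def put(self, tenant_id, caller, callee, result):
        key = (tenant_id, caller, callee)
        self._entries[key] = (self.clock() + self.ttl_secs, result)
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def clear(self):
        """Called when rates or routes are reloaded."""
        self._entries.clear()
```

A short TTL bounds how long a stale route can be served, while clearing on reload makes configuration changes take effect immediately.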