Go SDK

Download Source Code

Download from the RustPBXGo GitHub repository:

git clone https://github.com/restsend/rustpbxgo

Directory Structure

rustpbxgo/
├── README.md           
├── client.go           # SDK core implementation
├── cmd/                # Example application
│   ├── main.go         # Program entry point
│   ├── llm.go          # Large model interaction logic
│   ├── media.go        # WebRTC media processing
│   └── webhook.go      # Webhook handling
├── go.mod             
└── go.sum            

client.go contains the definitions of core data structures (commands, events, and callback functions).

The cmd/ directory contains an application example with SIP/WebRTC calls, incoming call handling with webhooks, and large model interaction logic.

Client

The Client struct fields mainly include:

endpoint: RustPBX listening address.
id: Set the connection session ID, mainly used for answering calls.
OnXXX: Callback functions for handling events.

Flow Diagram

When the client calls the Connect method, it creates two goroutines (green parts in the diagram below).

One reads and parses WebSocket messages (top).
The other processes messages and sends commands (bottom). When an event is received, it calls the corresponding callback function based on the event type.
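The two-goroutine layout described above can be sketched with the standard library alone. This illustrates the pattern and is not the SDK's actual code: the `message` type, the `dispatch` function, the handler map, and the event names are hypothetical, and a slice of strings stands in for the WebSocket connection.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// message is a hypothetical parsed frame; only the event name is decoded here.
type message struct {
	Event string `json:"event"`
}

// dispatch runs a reader goroutine and a processing goroutine connected by a
// channel, and returns the event names in the order their handlers ran.
func dispatch(frames []string, handlers map[string]func(message)) []string {
	msgs := make(chan message)
	var handled []string
	var wg sync.WaitGroup
	wg.Add(2)

	// Goroutine 1: read raw frames and parse them.
	go func() {
		defer wg.Done()
		defer close(msgs)
		for _, raw := range frames {
			var m message
			if err := json.Unmarshal([]byte(raw), &m); err != nil {
				continue // skip frames that are not valid JSON
			}
			msgs <- m
		}
	}()

	// Goroutine 2: look up the callback by event type and call it.
	go func() {
		defer wg.Done()
		for m := range msgs {
			if h, ok := handlers[m.Event]; ok {
				h(m)
				handled = append(handled, m.Event)
			}
		}
	}()

	wg.Wait()
	return handled
}

func main() {
	handlers := map[string]func(message){
		"answer": func(m message) { fmt.Println("answered") },
		"hangup": func(m message) { fmt.Println("hung up") },
	}
	frames := []string{`{"event":"answer"}`, `not json`, `{"event":"hangup"}`}
	fmt.Println(dispatch(frames, handlers))
}
```

Because every callback runs on the single processing goroutine in this sketch, a slow callback delays every event behind it; long-running work is better handed off to its own goroutine.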

Creating a Client

Use the NewClient function to create a client instance:

client := rustpbxgo.NewClient(endpoint, opts...)

Parameters:

Parameter	Type	Required	Description
`endpoint`	string	✅	RustPBX server address
`opts`	...ClientOption	❌	Optional configuration options

Options:

Option	Parameter Type	Description
`WithLogger(logger)`	`*logrus.Logger`	Set logger
`WithContext(ctx)`	`context.Context`	Set context for closing created goroutines
`WithID(id)`	`string`	Set session ID for answering calls
`WithDumpEvents(enable)`	`bool`	Enable event dumping

Example:

client := rustpbxgo.NewClient(
    "ws://localhost:8080",
    rustpbxgo.WithLogger(logger),
    rustpbxgo.WithContext(ctx),
    rustpbxgo.WithID("my-session-id"),
    rustpbxgo.WithDumpEvents(true),
)
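For readers unfamiliar with the `WithXXX` style, the following is a minimal sketch of the functional options pattern that `NewClient` appears to follow. All names here (`client`, `option`, `newClient`, `withID`, and so on) are hypothetical stand-ins, not the SDK's internals.

```go
package main

import (
	"context"
	"fmt"
)

// client is a hypothetical stand-in for the SDK's Client.
type client struct {
	endpoint   string
	id         string
	ctx        context.Context
	dumpEvents bool
}

// option is a function that mutates the client being built.
type option func(*client)

func withID(id string) option               { return func(c *client) { c.id = id } }
func withContext(ctx context.Context) option { return func(c *client) { c.ctx = ctx } }
func withDumpEvents(enable bool) option      { return func(c *client) { c.dumpEvents = enable } }

// newClient sets defaults first, then applies the options in order,
// so a later option overrides an earlier one.
func newClient(endpoint string, opts ...option) *client {
	c := &client{endpoint: endpoint, ctx: context.Background()}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := newClient("ws://localhost:8080", withID("my-session-id"), withDumpEvents(true))
	fmt.Println(c.endpoint, c.id, c.dumpEvents)
}
```

The pattern keeps the required argument (`endpoint`) positional while letting callers skip any option they do not need.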

Connecting to Server

Use the Connect method to connect to RustPBX:

err := client.Connect(callType)

Parameters:

Parameter	Type	Optional Values	Description
`callType`	string	`"sip"`, `"webrtc"`, `""`	Call type

Closing Client

Use the Shutdown method to close the client connection:

err := client.Shutdown()

Sending Commands

The client provides various methods for sending commands to the server.

Invite - Initiate Call

Initiates a call and returns an AnswerEvent or an error. See: Initiate Call.

answer, err := client.Invite(ctx, callOption)

Parameters:

Parameter	Type	Description
`ctx`	context.Context	Context for cancellation
`callOption`	CallOption	Call configuration, see CallOption

Return Values:

Type	Description
`*AnswerEvent`	Answer event (if successful)
`error`	Error information

Example:

// sip call
callOption := rustpbxgo.CallOption{
    Caller: "sip:alice@example.com",
    Callee: "sip:bob@example.com",
}

answer, err := client.Invite(ctx, callOption)
if err != nil {
    log.Fatalf("Call failed: %v", err)
}

Accept - Answer Incoming Call

Answer an incoming call. See: Answer/Reject Incoming Call.

err := client.Accept(callOption)

Parameters:

Parameter	Type	Description
`callOption`	CallOption	Call configuration, see CallOption

info

The CallOption configuration for Accept is the same as Invite, except that the callee address does not need to be set.

Example:

cmd/webhook.go
	server := gin.Default()
	server.POST(prefix, func(c *gin.Context) {
		var form IncomingCall
		if err := c.ShouldBindJSON(&form); err != nil {
			c.JSON(400, gin.H{"error": err.Error()})
			return
		}

		client := createClient(parent, option, form.DialogID)

		go func() {
			ctx, cancel := context.WithCancel(parent)
			defer cancel()
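			err := client.Connect("sip")
			if err != nil {
				option.Logger.Errorf("Failed to connect to server: %v", err)
				return // do not continue with a client that never connected
			}
			defer client.Shutdown()

			if err := client.Accept(option.CallOption); err != nil {
				option.Logger.Errorf("Failed to accept call: %v", err)
				return
			}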
			<-ctx.Done()
		}()

		c.JSON(200, gin.H{"message": "OK"})
	})
	server.Run(addr)

Ringing - Send Ringing

Send ringing response. Used for SIP calls, see: 180 Ringing.

err := client.Ringing(ringtone, recorder)

Parameters:

Parameter	Type	Required	Description
`ringtone`	string	❌	Ringtone URL
`recorder`	*RecorderOption	❌	Recording configuration, see RecorderOption

Example:

client.Ringing("http://example.com/ringtone.wav", recorder)

Reject - Reject Incoming Call

Reject an incoming call. See: Answer/Reject Incoming Call.

err := client.Reject(reason)

Parameters:

Parameter	Type	Description
`reason`	string	Rejection reason

Example:

client.Reject("Busy")

TTS - Text-to-Speech

Convert text to speech and play it. See: TTS (Text-to-Speech).

err := client.TTS(text, speaker, playID, endOfStream, autoHangup, option, waitInputTimeout)

Parameters:

Parameter	Type	Required	Description
`text`	string	✅	Text to synthesize
`speaker`	string	❌	Voice
`playID`	string	❌	TTS Track identifier
`endOfStream`	bool	✅	Whether this is the last TTS command for the current playId
`autoHangup`	bool	✅	Whether to automatically hang up after TTS playback completes
`option`	*TTSOption	❌	TTS options, see TTSOption
`waitInputTimeout`	*uint32	❌	Maximum time to wait for user input (seconds)

tip

Set endOfStream = true to indicate that all TTS commands for the current playId have been sent. The TTS Track exits after all command results finish playing and then sends a Track End event.
If playId is set, the Track End event sent by this TTS Track includes this playId.
If the current playId is the same as the previous TTS command's playId, the previous TTS Track is reused; otherwise the previous TTS Track is terminated and a new TTS Track is created.

For details, see TTS (Text-to-Speech).
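The playId and endOfStream rules above can be sketched as a small planning step that runs before any command is sent. This is an illustration only: `ttsCall`, `planTTS`, and `uint32Ptr` are hypothetical helpers, and attaching the input timeout only to the last command is a design choice of this sketch, not documented SDK behavior.

```go
package main

import "fmt"

// ttsCall is a hypothetical record of the arguments one TTS command would carry.
type ttsCall struct {
	Text             string
	PlayID           string
	EndOfStream      bool
	WaitInputTimeout *uint32
}

// uint32Ptr returns a pointer to v; Go cannot take the address of a literal directly.
func uint32Ptr(v uint32) *uint32 { return &v }

// planTTS builds one command per segment. All commands share playID so they
// target the same TTS Track; only the last one sets EndOfStream and carries
// the input timeout.
func planTTS(segments []string, playID string, waitInput uint32) []ttsCall {
	calls := make([]ttsCall, 0, len(segments))
	for i, s := range segments {
		last := i == len(segments)-1
		call := ttsCall{Text: s, PlayID: playID, EndOfStream: last}
		if last {
			call.WaitInputTimeout = uint32Ptr(waitInput)
		}
		calls = append(calls, call)
	}
	return calls
}

func main() {
	for _, c := range planTTS([]string{"Hello.", "How can I help?"}, "greeting", 10) {
		fmt.Printf("%q playID=%s endOfStream=%v\n", c.Text, c.PlayID, c.EndOfStream)
	}
}
```

Each planned entry maps onto one `client.TTS` call with the same argument order as the parameter table above.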

StreamTTS - Streaming TTS

Convert text to speech and play (for LLM streaming output).

err := client.StreamTTS(text, speaker, playID, endOfStream, autoHangup, option, waitInputTimeout)

tip

The only difference from TTS is that the resulting TTS command sets streaming = true; everything else is the same.

See Streaming TTS.
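A common way to feed LLM streaming output into StreamTTS is to buffer tokens and forward them sentence by sentence. The sketch below shows that idea under stated assumptions: `sendFunc` is a hypothetical callback that would wrap `client.StreamTTS` in real use, the sentence detection is deliberately naive, and whether the server accepts an empty final text with endOfStream = true is not confirmed here.

```go
package main

import (
	"fmt"
	"strings"
)

// sendFunc is a hypothetical callback; in real use it would wrap client.StreamTTS.
type sendFunc func(text string, endOfStream bool) error

// streamSentences buffers LLM tokens and forwards one chunk per sentence.
// The final call always has endOfStream = true, even if its text is empty.
func streamSentences(tokens <-chan string, send sendFunc) error {
	var buf strings.Builder
	for tok := range tokens {
		buf.WriteString(tok)
		// Flush when the token contains sentence-ending punctuation.
		if strings.ContainsAny(tok, ".!?") {
			if err := send(buf.String(), false); err != nil {
				return err
			}
			buf.Reset()
		}
	}
	// The token stream is closed: send whatever is left and mark the end.
	return send(buf.String(), true)
}

func main() {
	tokens := make(chan string, 8)
	for _, t := range []string{"Hi", " there.", " How", " are", " you?"} {
		tokens <- t
	}
	close(tokens)
	_ = streamSentences(tokens, func(text string, end bool) error {
		fmt.Printf("send %q endOfStream=%v\n", text, end)
		return nil
	})
}
```

Flushing per sentence rather than per token keeps the number of commands low while still letting playback start before the LLM has finished.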

Play - Play Audio

Play audio file:

err := client.Play(url, autoHangup, waitInputTimeout)

Parameters:

Parameter	Type	Required	Description
`url`	string	✅	Audio file URL
`autoHangup`	bool	✅	Whether to automatically hang up after playback completes
`waitInputTimeout`	*uint32	❌	Wait for input timeout (seconds)

Interrupt - Interrupt Playback

Interrupt current TTS or audio playback:

err := client.Interrupt(graceful)

Parameters:

Parameter	Type	Required	Description
`graceful`	bool	✅	Whether to gracefully interrupt (wait for current TTS command to finish playing before exiting)

info

If TTS has not finished playing, the server returns an Interruption event.
With graceful=true, the TTS Track waits for the current TTS command to finish playing before exiting; otherwise it exits immediately.
graceful only takes effect for non-streaming TTS (streaming=false).

See Interruption

Example:

client.OnSpeaking = func(event rustpbxgo.SpeakingEvent) {
    // Immediately interrupt TTS when user speaks
    client.Interrupt(false)
}

Hangup - Hangup Call

Once the call is established, use Hangup to end it:

err := client.Hangup(reason)

Parameters:

Parameter	Type	Required	Description
`reason`	string	❌	Hangup reason

Refer - Transfer Call

Transfer the call to another target, for example to hand the call over to a human agent. See: Transfer Call.

err := client.Refer(caller, callee, options)

Parameters:

Parameter	Type	Required	Description
`caller`	string	✅	Transfer caller SIP address
`callee`	string	✅	Transfer target SIP URI
`options`	*ReferOption	❌	Transfer options, see ReferOption

Mute - Mute

Mute all or specified Tracks:

err := client.Mute(trackID)

Parameters:

Parameter	Type	Description
`trackID`	*string	Track ID (if nil, mute all Tracks)

Example:

// Mute all Tracks
client.Mute(nil)

// Mute specific Track
trackID := "track-123"
client.Mute(&trackID)

Unmute - Unmute

Unmute all or specified Tracks:

err := client.Unmute(trackID)

Parameters:

Parameter	Type	Description
`trackID`	*string	Track ID (if nil, unmute all Tracks)

Event Callbacks

Client defines several fields whose names start with On. Assign a function to one of them to handle the corresponding event.

OnAnswer - Answer Callback

Trigger: Triggered when the call is answered and SDP negotiation is complete. See: AnswerEvent.

Purpose: Run initialization after the call is established. AnswerEvent contains the SDP answer.

client.OnAnswer = func(event rustpbxgo.AnswerEvent) {
    log.Printf("Call answered: %s", event.TrackID)
    // Start sending welcome message
    client.TTS("Hello, welcome to call", "", "welcome", true, false, nil, nil)
}

OnReject - Reject Callback

Trigger: Triggered when the call is rejected. See RejectEvent.

Purpose: Handle rejection logic, record rejection reason (event.Reason) or perform follow-up processing.

client.OnReject = func(event rustpbxgo.RejectEvent) {
    log.Printf("Call rejected: %s", event.Reason)
    // Record rejection reason, clean up resources
}

OnRinging - Ringing Callback

Trigger: Triggered when the call is ringing (SIP calls). See RingingEvent.

Purpose: Monitor call progress, determine if early media (EarlyMedia) is available.

client.OnRinging = func(event rustpbxgo.RingingEvent) {
    log.Printf("Ringing, early media: %v", event.EarlyMedia)
}

OnHangup - Hangup Callback

Trigger: Triggered when the call ends. See HangupEvent.

Purpose: Clean up resources and save call records. HangupEvent provides the hangup reason (event.Reason), the call start and hangup times (event.StartTime and event.HangupTime, both strings that must be parsed before computing the call duration), and caller/callee information (event.From, event.To).

client.OnHangup = func(event rustpbxgo.HangupEvent) {
    log.Printf("Call ended: %s, initiator: %s", event.Reason, event.Initiator)
    // Save call record, clean up resources
}
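Since StartTime and HangupTime are strings in HangupEvent, the call duration has to be computed after parsing them. The sketch below assumes both values are RFC 3339 timestamps; that format is an assumption, not something this page states, so check an actual event payload before relying on it. `callDuration` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// callDuration parses both timestamps and returns hangup minus start.
// ASSUMPTION: the strings are RFC 3339 timestamps; change the layout if
// the server uses a different format.
func callDuration(startTime, hangupTime string) (time.Duration, error) {
	start, err := time.Parse(time.RFC3339, startTime)
	if err != nil {
		return 0, fmt.Errorf("parse start time: %w", err)
	}
	end, err := time.Parse(time.RFC3339, hangupTime)
	if err != nil {
		return 0, fmt.Errorf("parse hangup time: %w", err)
	}
	return end.Sub(start), nil
}

func main() {
	d, err := callDuration("2024-01-01T10:00:00Z", "2024-01-01T10:02:30Z")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d) // 2m30s
}
```

Inside OnHangup this would be called as `callDuration(event.StartTime, event.HangupTime)`; both fields are tagged omitempty, so the error path also covers events where one of them is missing.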

OnSpeaking - Speaking Callback

Trigger: Triggered when VAD detects user starts speaking. See SpeakingEvent.

Purpose: Detect user input, commonly used to interrupt TTS playback. Can get speech start time (event.StartTime) from the event.

client.OnSpeaking = func(event rustpbxgo.SpeakingEvent) {
    log.Printf("User speaking detected, interrupt playback")
    // Immediately interrupt current TTS
    client.Interrupt(false)
}

OnSilence - Silence Callback

Trigger: Triggered when the user is detected to have stopped speaking. See SilenceEvent.

Purpose: Decide whether the user has finished speaking. The silence duration (event.Duration) can help decide when to start processing.

client.OnSilence = func(event rustpbxgo.SilenceEvent) {
    log.Printf("Silence detected, duration %d ms", event.Duration)
    // User might have finished speaking, prepare to process
}

OnAsrFinal - ASR Final Result Callback

Trigger: Triggered when speech recognition obtains a stable result. See AsrFinalEvent.

Purpose: Get user's final speech input (event.Text), used for business logic processing or sending to LLM. Can distinguish different speech segments through sequence number (event.Index).

client.OnAsrFinal = func(event rustpbxgo.AsrFinalEvent) {
    log.Printf("User said: %s", event.Text)
    // Send user input to LLM for processing
    response := callLLM(event.Text)
    client.TTS(response, "", "reply", true, false, nil, nil)
}
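Calling the LLM on every final segment can cut the user off mid-sentence. One alternative is to collect final segments by index and only process them once the silence is long enough. The sketch below shows that idea; `turnBuffer` and its methods are hypothetical, the 800 ms threshold is arbitrary, and it assumes callbacks run on a single goroutine so no locking is needed.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// turnBuffer collects final ASR segments keyed by index until the user goes quiet.
type turnBuffer struct {
	segments  map[uint32]string
	minSilent uint64 // milliseconds of silence that end a turn
}

func newTurnBuffer(minSilentMs uint64) *turnBuffer {
	return &turnBuffer{segments: map[uint32]string{}, minSilent: minSilentMs}
}

// addFinal stores a segment; a repeated index overwrites the earlier text.
func (b *turnBuffer) addFinal(index uint32, text string) { b.segments[index] = text }

// onSilence returns the joined utterance and true once the silence is long
// enough and there is text to process. The buffer is cleared afterwards.
func (b *turnBuffer) onSilence(durationMs uint64) (string, bool) {
	if durationMs < b.minSilent || len(b.segments) == 0 {
		return "", false
	}
	idx := make([]uint32, 0, len(b.segments))
	for i := range b.segments {
		idx = append(idx, i)
	}
	sort.Slice(idx, func(x, y int) bool { return idx[x] < idx[y] })
	parts := make([]string, 0, len(idx))
	for _, i := range idx {
		parts = append(parts, b.segments[i])
	}
	b.segments = map[uint32]string{}
	return strings.Join(parts, " "), true
}

func main() {
	b := newTurnBuffer(800)
	b.addFinal(0, "I'd like to")
	b.addFinal(1, "book a table")
	if text, ok := b.onSilence(1000); ok {
		fmt.Println(text)
	}
}
```

In real use, `addFinal` would be called from OnAsrFinal with event.Index and event.Text, and `onSilence` from OnSilence with event.Duration.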

OnAsrDelta - ASR Delta Result Callback

Trigger: Triggered with intermediate results during speech recognition; the content may still change. See AsrDeltaEvent.

Purpose: Display recognition progress in real-time, improve user experience. Should not be used directly for business logic processing.

client.OnAsrDelta = func(event rustpbxgo.AsrDeltaEvent) {
    log.Printf("Recognizing: %s", event.Text)
    // Only for display, no business processing
}

OnTrackStart - Track Start Callback

Trigger: Triggered when a Track starts (RTP, TTS, file playback, etc.). See TrackStartEvent.

Purpose: Monitor audio playback start. For a TTS Track, the TTS command's playId is available via event.PlayId; for a Play Track, you can get the playback URL.

client.OnTrackStart = func(event rustpbxgo.TrackStartEvent) {
    // PlayId is a *string and is nil when the Track has no play ID
    if event.PlayId != nil {
        log.Printf("Track started: %s, PlayID: %s", event.TrackID, *event.PlayId)
    }
}

OnTrackEnd - Track End Callback

Trigger: Triggered when a Track ends (RTP ends, TTS completes, file playback completes, etc.). See TrackEndEvent.

Purpose: Monitor audio playback end; useful for controlling playback flow or cleaning up resources. The event provides the duration (event.Duration) and the play ID (event.PlayId).

client.OnTrackEnd = func(event rustpbxgo.TrackEndEvent) {
    playID := ""
    if event.PlayId != nil { // PlayId is a *string
        playID = *event.PlayId
    }
    log.Printf("Track ended: %s, duration: %d ms, PlayID: %s",
        event.TrackID, event.Duration, playID)
    // TTS playback completed, can send next one
}

OnInterruption - Interruption Callback

Trigger: Triggered when an Interrupt command is received while TTS playback is unfinished. See InterruptionEvent.

Purpose: Get interruption information, such as the time already played (event.Current) and the text already played (event.Subtitle, if the provider supports subtitles).

client.OnInterruption = func(event rustpbxgo.InterruptionEvent) {
    if event.Subtitle != nil {
        log.Printf("Playback interrupted, played: %s", *event.Subtitle)
    }
    // Record interruption position for subsequent processing
}

OnDTMF - DTMF Callback

Trigger: Triggered when a keypress is detected. See DTMFEvent.

Purpose: Handle user keypress input, such as IVR menu selection. Can get keypress value (0-9, *, #, A-D) from event.Digit.

client.OnDTMF = func(event rustpbxgo.DTMFEvent) {
    log.Printf("User pressed: %s", event.Digit)
    // Handle keypress logic, such as menu navigation
    if event.Digit == "1" {
        client.TTS("You selected option 1", "", "menu", true, false, nil, nil)
    }
}
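Beyond single-key menus, IVR flows often need multi-digit input such as an account number. A minimal sketch of collecting digits until a terminator key follows; `digitCollector` is a hypothetical helper, and the sketch assumes callbacks are invoked from a single goroutine so no locking is used.

```go
package main

import "fmt"

// digitCollector accumulates DTMF digits until a terminator key is pressed.
type digitCollector struct {
	terminator string
	buf        []byte
}

// push adds one digit. When the terminator arrives it returns the collected
// input and true, and resets itself for the next entry.
func (c *digitCollector) push(digit string) (string, bool) {
	if digit == c.terminator {
		out := string(c.buf)
		c.buf = c.buf[:0]
		return out, true
	}
	c.buf = append(c.buf, digit...)
	return "", false
}

func main() {
	c := &digitCollector{terminator: "#"}
	for _, d := range []string{"4", "2", "7", "#"} {
		if input, done := c.push(d); done {
			fmt.Println("collected:", input)
		}
	}
}
```

In OnDTMF, each event.Digit would be passed to `push`, and the collected input handled once it reports completion.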

OnError - Error Callback

Trigger: Triggered when an error occurs. See ErrorEvent.

Purpose: Handle error situations. event.Sender identifies the error source (asr, tts, media, etc.) and event.Error contains the error message.

client.OnError = func(event rustpbxgo.ErrorEvent) {
    log.Printf("Error [%s]: %s", event.Sender, event.Error)
    // Log error, perform fallback handling
}

OnMetrics - Metrics Callback

Trigger: Triggered when performance metrics are collected. See MetricsEvent.

Purpose: Monitor performance metrics. Can get metric name (event.Key) and duration (event.Duration) from the event.

client.OnMetrics = func(event rustpbxgo.MetricsEvent) {
    log.Printf("Metric [%s]: %d ms", event.Key, event.Duration)
    // Record performance data for analysis
}

OnClose - Connection Close

Trigger: Triggered when WebSocket connection closes.

Purpose: Handle disconnection: clean up resources or attempt to reconnect.

client.OnClose = func(reason string) {
    log.Printf("Connection closed: %s", reason)
    // Clean up resources or reconnect
}

OnEvent - Generic Event Handler

Trigger: Triggered when any event is received.

Purpose: Log all events or handle undefined special events. Receives raw event type (event) and JSON data (payload).

client.OnEvent = func(event string, payload string) {
    log.Printf("Event received [%s]: %s", event, payload)
    // Generic event logging or special handling
}
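When handling raw payloads in OnEvent, a two-step decode is a simple approach: read the event name first, then unmarshal into the matching struct. The sketch below copies the Event and DTMFEvent definitions from this page so that it compiles on its own; `decodeDTMF` is a hypothetical helper, and the wire name "dtmf" is an assumption that should be checked against real payloads.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of two structs documented on this page.
type Event struct {
	Event string `json:"event"`
}

type DTMFEvent struct {
	TrackID   string `json:"trackId"`
	Timestamp uint64 `json:"timestamp"`
	Digit     string `json:"digit"`
}

// decodeDTMF checks the event name first, then decodes the full payload.
// ASSUMPTION: the event name on the wire is "dtmf".
func decodeDTMF(payload string) (*DTMFEvent, error) {
	var base Event
	if err := json.Unmarshal([]byte(payload), &base); err != nil {
		return nil, err
	}
	if base.Event != "dtmf" {
		return nil, fmt.Errorf("unexpected event %q", base.Event)
	}
	var ev DTMFEvent
	if err := json.Unmarshal([]byte(payload), &ev); err != nil {
		return nil, err
	}
	return &ev, nil
}

func main() {
	ev, err := decodeDTMF(`{"event":"dtmf","trackId":"t1","timestamp":1700000000000,"digit":"5"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ev.TrackID, ev.Digit)
}
```

The same approach extends to event types that have no dedicated On callback: switch on the decoded name and unmarshal into the corresponding struct.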

Options

CallOption

Call configuration options:

type CallOption struct {
    Denoise          bool
    Offer            string
    Callee           string
    Caller           string
    Recorder         *RecorderOption
    VAD              *VADOption
    ASR              *ASROption
    TTS              *TTSOption
    HandshakeTimeout string
    EnableIPv6       bool
    Sip              *SipOption
    Extra            map[string]string
}

Field Description:

Field	Type	Description
`Denoise`	bool	Whether to enable noise reduction
`Offer`	string	SDP offer string for WebRTC/SIP negotiation
`Callee`	string	Callee's SIP URI or phone number
`Caller`	string	Caller's SIP URI or phone number
`Recorder`	*RecorderOption	Call recording configuration, see RecorderOption
`VAD`	*VADOption	Voice activity detection configuration, see VADOption
`ASR`	*ASROption	Automatic Speech Recognition (ASR) configuration, see ASROption
`TTS`	*TTSOption	Text-to-Speech configuration, see TTSOption
`HandshakeTimeout`	string	Connection handshake timeout
`EnableIPv6`	bool	Enable IPv6 support
`Sip`	*SipOption	SIP registration account, password, and domain configuration, see SipOption
`Extra`	map[string]string	Additional parameters

RecorderOption

Recording configuration options:

type RecorderOption struct {
    RecorderFile string
    Samplerate   int
    Ptime        int
}

Field Description:

Field	Type	Default	Description
`RecorderFile`	string	-	Recording file path
`Samplerate`	int	`16000`	Sample rate (Hz)
`Ptime`	int	`200`	Packet time (milliseconds)

ASROption

Speech recognition configuration options:

type ASROption struct {
    Provider        string
    Model           string
    Language        string
    AppID           string
    SecretID        string
    SecretKey       string
    ModelType       string
    BufferSize      int
    SampleRate      uint32
    Endpoint        string
    Extra           map[string]string
    StartWhenAnswer bool
}

Field Description:

Field	Type	Description
`Provider`	string	ASR provider: `tencent`, `aliyun`, `Deepgram`, etc.
`Model`	string	Model name
`Language`	string	Language (e.g.: `zh-CN`, `en-US`), see corresponding provider documentation for details
`AppID`	string	Tencent Cloud's appId
`SecretID`	string	Tencent Cloud's secretId
`SecretKey`	string	Tencent Cloud's secretKey, or other provider's API Key
`ModelType`	string	ASR model type (e.g.: `16k_zh`, `8k_en`), see provider documentation for details
`BufferSize`	int	Audio buffer size, unit: bytes
`SampleRate`	uint32	Sample rate
`Endpoint`	string	Custom service endpoint URL
`Extra`	map[string]string	Provider-specific parameters
`StartWhenAnswer`	bool	Request ASR service after call is answered

TTSOption

Text-to-speech synthesis configuration options:

type TTSOption struct {
    Samplerate       int32
    Provider         string
    Speed            float32
    AppID            string
    SecretID         string
    SecretKey        string
    Volume           int32
    Speaker          string
    Codec            string
    Subtitle         bool
    Emotion          string
    Endpoint         string
    Extra            map[string]string
    WaitInputTimeout uint32
}

Field Description:

Field	Type	Description
`Samplerate`	int32	Sample rate, unit: Hz
`Provider`	string	TTS provider: `tencent`, `aliyun`, `deepgram`, `voiceapi`
`Speed`	float32	Speech rate
`AppID`	string	Tencent Cloud's appId
`SecretID`	string	Tencent Cloud's secretId
`SecretKey`	string	Tencent Cloud's secretKey, or other provider's API Key
`Volume`	int32	Volume (1-10)
`Speaker`	string	Voice, see provider documentation
`Codec`	string	Encoding format
`Subtitle`	bool	Whether to enable subtitles
`Emotion`	string	Emotion: `neutral`, `happy`, `sad`, `angry`, etc.
`Endpoint`	string	Custom TTS service endpoint URL
`Extra`	map[string]string	Provider-specific parameters
`WaitInputTimeout`	uint32	Maximum time to wait for user input (milliseconds)

VADOption

Voice activity detection configuration options:

type VADOption struct {
    Type                  string
    Samplerate            uint32
    SpeechPadding         uint64
    SilencePadding        uint64
    Ratio                 float32
    VoiceThreshold        float32
    MaxBufferDurationSecs uint64
    Endpoint              string
    SecretKey             string
    SecretID              string
    SilenceTimeout        uint
}

Field Description:

Field	Type	Default	Description
`Type`	string	`webrtc`	VAD algorithm type: `silero`, `ten`, `webrtc`
`Samplerate`	uint32	`16000`	Sample rate
`SpeechPadding`	uint64	`250`	Start detection `speechPadding` milliseconds after speech starts
`SilencePadding`	uint64	`100`	Silence event trigger interval, unit: milliseconds
`Ratio`	float32	`0.5`	Speech detection ratio threshold
`VoiceThreshold`	float32	`0.5`	Voice energy threshold
`MaxBufferDurationSecs`	uint64	`50`	Maximum buffer duration, unit: seconds
`Endpoint`	string	-	Custom VAD service endpoint
`SecretKey`	string	-	VAD service authentication key
`SecretID`	string	-	VAD service authentication ID
`SilenceTimeout`	uint	`5000`	Silence detection timeout, unit: milliseconds

SipOption

SIP configuration options:

type SipOption struct {
    Username string
    Password string
    Realm    string
    Headers  map[string]string
}

Field Description:

Field	Type	Description
`Username`	string	SIP username for authentication
`Password`	string	SIP password for authentication
`Realm`	string	SIP domain/realm for authentication
`Headers`	map[string]string	Additional SIP protocol headers (key-value pairs)

ReferOption

Transfer configuration options:

type ReferOption struct {
    Denoise     bool
    Timeout     uint32
    MusicOnHold string
    AutoHangup  bool
    Sip         *SipOption
    ASR         *ASROption
}

Field Description:

Field	Type	Description
`Denoise`	bool	Whether to enable noise reduction
`Timeout`	uint32	Timeout (seconds)
`MusicOnHold`	string	Hold music URL
`AutoHangup`	bool	Automatically hang up after transfer completes
`Sip`	*SipOption	SIP configuration
`ASR`	*ASROption	ASR configuration

Event Types

All event type definitions supported by Client:

Event

Base event structure, contains event type name.

type Event struct {
    Event string `json:"event"`
}

Field Description:

Field	Type	Description
`Event`	string	Event type name

IncomingEvent

Incoming call event, triggered when there is a new incoming call.

type IncomingEvent struct {
    TrackID   string `json:"trackId"`
    Timestamp uint64 `json:"timestamp"`
    Caller    string `json:"caller"`
    Callee    string `json:"callee"`
    Sdp       string `json:"sdp"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Caller`	string	Caller number
`Callee`	string	Callee number
`Sdp`	string	SDP offer string

AnswerEvent

Answer event, triggered when the call is answered and SDP negotiation is complete.

type AnswerEvent struct {
    TrackID   string `json:"trackId"`
    Timestamp uint64 `json:"timestamp"`
    Sdp       string `json:"sdp"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Sdp`	string	SDP answer string

RejectEvent

Reject event, triggered when the call is rejected.

type RejectEvent struct {
    TrackID   string `json:"trackId"`
    Timestamp uint64 `json:"timestamp"`
    Reason    string `json:"reason"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Reason`	string	Rejection reason

RingingEvent

Ringing event, triggered when the call is ringing (SIP calls).

type RingingEvent struct {
    TrackID    string `json:"trackId"`
    Timestamp  uint64 `json:"timestamp"`
    EarlyMedia bool   `json:"earlyMedia"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`EarlyMedia`	bool	Whether early media is available

HangupEvent

Hangup event, triggered when the call ends.

type HangupEventAttendee struct {
    Username string `json:"username"`
    Realm    string `json:"realm"`
    Source   string `json:"source"`
}

type HangupEvent struct {
    Timestamp   uint64               `json:"timestamp"`
    Reason      string               `json:"reason"`
    Initiator   string               `json:"initiator"`
    StartTime   string               `json:"startTime,omitempty"`
    HangupTime  string               `json:"hangupTime,omitempty"`
    AnswerTime  *string              `json:"answerTime,omitempty"`
    RingingTime *string              `json:"ringingTime,omitempty"`
    From        *HangupEventAttendee `json:"from,omitempty"`
    To          *HangupEventAttendee `json:"to,omitempty"`
    Extra       map[string]any       `json:"extra,omitempty"`
}

HangupEvent Field Description:

Field	Type	Description
`Timestamp`	uint64	Event timestamp (milliseconds)
`Reason`	string	Hangup reason
`Initiator`	string	Party that initiated the hangup
`StartTime`	string	Call start time
`HangupTime`	string	Hangup time
`AnswerTime`	*string	Answer time
`RingingTime`	*string	Ringing time
`From`	*HangupEventAttendee	Caller information
`To`	*HangupEventAttendee	Callee information
`Extra`	map[string]any	Additional information

HangupEventAttendee Field Description:

Field	Type	Description
`Username`	string	Username
`Realm`	string	Domain
`Source`	string	Source

SpeakingEvent

Speaking event, triggered when VAD detects user starts speaking.

type SpeakingEvent struct {
    TrackID   string `json:"trackId"`
    Timestamp uint64 `json:"timestamp"`
    StartTime uint64 `json:"startTime"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`StartTime`	uint64	Speech start time (milliseconds)

SilenceEvent

Silence event, triggered when the user is detected to have stopped speaking.

type SilenceEvent struct {
    TrackID   string `json:"trackId"`
    Timestamp uint64 `json:"timestamp"`
    StartTime uint64 `json:"startTime"`
    Duration  uint64 `json:"duration"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`StartTime`	uint64	Silence start time (milliseconds)
`Duration`	uint64	Silence duration (milliseconds)

EouEvent

End of utterance event, triggered when end of speech is detected.

type EouEvent struct {
    TrackID   string `json:"trackId"`
    Timestamp uint64 `json:"timestamp"`
    Complete  bool   `json:"complete"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Complete`	bool	Whether it ended completely

AsrFinalEvent

ASR final result event, triggered when speech recognition obtains a stable result.

type AsrFinalEvent struct {
    TrackID   string  `json:"trackId"`
    Timestamp uint64  `json:"timestamp"`
    Index     uint32  `json:"index"`
    StartTime *uint64 `json:"startTime,omitempty"`
    EndTime   *uint64 `json:"endTime,omitempty"`
    Text      string  `json:"text"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Index`	uint32	Speech segment index
`StartTime`	*uint64	Speech start time (milliseconds)
`EndTime`	*uint64	Speech end time (milliseconds)
`Text`	string	Recognized text content

AsrDeltaEvent

ASR delta result event, carrying an intermediate result during speech recognition; the content may still change.

type AsrDeltaEvent struct {
    TrackID   string  `json:"trackId"`
    Index     uint32  `json:"index"`
    Timestamp uint64  `json:"timestamp"`
    StartTime *uint64 `json:"startTime,omitempty"`
    EndTime   *uint64 `json:"endTime,omitempty"`
    Text      string  `json:"text"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Index`	uint32	Speech segment index
`Timestamp`	uint64	Event timestamp (milliseconds)
`StartTime`	*uint64	Speech start time (milliseconds)
`EndTime`	*uint64	Speech end time (milliseconds)
`Text`	string	Recognized text content (may change)

TrackStartEvent

Track start event, triggered when a Track starts (RTP, TTS, file playback, etc.).

type TrackStartEvent struct {
    TrackID   string  `json:"trackId"`
    Timestamp uint64  `json:"timestamp"`
    PlayId    *string `json:"playId,omitempty"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`PlayId`	*string	Play ID (TTS/Play command)

TrackEndEvent

Track end event, triggered when a Track ends (RTP ends, TTS completes, file playback completes, etc.).

type TrackEndEvent struct {
    TrackID   string  `json:"trackId"`
    Timestamp uint64  `json:"timestamp"`
    Duration  uint64  `json:"duration"`
    PlayId    *string `json:"playId,omitempty"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Duration`	uint64	Playback duration (milliseconds)
`PlayId`	*string	Play ID (TTS/Play command)

InterruptionEvent

Interruption event, triggered when an Interrupt command is received while TTS playback is unfinished.

type InterruptionEvent struct {
    TrackID       string  `json:"trackId"`
    Timestamp     uint64  `json:"timestamp"`
    Subtitle      *string `json:"subtitle,omitempty"`
    Position      *uint32 `json:"position,omitempty"`
    TotalDuration uint32  `json:"totalDuration"`
    Current       uint32  `json:"current"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Subtitle`	*string	Played subtitle text
`Position`	*uint32	Playback position (character count)
`TotalDuration`	uint32	Total duration (milliseconds)
`Current`	uint32	Current playback duration (milliseconds)
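Current and TotalDuration together indicate how much of the prompt the user actually heard. A small sketch of using them follows; `playedFraction` and `shouldReplay` are hypothetical helpers, and the 50% threshold is an arbitrary policy chosen for illustration.

```go
package main

import "fmt"

// playedFraction returns how much of the TTS audio was played, in the range [0, 1].
// It returns 0 when the total duration is unknown (zero).
func playedFraction(currentMs, totalMs uint32) float64 {
	if totalMs == 0 {
		return 0
	}
	f := float64(currentMs) / float64(totalMs)
	if f > 1 {
		f = 1
	}
	return f
}

// shouldReplay is a hypothetical policy: replay the prompt if less than
// half of it was heard before the interruption.
func shouldReplay(currentMs, totalMs uint32) bool {
	return playedFraction(currentMs, totalMs) < 0.5
}

func main() {
	fmt.Println(playedFraction(1500, 6000), shouldReplay(1500, 6000)) // 0.25 true
}
```

In OnInterruption these would be called with event.Current and event.TotalDuration.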

DTMFEvent

DTMF event, triggered when a keypress is detected.

type DTMFEvent struct {
    TrackID   string `json:"trackId"`
    Timestamp uint64 `json:"timestamp"`
    Digit     string `json:"digit"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Digit`	string	Keypress value (0-9, *, #, A-D)

AnswerMachineDetectionEvent

Answering machine detection event, triggered when an answering machine is detected.

type AnswerMachineDetectionEvent struct {
    Timestamp uint64 `json:"timestamp"`
    StartTime uint64 `json:"startTime"`
    EndTime   uint64 `json:"endTime"`
    Text      string `json:"text"`
}

Field Description:

Field	Type	Description
`Timestamp`	uint64	Event timestamp (milliseconds)
`StartTime`	uint64	Detection start time (milliseconds)
`EndTime`	uint64	Detection end time (milliseconds)
`Text`	string	Detected text

LLMFinalEvent

LLM final result event, triggered when the large language model generates its final result.

type LLMFinalEvent struct {
    Timestamp uint64 `json:"timestamp"`
    Text      string `json:"text"`
}

Field Description:

Field	Type	Description
`Timestamp`	uint64	Event timestamp (milliseconds)
`Text`	string	Final text generated by LLM

LLMDeltaEvent

LLM delta result event, triggered when the large language model generates an incremental (delta) result.

type LLMDeltaEvent struct {
    Timestamp uint64 `json:"timestamp"`
    Word      string `json:"word"`
}

Field Description:

Field	Type	Description
`Timestamp`	uint64	Event timestamp (milliseconds)
`Word`	string	Delta word generated by LLM

MetricsEvent

Metrics event, triggered when performance metrics are collected.

type MetricsEvent struct {
    Timestamp uint64         `json:"timestamp"`
    Key       string         `json:"key"`
    Duration  uint32         `json:"duration"`
    Data      map[string]any `json:"data"`
}

Field Description:

Field	Type	Description
`Timestamp`	uint64	Event timestamp (milliseconds)
`Key`	string	Metric name
`Duration`	uint32	Duration (milliseconds)
`Data`	map[string]any	Metric data

ErrorEvent

Error event, triggered when an error occurs.

type ErrorEvent struct {
    TrackID   string  `json:"trackId"`
    Timestamp uint64  `json:"timestamp"`
    Sender    string  `json:"sender"`
    Error     string  `json:"error"`
    Code      *uint32 `json:"code,omitempty"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Sender`	string	Error source (asr, tts, media, etc.)
`Error`	string	Error information
`Code`	*uint32	Error code

AddHistoryEvent

Add history event, triggered when conversation history is added.

type AddHistoryEvent struct {
    Sender    string `json:"sender"`
    Timestamp uint64 `json:"timestamp"`
    Speaker   string `json:"speaker"`
    Text      string `json:"text"`
}

Field Description:

Field	Type	Description
`Sender`	string	Sender
`Timestamp`	uint64	Event timestamp (milliseconds)
`Speaker`	string	Speaker identifier
`Text`	string	Conversation text

OtherEvent

Other event, a catch-all used for event types that have no dedicated definition.

type OtherEvent struct {
    TrackID   string            `json:"trackId"`
    Timestamp uint64            `json:"timestamp"`
    Sender    string            `json:"sender"`
    Extra     map[string]string `json:"extra,omitempty"`
}

Field Description:

Field	Type	Description
`TrackID`	string	Call track ID
`Timestamp`	uint64	Event timestamp (milliseconds)
`Sender`	string	Sender
`Extra`	map[string]string	Additional information
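
Extra is tagged omitempty, so it may be a nil map after decoding. Reading from a nil map is safe in Go, which keeps lookups simple. The sketch below wraps that lookup in a hypothetical extraOr helper with a fallback value; the key names are invented.

```go
package main

import "fmt"

// OtherEvent is redeclared here so the sketch runs without the SDK.
type OtherEvent struct {
	TrackID   string            `json:"trackId"`
	Timestamp uint64            `json:"timestamp"`
	Sender    string            `json:"sender"`
	Extra     map[string]string `json:"extra,omitempty"`
}

// extraOr returns Extra[key], or fallback when the key is absent.
// A nil Extra map is handled by the same lookup.
func extraOr(ev OtherEvent, key, fallback string) string {
	if v, ok := ev.Extra[key]; ok {
		return v
	}
	return fallback
}

func main() {
	withExtra := OtherEvent{Sender: "demo", Extra: map[string]string{"note": "custom"}}
	withoutExtra := OtherEvent{Sender: "demo"}
	fmt.Println(extraOr(withExtra, "note", "none"))
	fmt.Println(extraOr(withoutExtra, "note", "none"))
}
```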

Complete Examples

SIP Call Example

package main

import (
    "context"
    "log"
    "os"
    "os/signal"
    "syscall"
    
    "github.com/restsend/rustpbxgo"
    "github.com/sirupsen/logrus"
)

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    
    logger := logrus.New()
    logger.SetLevel(logrus.InfoLevel)
    
    // Create client
    client := rustpbxgo.NewClient(
        "ws://localhost:8080",
        rustpbxgo.WithLogger(logger),
        rustpbxgo.WithContext(ctx),
    )
    
    // Set event handlers
    client.OnAnswer = func(event rustpbxgo.AnswerEvent) {
        logger.Info("Call answered")
        // Send welcome message
        if err := client.TTS("Hello, welcome to the call", "", "greeting", true, false, nil, nil); err != nil {
            logger.Errorf("TTS failed: %v", err)
        }
    }
    
    client.OnAsrFinal = func(event rustpbxgo.AsrFinalEvent) {
        logger.Infof("User said: %s", event.Text)
        // Respond based on user input
        if err := client.TTS("I received your message", "", "response", true, false, nil, nil); err != nil {
            logger.Errorf("TTS failed: %v", err)
        }
    }
    
    client.OnHangup = func(event rustpbxgo.HangupEvent) {
        logger.Infof("Call ended: %s", event.Reason)
        cancel()
    }
    
    // Connect to server
    if err := client.Connect("sip"); err != nil {
        log.Fatalf("Connection failed: %v", err)
    }
    defer client.Shutdown()
    
    // Configure call
    callOption := rustpbxgo.CallOption{
        Caller: "sip:1000@example.com",
        Callee: "sip:2000@example.com",
        Denoise: true,
        Sip: &rustpbxgo.SipOption{
            Username: "user",
            Password: "pass",
            Realm: "example.com",
        },
        ASR: &rustpbxgo.ASROption{
            Provider: "tencent",
            Language: "zh-CN",
        },
        TTS: &rustpbxgo.TTSOption{
            Provider: "tencent",
            Speaker: "xiaoyan",
        },
        VAD: &rustpbxgo.VADOption{
            Type: "webrtc",
            SilenceTimeout: 5000,
        },
    }
    
    // Initiate call
    _, err := client.Invite(ctx, callOption)
    if err != nil {
        log.Fatalf("Call failed: %v", err)
    }
    
    // Wait for signal
    sigChan := make(chan os.Signal, 1)
    signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
    
    select {
    case <-ctx.Done():
        logger.Info("Call ended")
    case <-sigChan:
        logger.Info("Interrupt signal received")
        client.Hangup("user_interrupt")
    }
}

Answer Incoming Call Example

package main

import (
    "context"
    "encoding/json"
    "log"
    "net/http"
    
    "github.com/restsend/rustpbxgo"
    "github.com/sirupsen/logrus"
)

type WebhookRequest struct {
    DialogID string `json:"dialogId"`
    Caller   string `json:"caller"`
    Callee   string `json:"callee"`
}

func main() {
    logger := logrus.New()
    
    // Set up Webhook handler
    http.HandleFunc("/webhook", func(w http.ResponseWriter, r *http.Request) {
        var req WebhookRequest
        if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
            http.Error(w, err.Error(), http.StatusBadRequest)
            return
        }
        
        logger.Infof("Incoming call: %s -> %s", req.Caller, req.Callee)
        
        // Handle incoming call
        go handleIncomingCall(req.DialogID, req.Caller, req.Callee, logger)
        
        w.WriteHeader(http.StatusOK)
    })
    
    log.Fatal(http.ListenAndServe(":8090", nil))
}

func handleIncomingCall(dialogID, caller, callee string, logger *logrus.Logger) {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    
    // Create client with dialogID
    client := rustpbxgo.NewClient(
        "ws://localhost:8080",
        rustpbxgo.WithLogger(logger),
        rustpbxgo.WithContext(ctx),
        rustpbxgo.WithID(dialogID),
    )
    
    client.OnHangup = func(event rustpbxgo.HangupEvent) {
        logger.Info("Call ended")
        cancel()
    }
    
    // Connect to server
    if err := client.Connect("sip"); err != nil {
        logger.Errorf("Connection failed: %v", err)
        return
    }
    defer client.Shutdown()
    
    // Send ringing
    recorder := &rustpbxgo.RecorderOption{
        RecorderFile: "/recordings/" + dialogID + ".wav",
        Samplerate: 16000,
    }
    client.Ringing("", recorder)
    
    // Answer incoming call
    callOption := rustpbxgo.CallOption{
        Caller: caller,
        Callee: callee,
        ASR: &rustpbxgo.ASROption{
            Provider: "tencent",
            Language: "zh-CN",
        },
        TTS: &rustpbxgo.TTSOption{
            Provider: "tencent",
            Speaker: "xiaoyan",
        },
    }
    
    if err := client.Accept(callOption); err != nil {
        logger.Errorf("Answer failed: %v", err)
        return
    }
    
    // Send welcome message
    if err := client.TTS("Hello, I am an intelligent assistant", "", "greeting", true, false, nil, nil); err != nil {
        logger.Errorf("TTS failed: %v", err)
    }
    
    // Wait for call to end
    <-ctx.Done()
}

Streaming TTS Example (LLM Integration)

package main

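import (
    "bufio"
    "context"
    "encoding/json"
    "log"
    "net/http"
    "strings"
    
    "github.com/restsend/rustpbxgo"
    "github.com/sirupsen/logrus"
)

func streamLLMResponse(client *rustpbxgo.Client, userInput string) {
    // Simulate calling an LLM API. The request body is omitted here; a real
    // request would carry userInput and the required authentication headers.
    req, err := http.NewRequest("POST", "https://api.openai.com/v1/chat/completions", nil)
    if err != nil {
        log.Printf("Failed to build LLM request: %v", err)
        return
    }
    req.Header.Set("Content-Type", "application/json")
    
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Printf("LLM request failed: %v", err)
        return
    }
    defer resp.Body.Close()
    
    playID := "llm-stream"
    scanner := bufio.NewScanner(resp.Body)
    
    for scanner.Scan() {
        // SSE payload lines carry a "data:" prefix; strip it before decoding
        line := strings.TrimSpace(strings.TrimPrefix(scanner.Text(), "data:"))
        
        // Parse SSE data; blank lines and non-JSON markers are skipped
        var data map[string]interface{}
        if err := json.Unmarshal([]byte(line), &data); err != nil {
            continue
        }
        
        // Get delta text (simplified: assumes a flat object with
        // "content" and "finish_reason" keys)
        if content, ok := data["content"].(string); ok && content != "" {
            // Send streaming TTS
            isEnd := data["finish_reason"] != nil
            if err := client.StreamTTS(content, "", playID, isEnd, false, nil, nil); err != nil {
                log.Printf("StreamTTS failed: %v", err)
                return
            }
        }
    }
    if err := scanner.Err(); err != nil {
        log.Printf("Reading LLM stream failed: %v", err)
    }
}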

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    
    logger := logrus.New()
    
    client := rustpbxgo.NewClient(
        "ws://localhost:8080",
        rustpbxgo.WithLogger(logger),
        rustpbxgo.WithContext(ctx),
    )
    
    client.OnAsrFinal = func(event rustpbxgo.AsrFinalEvent) {
        logger.Infof("User input: %s", event.Text)
        
        // Interrupt current playback
        client.Interrupt()
        
        // Stream response
        go streamLLMResponse(client, event.Text)
    }
    
    // ... other code
}

Best Practices

Error Handling

Always check errors and handle them appropriately:

if err := client.Connect("sip"); err != nil {
    log.Fatalf("Connection failed: %v", err)
}

if err := client.TTS("Hello", "", "1", true, false, nil, nil); err != nil {
    log.Printf("TTS failed: %v", err)
    // Retry or use fallback
}
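
The "retry or use fallback" comment above can be made concrete with a small retry helper. This is a sketch under assumptions: retry, errTemporary, and the send function are hypothetical stand-ins, and in real code the function passed in would wrap a call such as client.TTS. Whether retrying a command is safe depends on the command, since a retried TTS request could be played twice if the first attempt actually reached the server.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporary is a stand-in for a transient failure.
var errTemporary = errors.New("temporary failure")

// retry calls fn up to attempts times, doubling delay after each failure.
// It returns nil on the first success, or the last error wrapped.
func retry(attempts int, delay time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		// Sleep only between attempts, not after the last one.
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Stand-in for a command such as client.TTS: fails twice, then succeeds.
	send := func() error {
		calls++
		if calls < 3 {
			return errTemporary
		}
		return nil
	}
	err := retry(5, 10*time.Millisecond, send)
	fmt.Println("calls:", calls, "err:", err)
}
```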

Resource Cleanup

Call Shutdown when the client is no longer needed, typically with defer right after a successful Connect:

defer client.Shutdown()

Context Management

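Use a context to control the client's lifecycle. Cancelling the context, or letting it time out, closes the goroutines created by the client: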

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()

client := rustpbxgo.NewClient(
    endpoint,
    rustpbxgo.WithContext(ctx),
)
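
To illustrate how a deadline bounds a call, the sketch below waits for either an "ended" signal or context expiry, whichever comes first. It does not use the SDK: waitForEnd and runWithLimit are hypothetical helpers, and the ended channel stands in for whatever an OnHangup handler would signal.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForEnd blocks until the call ends or ctx is done, whichever is first.
func waitForEnd(ctx context.Context, ended <-chan struct{}) error {
	select {
	case <-ended:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWithLimit is a hypothetical wrapper that applies a deadline in milliseconds.
func runWithLimit(limitMillis int, ended <-chan struct{}) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(limitMillis)*time.Millisecond)
	defer cancel()
	return waitForEnd(ctx, ended)
}

func main() {
	ended := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(ended) // stand-in for what an OnHangup handler might do
	}()
	fmt.Println("ended in time:", runWithLimit(500, ended))
	fmt.Println("never ended:", runWithLimit(20, make(chan struct{})))
}
```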

Logging

Use appropriate log levels:

logger := logrus.New()
logger.SetLevel(logrus.InfoLevel) // Production environment
// logger.SetLevel(logrus.DebugLevel) // Development environment

Event Handling

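Avoid long-running operations in event handlers. Callbacks are invoked from the goroutine that processes messages, so a slow handler delays every event behind it; run the slow work in a goroutine instead: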

client.OnAsrFinal = func(event rustpbxgo.AsrFinalEvent) {
    go func() {
        // Long-running operation
        response := processWithLLM(event.Text)
        client.TTS(response, "", "reply", true, false, nil, nil)
    }()
}
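
Starting one goroutine per event keeps handlers fast, but replies may then finish out of order. Where ordering matters, one alternative is a single worker fed by a buffered channel. The sketch below shows that pattern without the SDK; textQueue and its methods are hypothetical, and the handler body stands in for the LLM call and TTS command.

```go
package main

import (
	"fmt"
	"sync"
)

// textQueue runs handle for each queued text on one worker goroutine,
// so items are processed in the order they were enqueued.
type textQueue struct {
	jobs chan string
	wg   sync.WaitGroup
}

func newTextQueue(size int, handle func(string)) *textQueue {
	q := &textQueue{jobs: make(chan string, size)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for text := range q.jobs {
			handle(text)
		}
	}()
	return q
}

// Enqueue never blocks the caller; it reports false when the buffer is full.
func (q *textQueue) Enqueue(text string) bool {
	select {
	case q.jobs <- text:
		return true
	default:
		return false
	}
}

// Close stops accepting work and waits for queued items to finish.
func (q *textQueue) Close() {
	close(q.jobs)
	q.wg.Wait()
}

func main() {
	var handled []string
	q := newTextQueue(8, func(text string) {
		// Stand-in for the slow work (LLM call, then TTS).
		handled = append(handled, "reply to: "+text)
	})
	// In a real program, an OnAsrFinal handler would call q.Enqueue(event.Text).
	q.Enqueue("first")
	q.Enqueue("second")
	q.Close()
	fmt.Println(handled)
}
```

Enqueue returns immediately even when the worker is busy, so the event handler stays short; the caller decides what to do when the buffer is full.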

Troubleshooting

Connection Issues

If the client cannot connect to the server:

Check if endpoint URL is correct
Confirm server is running
Check firewall settings
Enable debug logging to see detailed information

logger.SetLevel(logrus.DebugLevel)
client := rustpbxgo.NewClient(
    endpoint,
    rustpbxgo.WithLogger(logger),
    rustpbxgo.WithDumpEvents(true),
)
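
The first item in the checklist, a malformed endpoint, can be caught before connecting. The sketch below is a hypothetical validateEndpoint helper built on net/url. It assumes the endpoint must be a ws:// or wss:// URL, which matches the examples in this document but is not stated as a requirement by the SDK.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateEndpoint checks that endpoint looks like a WebSocket URL.
// Assumption: only the ws and wss schemes are acceptable.
func validateEndpoint(endpoint string) error {
	u, err := url.Parse(endpoint)
	if err != nil {
		return fmt.Errorf("parse endpoint: %w", err)
	}
	if u.Scheme != "ws" && u.Scheme != "wss" {
		return fmt.Errorf("unexpected scheme %q, want ws or wss", u.Scheme)
	}
	if u.Host == "" {
		return errors.New("endpoint has no host")
	}
	return nil
}

func main() {
	for _, ep := range []string{"ws://localhost:8080", "http://localhost:8080", "ws://"} {
		if err := validateEndpoint(ep); err != nil {
			fmt.Printf("%s: invalid: %v\n", ep, err)
			continue
		}
		fmt.Printf("%s: ok\n", ep)
	}
}
```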

ASR Not Working

If ASR cannot recognize speech:

Confirm ASR configuration is correct
Check if API keys are valid
Verify sample rate settings
Check VAD configuration

TTS No Sound

If TTS produces no sound:

Check if TTS configuration is correct
Verify speaker parameter
Confirm endOfStream is set correctly
Check network connection

More Resources

Reference

For complete example code, see the cmd/ directory of the rustpbxgo repository.
