Code Explanation
In the previous chapter, we successfully ran a voice assistant.
This chapter walks through its code in detail to show how to use the RustPBX SDK.
Project address: https://github.com/restsend/rustpbxgo
Directory Structure
rustpbxgo/
├── README.md
├── client.go          # SDK core definition
├── cmd/               # Example application
│   ├── main.go        # Program entry
│   ├── llm.go         # Large model interaction
│   ├── media.go       # WebRTC
│   └── webhook.go     # WebHook handling
├── go.mod
└── go.sum
- client.go: Contains the definition of Client, the core data structure of RustPBX Go, and its methods.
- cmd/ directory: Voice assistant example code, including SIP/WebRTC calls, incoming call handling with webhooks, and large model interaction logic.
client.go - Client Definition
Connect to RustPBX
- NewClient: Create a client
  - endpoint: the RustPBX server address
- Connect: Connect to the server
  - callType: the call type (here we use "webrtc")
- Shutdown: Close the client
Send Commands
RustPBXGo sends commands to RustPBX through method calls: each command has a corresponding method.
The commands used here include:
- Invite: Initiate a call
  - For WebRTC calls, set offer in CallOption to the SDP offer of the call target
- TTS: Call the text-to-speech service and play the result
- Play: Play an audio file from a URL
- Interrupt: Interrupt the current playback (TTS and file playback)
- Hangup: Hang up
Register Callback Functions
During the entire process from call establishment to termination, various events are triggered, as shown in the diagram:
To handle an event, set the corresponding callback function.
For example, to handle AsrFinal events (stable speech recognition results), you can set a callback function in the OnAsrFinal field.
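The callback-field pattern described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the SDK's implementation: the Client and AsrFinalEvent types here are hypothetical stand-ins, and only the OnAsrFinal field name and the Text field come from this chapter.

```go
package main

import "fmt"

// AsrFinalEvent is a hypothetical stand-in for the SDK event type.
// Only the Text field is taken from the tutorial.
type AsrFinalEvent struct {
	Text string
}

// Client is a hypothetical stand-in that exposes one callback field.
type Client struct {
	OnAsrFinal func(event AsrFinalEvent)
}

// dispatchAsrFinal invokes the callback only if one has been registered.
func (c *Client) dispatchAsrFinal(event AsrFinalEvent) {
	if c.OnAsrFinal != nil {
		c.OnAsrFinal(event)
	}
}

// collectTranscripts registers a callback, feeds events through the
// client, and returns the texts the callback received.
func collectTranscripts(texts []string) []string {
	var got []string
	c := &Client{}
	c.OnAsrFinal = func(event AsrFinalEvent) {
		got = append(got, event.Text)
	}
	for _, t := range texts {
		c.dispatchAsrFinal(AsrFinalEvent{Text: t})
	}
	return got
}

func main() {
	fmt.Println(collectTranscripts([]string{"hello", "goodbye"}))
}
```

Checking for a nil callback before the call means that events without a registered handler are simply ignored.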
Flow Chart
When the client calls the Connect method, it creates two goroutines (green parts in the diagram below):
- One reads WebSocket messages and parses them (top)
- One processes messages and sends commands (bottom). When it receives an event, it calls the processEvent method, which calls the corresponding callback function based on the event type.
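The two-goroutine layout can be approximated with a channel between a reader and a processor. The sketch below is an assumption-laden illustration rather than the code in client.go: a string slice stands in for the WebSocket connection, and the event JSON layout and the "asrFinal" event name are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// event is a hypothetical wire format; the field names are assumptions.
type event struct {
	Event string `json:"event"`
	Text  string `json:"text"`
}

// handlers maps an event type to its callback.
type handlers map[string]func(event)

// run wires a reader goroutine to a processor goroutine through a channel.
// rawMessages stands in for the WebSocket connection.
func run(rawMessages []string, h handlers) {
	events := make(chan event)
	var wg sync.WaitGroup
	wg.Add(2)

	// Reader: parse each raw message and forward it.
	go func() {
		defer wg.Done()
		defer close(events)
		for _, raw := range rawMessages {
			var ev event
			if err := json.Unmarshal([]byte(raw), &ev); err != nil {
				continue // skip malformed messages
			}
			events <- ev
		}
	}()

	// Processor: look up the callback for each event type and call it.
	go func() {
		defer wg.Done()
		for ev := range events {
			if cb, ok := h[ev.Event]; ok {
				cb(ev)
			}
		}
	}()

	wg.Wait()
}

func main() {
	h := handlers{
		"asrFinal": func(ev event) { fmt.Println("final:", ev.Text) },
	}
	run([]string{`{"event":"asrFinal","text":"hello"}`, `not json`}, h)
}
```

Separating reading from processing keeps a slow callback from blocking the parsing logic, at the cost of one channel hand-off per message.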
In the next section, we will see how to use these APIs to send commands and handle events.
main.go - Example Application Entry
main.go is the program entry point. Its main logic includes the following steps:
Create WebRTCPeer
Create a WebRTCPeer and generate an SDP offer, then write the SDP offer into the callOption.Offer field:
localSdp, err := mediaHandler.Setup(codec, iceServers)
if err != nil {
	logger.Fatalf("Failed to get local SDP: %v", err)
}
logger.Infof("Offer SDP: %v", localSdp)
callOption.Offer = localSdp
WebRTC calls must set callOption.Offer, while SIP calls must set callOption.Caller and callOption.Callee.
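The per-call-type requirements can be expressed as a small validation step before connecting. This is a sketch under stated assumptions: CallOption here is a hypothetical stand-in that keeps only the three fields named in this chapter, validate is not part of the SDK, and the "sip" call type string is an assumption (only "webrtc" appears in the tutorial).

```go
package main

import (
	"errors"
	"fmt"
)

// CallOption is a hypothetical stand-in; only the Offer, Caller and
// Callee field names come from the tutorial.
type CallOption struct {
	Offer  string
	Caller string
	Callee string
}

// validate checks that the fields required by the call type are set.
// The "sip" call type string is an assumption.
func validate(callType string, opt CallOption) error {
	switch callType {
	case "webrtc":
		if opt.Offer == "" {
			return errors.New("webrtc call requires Offer")
		}
	case "sip":
		if opt.Caller == "" || opt.Callee == "" {
			return errors.New("sip call requires Caller and Callee")
		}
	default:
		return fmt.Errorf("unknown call type %q", callType)
	}
	return nil
}

func main() {
	fmt.Println(validate("webrtc", CallOption{Offer: "v=0"}))
	fmt.Println(validate("sip", CallOption{Caller: "alice"}))
}
```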
Create Client, and Register Callback Functions
Use the NewClient function to create a client. The main parameter is option.Endpoint, which sets the RustPBX server address.
The example mainly handles AsrFinal events by assigning a callback function to the OnAsrFinal field.
The callback calls the LLMHandler.QueryStream method to pass the recognition result, event.Text, to the large model.
client.OnAsrFinal = func(event rustpbxgo.AsrFinalEvent) {
	...
	response, err := option.LLMHandler.QueryStream(option.OpenaiModel, event.Text, option.StreamingTTS, client, option.ReferCaller)
	...
}
Connect/Disconnect RustPBX
err = client.Connect(callType)
if err != nil {
	logger.Fatalf("Failed to connect to server: %v", err)
}
defer client.Shutdown()
This example places a WebRTC call, so the callType parameter is "webrtc".
Send Welcome Message Using TTS Command
client.TTS(greeting, "", "1", true, false, nil, nil)
The greeting value is read from the command line and passed as the text parameter of the TTS command, that is, the text to be converted to speech.
llm.go - Large Model Interaction
llm.go is responsible for interacting with large models, including defining tools, making requests, and handling responses.
A response can contain text and tool calls. Text is played back through TTS commands.
Here we choose Go OpenAI as the large model client.
Define Tools
What is Tool Calling?
Tool Calling (also called Function Calling) allows an LLM to call externally defined functions when needed. For example, we wrap the Hangup and Refer commands as tools:
- When the user says "transfer me to human", the LLM calls the Refer tool to transfer the call
- When the user says "goodbye", the LLM calls the Hangup tool to hang up
Here we define two tools: hangup and refer:
var hangupDefinition = openai.FunctionDefinition{
	Name:        "hangup",
	Description: "End the conversation and hang up the call",
	Parameters: json.RawMessage(`{
		"type": "object",
		"properties": {
			"reason": {
				"type": "string",
				"description": "Reason for hanging up the call"
			}
		},
		"required": []
	}`),
}
var referDefinition = openai.FunctionDefinition{
	Name:        "refer",
	Description: "Refer the call to another target",
	Parameters: json.RawMessage(`{
		"type": "object",
		"properties": {
		},
		"required": []
	}`),
}
When the large model's response contains tool calls, we will call the Hangup and Refer methods respectively.
These two methods will then send Hangup and Refer commands to RustPBX.
Make Request
Here we use the CreateChatCompletionStream method to make requests.
stream, err := h.client.CreateChatCompletionStream(h.ctx, request)
if err != nil {
	return "", err
}
defer stream.Close()
Since we set Stream: true in the request, it will return a streaming response.
request := openai.ChatCompletionRequest{
	Model:       model,
	Messages:    h.messages,
	Temperature: 0.7,
	Stream:      true,
}
Streaming responses are delivered using Server-Sent Events (SSE). Each call to the stream.Recv() method returns a partial result.
Handle Response
When the LLM decides to call a tool, the response contains a toolCall.
We determine which tool to call based on the toolCall.Function.Name field, then call the corresponding tool.
if len(response.Choices) > 0 && len(response.Choices[0].Delta.ToolCalls) > 0 {
	for _, toolCall := range response.Choices[0].Delta.ToolCalls {
		// Dispatch on the tool name chosen by the model.
		switch toolCall.Function.Name {
		case "hangup":
			if err := tools.HandleHangup("LLM requested hangup"); err != nil {
				return "", err
			}
		case "refer":
			if err := tools.HandleRefer(); err != nil {
				return "", err
			}
		}
	}
}
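With streaming responses, a tool call generally arrives in fragments: the function name tends to appear in the first fragment and the JSON arguments are split across later ones. The sketch below shows one way to reassemble them before acting. It uses simplified, hypothetical types (toolCallDelta, toolCall) and not the client library's own structs.

```go
package main

import (
	"fmt"
	"sort"
)

// toolCallDelta is a simplified stand-in for one streamed fragment.
type toolCallDelta struct {
	Index     int    // which tool call this fragment belongs to
	Name      string // usually set only on the first fragment
	Arguments string // partial JSON text
}

// toolCall is a fully assembled call.
type toolCall struct {
	Name      string
	Arguments string
}

// assemble merges fragments by index and returns the calls in index order.
func assemble(deltas []toolCallDelta) []toolCall {
	byIndex := map[int]*toolCall{}
	for _, d := range deltas {
		tc, ok := byIndex[d.Index]
		if !ok {
			tc = &toolCall{}
			byIndex[d.Index] = tc
		}
		if d.Name != "" {
			tc.Name = d.Name
		}
		tc.Arguments += d.Arguments
	}
	indexes := make([]int, 0, len(byIndex))
	for i := range byIndex {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)
	calls := make([]toolCall, 0, len(indexes))
	for _, i := range indexes {
		calls = append(calls, *byIndex[i])
	}
	return calls
}

func main() {
	calls := assemble([]toolCallDelta{
		{Index: 0, Name: "hangup"},
		{Index: 0, Arguments: `{"rea`},
		{Index: 0, Arguments: `son":"bye"}`},
	})
	fmt.Printf("%+v\n", calls)
}
```

For tools without arguments, such as refer in this chapter, acting on the name alone is enough; assembling first matters once a tool reads its arguments.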
For final text responses, we play them through TTS commands.
for {
	response, err := stream.Recv()
	if err == io.EOF {
		break // Stream ended
	}
	if err != nil {
		return "", err // Any other error aborts the request
	}
	if len(response.Choices) == 0 {
		continue // Nothing to read in this chunk
	}
	// Get text content
	content := response.Choices[0].Delta.Content
	if content != "" {
		fullResponse += content
		// Immediately send to TTS (streaming playback).
		// endOfStream is true only for the last segment; isFinished is set outside this excerpt.
		ttsWriter.Write(content, isFinished, false)
	}
}
The parameters of ttsWriter.Write are:
- text: Text segment to play
- endOfStream: Whether it is the last segment
- autoHangup: Whether to hang up automatically after playback completes
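Setting endOfStream correctly requires knowing that a segment is the last one before it is written. One possible approach, shown here as a sketch and not as the example application's method, is to hold back one chunk so that the final chunk is only written once the input has ended. The forward function and the segment type are hypothetical; the write callback stands in for a TTS writer.

```go
package main

import "fmt"

// segment records one call to a hypothetical TTS writer.
type segment struct {
	Text        string
	EndOfStream bool
}

// forward reads text chunks from in and passes each to write, setting
// endOfStream only on the last one. It holds back one chunk so that it
// knows whether more chunks follow.
func forward(in <-chan string, write func(text string, endOfStream bool)) {
	pending, ok := <-in
	if !ok {
		return // empty stream: nothing to play
	}
	for next := range in {
		write(pending, false)
		pending = next
	}
	write(pending, true)
}

func main() {
	in := make(chan string, 3)
	in <- "Hello, "
	in <- "how can "
	in <- "I help?"
	close(in)

	var got []segment
	forward(in, func(text string, end bool) {
		got = append(got, segment{Text: text, EndOfStream: end})
	})
	fmt.Printf("%+v\n", got)
}
```

The trade-off is a delay of one chunk before playback starts, in exchange for never having to send a separate empty final segment.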
Summary
In this chapter, we introduced the code structure of RustPBX Go and showed how to connect to RustPBX, send commands, and handle events.
We also briefly covered large model interaction and tool calling.
In the next chapter, we will add custom tools to implement a weather query Agent.