Rust with axum
We assume that your chatterd code base has accumulated code from at least the llmPrompt and chatter backend tutorials. If, in addition, you’ve also accumulated code from later tutorials, that’s fine.
Cargo.toml: more dependencies
Change to your chatterd folder and edit the file Cargo.toml
to add the 3rd-party library we will be using.
server$ cd ~/reactive/chatterd
server$ vi Cargo.toml
In Cargo.toml, add the following lines below the existing [dependencies] tag:
async-stream = "0.3.6" # may already be there if ndjson_yield_sse used
indexmap = { version = "2.12.0", features = ["serde"] }
If you haven’t done the llmChat tutorial or llmPlay project, also add:
futures = "0.3.31"
regex = "1.12.2"
and add “stream” to your reqwest line:
reqwest = { version = "0.12.24", features = ["json", "stream"] }
toolbox
Let us start by creating a toolbox to hold our tools. Create a new Rust file,
name it toolbox.rs:
server$ vi src/toolbox.rs
Put the following use imports at the top of the file:
use futures::future::{BoxFuture, FutureExt};
use indexmap::IndexMap;
use reqwest::Client;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::sync::LazyLock;
The contents of this file can be categorized into three purposes: tool/function definition, the toolbox itself, and tool use (or function calling).
Tool/function definition
Ollama tool schema: at the top of Ollama’s JSON tool definition is a JSON Object representing a tool schema. The tool schema is defined using nested JSON Objects and JSON Arrays. Add the full nested definitions of Ollama’s tool schema to your file:
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OllamaToolSchema {
    #[serde(rename = "type")]
    type_: String,
    function: OllamaToolFunction,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
struct OllamaToolFunction {
    name: String,
    description: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    parameters: Option<OllamaFunctionParams>,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
struct OllamaFunctionParams {
    #[serde(rename = "type")]
    type_: String,
    properties: HashMap<String, OllamaParamProp>,
    #[serde(skip_serializing_if = "Vec::is_empty")]
    required: Vec<String>,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
struct OllamaParamProp {
    #[serde(rename = "type")]
    type_: String,
    description: String,
    #[serde(rename = "enum")]
    #[serde(skip_serializing_if = "Option::is_none")]
    enum_: Option<Vec<String>>, // `enum` is keyword in Rust
}
Weather tool schema: in this tutorial, we have only one tool resident in the backend. Add the following tool definition to your file:
// Global variables in Rust require static initialization, but
// String and HashMap must be dynamically allocated on the heap,
// hence LazyLock, which is lazily initialized on first use.
static WEATHER_TOOL: LazyLock<OllamaToolSchema> = LazyLock::new(|| OllamaToolSchema {
    type_: String::from("function"),
    function: OllamaToolFunction {
        name: String::from("get_weather"),
        description: String::from("Get current temperature"),
        parameters: Some(OllamaFunctionParams {
            type_: String::from("object"),
            properties: HashMap::from([
                (
                    String::from("latitude"),
                    OllamaParamProp {
                        type_: String::from("string"),
                        description: String::from("latitude of location of interest"),
                        enum_: None,
                    },
                ),
                (
                    String::from("longitude"),
                    OllamaParamProp {
                        type_: String::from("string"),
                        description: String::from("longitude of location of interest"),
                        enum_: None,
                    },
                ),
            ]),
            required: vec![String::from("latitude"), String::from("longitude")],
        }),
    },
});
Weather tool function: we implement the get_weather tool as a getWeather() function that makes an API call to the free Open-Meteo weather service. Add the following nested struct definitions to hold Open-Meteo’s return result. For this tutorial, we’re only interested in the latitude, longitude, and temperature returned by Open-Meteo:
#[derive(Deserialize)]
struct Current {
    #[serde(rename = "temperature_2m")]
    temp: f64,
}

#[derive(Deserialize)]
struct OMeteoResponse {
    latitude: f64,
    longitude: f64,
    current: Current,
}
Here’s the definition of the getWeather() function:
pub async fn getWeather(client: &Client, argv: &Vec<String>) -> Result<Option<String>, String> {
    // Open-Meteo API doc: https://open-meteo.com/en/docs#api_documentation
    match client
        .get(format!("https://api.open-meteo.com/v1/forecast?latitude={}&longitude={}&current=temperature_2m&temperature_unit=fahrenheit",
            argv[0], argv[1]))
        .send().await {
        Ok(response) => {
            let ometeoResponse: OMeteoResponse = response.json().await.unwrap();
            Ok(Some(format!("Weather at lat: {}, lon: {} is {}ºF",
                ometeoResponse.latitude, ometeoResponse.longitude, ometeoResponse.current.temp)))
        },
        Err(err) => {
            Err(format!("Open-meteo: {}", err))
        }
    }
}
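As an optional aside, and purely as a sketch rather than part of the tutorial, you could drop a small test module at the bottom of toolbox.rs to eyeball the tool-definition JSON that WEATHER_TOOL serializes to and to confirm that our abridged OMeteoResponse parses an Open-Meteo-style reply. It assumes serde_json is already among your dependencies from the earlier tutorials (handlers.rs uses it), and the sample numbers are made up. Run the tests with cargo test -- --nocapture:

#[cfg(test)]
mod toolbox_tests {
    use super::*;

    #[test]
    fn weather_tool_serializes_to_ollama_schema() {
        // print the tool-definition JSON as it would be attached to an Ollama request
        println!("{}", serde_json::to_string_pretty(&*WEATHER_TOOL).unwrap());
    }

    #[test]
    fn parses_abridged_open_meteo_reply() {
        // abridged sample reply; fields we don't declare are simply ignored by serde
        let sample = r#"{ "latitude": 42.3, "longitude": -83.74, "current": { "temperature_2m": 68.5 } }"#;
        let reply: OMeteoResponse = serde_json::from_str(sample).unwrap();
        assert_eq!(reply.current.temp, 68.5);
    }
}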
The toolbox
Even though we have only one resident tool in this tutorial, we want a generalized architecture that can hold multiple tools and invoke the right tool dynamically. To that end, we’ve chosen to use a switch table (or jump table or, more fancily, service locator registry) as the data structure for our toolbox. We implement the switch table as a dictionary. The “keys” in the dictionary are the names of the tools/functions. Each “value” is a record containing the tool’s definition/schema and a pointer to the function implementing the tool. To send a tool as part of a request to Ollama, we look up its schema in the switch table and copy it to the request. To invoke a tool called by Ollama in its response, we look up the tool’s function in the switch table and invoke the function.
Add the following type for an async tool function and the record type containing a tool definition and the async tool function:
type ToolFunction = Box<
    dyn Fn(&Client, &Vec<String>) -> BoxFuture<'static, Result<Option<String>, String>> + Send + Sync,
>;

pub struct Tool {
    pub(crate) schema: OllamaToolSchema,
    function: ToolFunction,
}
Now create a switch-table toolbox and put the WEATHER_TOOL in it:
pub static TOOLBOX: LazyLock<HashMap<String, Tool>> = LazyLock::new(|| {
    HashMap::from([
        (
            String::from("get_weather"),
            Tool {
                schema: WEATHER_TOOL.clone(),
                function: Box::new(|client, argv| {
                    let client_cloned = client.clone();
                    let argv_cloned = argv.clone();
                    async move { getWeather(&client_cloned, &argv_cloned).await }.boxed()
                }),
            },
        ),
    ])
});
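Nothing in the toolbox is weather-specific. If you later add a second resident tool, registering it is just one more entry in the same HashMap::from([...]) call. Purely as a hypothetical sketch, with GET_TIME_TOOL and getTime() being placeholders you would have to define yourself in the same style as the weather tool:

        (
            String::from("get_time"),
            Tool {
                schema: GET_TIME_TOOL.clone(), // hypothetical schema, defined like WEATHER_TOOL
                function: Box::new(|client, argv| {
                    let client_cloned = client.clone();
                    let argv_cloned = argv.clone();
                    // hypothetical async function with the same signature as getWeather()
                    async move { getTime(&client_cloned, &argv_cloned).await }.boxed()
                }),
            },
        ),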
Tool use or function calling
Ollama tool call: Ollama’s JSON tool call comprises a JSON Object containing a nested JSON Object carrying the name of the function and the arguments to pass to it. Add these nested struct definitions representing Ollama’s tool call JSON to your file:
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OllamaToolCall {
    pub(crate) function: OllamaFunctionCall,
}

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OllamaFunctionCall {
    pub(crate) name: String,
    arguments: IndexMap<String, String>,
}
Tool invocation: finally, here’s the tool invocation function. We call this function to execute any tool call we receive in an Ollama response. It looks up the tool name in the toolbox. If the tool is resident, it runs the tool and returns the result; otherwise it returns None.
pub async fn toolInvoke(
    client: &Client,
    function: OllamaFunctionCall,
) -> Result<Option<String>, String> {
    if !TOOLBOX.contains_key(&function.name) {
        return Ok(None);
    }
    let argv: Vec<String> = function
        .arguments
        .into_values()
        .collect();
    (TOOLBOX[&function.name].function)(client, &argv).await
}
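To preview how the toolbox gets used: in the llmtools() handler we write next, each tool call in an Ollama response is deserialized into an OllamaToolCall and its function member is handed to toolInvoke(). Roughly, and only as an orientation sketch (the real version lives inside the handler’s SSE stream), the three possible outcomes look like this, where toolCall stands for one element of the response’s tool_calls array:

match toolInvoke(&client, toolCall.function).await {
    Err(err) => eprintln!("resident tool failed: {err}"),           // report error to client
    Ok(Some(result)) => println!("resident tool result: {result}"), // feed result back to Ollama
    Ok(None) => println!("not a resident tool"),                    // forward tool call to the device
}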
That concludes our toolbox definition. Save and exit the file.
handlers
Edit src/handlers.rs:
server$ vi src/handlers.rs
imports
First modify the use imports at the top of the file:
- add to the use serde_json block: from_slice, to_string,
- then add these new use lines:
  use async_stream::stream; // ndjson_yield_sse
  use tokio::{
      pin,
  }; // ndjson_yield_sse
  use tokio_postgres::Row;
- and below use crate::AppState; add:
  use crate::toolbox::{getWeather, toolInvoke, OllamaToolCall, OllamaToolSchema, TOOLBOX};

If you haven’t done the llmChat tutorial or llmPlay project,

- add this line inside the use axum block: response::sse::{Event, Sse},
- add this line inside the use std block: convert::Infallible,
- then add these new use lines:
  use futures::{Stream, StreamExt};
  use regex::Regex;
structs
Next add or update the following structs:
- if update, add a tool-calls field to your OllamaMessage struct:
  #[derive(Debug, Default, Clone, Serialize, Deserialize)]
  pub struct OllamaMessage {
      role: String,
      content: String,
      #[serde(rename = "tool_calls", skip_serializing_if = "Option::is_none")]
      toolCalls: Option<Vec<OllamaToolCall>>,
  }
  If you have llmchat() and llmplay() declared in your handlers.rs, you’d need to add toolCalls: None to the creation of OllamaMessage instances. There should be only one occurrence in each of llmchat() and llmplay().
- if update, add a tools field to your OllamaRequest:
  #[derive(Debug, Default, Clone, Serialize, Deserialize)]
  pub struct OllamaRequest {
      appID: String,
      model: String,
      messages: Vec<OllamaMessage>,
      stream: bool,
      #[serde(skip_serializing_if = "Option::is_none")]
      tools: Option<Vec<OllamaToolSchema>>,
  }
The OllamaResponse struct (unchanged if exists):
#[derive(Default, Deserialize)]
#[allow(dead_code)]
pub struct OllamaResponse {
    model: String,
    created_at: String,
    message: OllamaMessage,
    done: bool,
}
For the /weather testing API, add also the following struct:
#[derive(Deserialize)]
pub struct Location {
    lat: String,
    lon: String,
}
weather
Let’s implement the handler for the /weather API that we can use to
test our getWeather() function later:
pub async fn weather(
    State(appState): State<AppState>,
    ConnectInfo(clientIP): ConnectInfo<SocketAddr>,
    Json(loc): Json<Location>,
) -> Result<Json<Value>, (StatusCode, String)> {
    let weather = getWeather(
        &appState.client,
        &vec![loc.lat.to_string(), loc.lon.to_string()],
    )
    .await;
    match weather {
        Err(err) => Err(logServerErr(clientIP, err)),
        Ok(temperature) => {
            logOk(clientIP);
            Ok(Json(json!(temperature)))
        }
    }
}
llmtools
The underlying request/response handling of llmtools() is basically that of llmchat(); however, with all the modifications needed to support tool calling, it’s simpler to just write the llmtools() handler from scratch. We will name variables according to this scheme:
- camelCase for language-level data objects,
- snake_case for string version of data objects to be used with PostgreSQL or JSON, and, earlier,
- ALL_CAPS for immutable global toolbox and tool definitions.
To store the client’s conversation context/history with Ollama in the PostgreSQL
database, llmtools() first confirms that the client has sent an appID that can
be used to tag its entries in the database. Here’s the signature of llmtools().
The Json deserialization will check for the existence of appID and return an HTTP
error if it is absent:
pub async fn llmtools(
    State(appState): State<AppState>,
    ConnectInfo(clientIP): ConnectInfo<SocketAddr>,
    Json(mut ollamaRequest): Json<OllamaRequest>, // will check for appID
) -> Result<Sse<impl Stream<Item = Result<Event, Infallible>>>, (StatusCode, String)> {
    // retrieve client's tool(s)
}
Our goal here is to prepend all prior conversations between the client and Ollama as
context to the current prompt. The client’s appID allows us to identify its conversation
with Ollama stored in the PostgreSQL database—similar to how MCP tags JSON-RPC 2.0 messages
with a session ID. Once we confirm that the client has an appID,
we retrieve any tool definitions attached to the ollamaRequest carrying the prompt.
We will assemble these tools along with any tools the client may have previously sent
to Ollama, attached to an earlier prompt, and any tools resident on chatterd and attach
them all to the contextualized prompt request we will POST to Ollama. Replace // retrieve
client's tool(s) with:
// convert tools from client as JSON string (client_tools) and save to db;
// prepare ollama_request for re-use to be sent to Ollama:
// clear tools in request, to be populated later
let mut client_tools = String::new();
if let Some(clientTools) = ollamaRequest.tools {
    // has device tools
    // must marshal to string to store to db
    client_tools = to_string(&clientTools).unwrap_or_default();
    // reset tools, to be populated with
    // accumulated tools below, without duplicates
    ollamaRequest.tools = None;
}

// insert into DB
Then we insert the current prompt into the database, adding to the client’s conversation history with Ollama. As shown in the example in the Tool definition JSON section, the client’s current prompt could comprise multiple elements in the messages array of the ollamaRequest, but the tools will reside in a single tools array next to the messages array. When there are multiple elements in an ollamaRequest, we want to insert the tools only once. Below we have chosen to insert the tools only with the first element of the messages array. Replace the comment // insert into DB with the following code:
let chatterDB = appState
    .pgpool
    .get()
    .await
    .map_err(|err| logServerErr(clientIP, err.to_string()))?;

// insert each message into the database
// insert client_tools only with the first message:
// reset it to empty after first message.
let messages = ollamaRequest.messages;
for msg in messages {
    chatterDB
        .execute(
            "INSERT INTO chatts (username, message, id, appID, toolschemas) \
             VALUES ($1, $2, gen_random_uuid(), $3, $4)",
            &[&msg.role, &msg.content, &ollamaRequest.appID, &client_tools], //
        )
        .await
        .map_err(|err| logClientErr(clientIP, StatusCode::NOT_ACCEPTABLE, err.to_string()))?;
    // store device's tools only once
    client_tools = String::new();
}

// assemble resident tools
To prepare the full assemblage of tools to send to Ollama, we first attach all the
tools resident on chatterd. Replace // assemble resident tools with:
// append all of chatterd's resident tools to ollamaRequest
ollamaRequest.tools = Some(
    TOOLBOX
        .values()
        .map(|tool| tool.schema.clone())
        .collect::<Vec<OllamaToolSchema>>(),
);

// reconstruct ollamaRequest
Then we retrieve the client’s conversation history, including the just-inserted current prompt as the last entry, and put each message as a separate element in the ollamaRequest.messages array, taking care to accumulate any tool(s) present into the ollamaRequest.tools array instead. Replace // reconstruct ollamaRequest with:
// reconstruct ollamaRequest to be sent to Ollama:
// - add context: retrieve all past messages by appID,
//   incl. the one just received, and attach them to
//   ollamaRequest
//   - convert each back to OllamaMessage and
//   - insert it into ollamaRequest
// - add each message's clientTools to chatterd's resident tools
//   already copied to ollamaRequest.tools.
ollamaRequest.messages = chatterDB
    .query(
        "SELECT username, message, toolcalls, toolschemas FROM chatts WHERE appID = $1 ORDER BY time ASC",
        &[&ollamaRequest.appID],
    )
    .await
    .map_err(|err| logServerErr(clientIP, err.to_string()))?
    .into_iter()
    .map(|row| OllamaMessage::fromRow(&row, &mut ollamaRequest.tools))
    .collect();

// NDJSON to SSE stream transformation
Put the fromRow static method for OllamaMessage outside your llmtools() function, for example right below (and also outside) the definition of pub struct OllamaMessage {} near the top of the file:
impl OllamaMessage {
    fn fromRow(row: &Row, reqTools: &mut Option<Vec<OllamaToolSchema>>) -> OllamaMessage {
        if let Some(toolschemas) = row.get::<usize, Option<String>>(3)
            && let Some(clientTools) =
                from_slice::<Option<Vec<OllamaToolSchema>>>(toolschemas.as_ref()).unwrap_or(None)
        {
            *reqTools = Some(match reqTools.clone() {
                Some(mut tools) => {
                    // append tools to ollama_request.tools
                    tools.extend(clientTools);
                    tools
                }
                None => clientTools,
            })
        }
        OllamaMessage {
            role: row.get(0),
            content: row.get(1),
            toolCalls: if let Some(toolcalls) = row.get::<usize, Option<String>>(2) {
                // has device tools
                // must deserialize to type to append device tools to ollamaRequest.tools
                from_slice::<std::option::Option<Vec<OllamaToolCall>>>(toolcalls.as_ref())
                    .unwrap_or(None)
            } else {
                None
            },
        }
    }
}
ndjson_yield_sse
As we know, Ollama’s response is in the form of an NDJSON stream, which we transform into a stream of SSE events using the ndjson_yield_sse stream. We pass this stream to axum’s Sse constructor at the end of the llmtools() handler. To satisfy Rust’s memory borrow checker, we first make a clone of the Postgres pool, pgpool, to be used inside ndjson_yield_sse.
In ndjson_yield_sse, we first declare an accumulator variable, full_response, to assemble the reply tokens Ollama streams to us. To accommodate resident-tool calls, we use a flag, sendNewPrompt, to indicate to our stream generator whether:
- to start a resident-tool call connection to Ollama and continue yielding results to the client or
- to conclude streaming to the connection.
While sendNewPrompt is true (it is initialized to true), we open a new POST connection to Ollama and send it the ollamaRequest message. Add below // NDJSON to SSE stream transformation:

let pgpool = appState.pgpool.clone();
let ndjson_yield_sse = stream! {
    let mut full_response = String::new();
    let mut sendNewPrompt = true;
    while sendNewPrompt {
        sendNewPrompt = false; // assume no resident-tool call
        match appState.client
            .post(format!("{OLLAMA_BASE_URL}/chat"))
            .json(&ollamaRequest) // convert the request to JSON
            .send() // send request to Ollama
            .await {
            Err(err) => {
                yield Ok(Event::default().event("error")
                    .data(json!({ "error": err.to_string() }).to_string()))
            },
            Ok(response) => {
                // handle Ollama response
            } // Ok(response)
        } // match appState.send().await
    } // while sendNewPrompt
}; // ndjson_yield_sse

logOk(clientIP);
Ok(Sse::new(ndjson_yield_sse))
Why not use ndjson_map_sse?
In llmchat() we chose to use ndjson_map_sse to generate an SSE stream because Rust
is still lacking a stable, general purpose generator, along with its yield operator.
Instead, ndjson_yield_sse uses the async-stream crate which has its own custom,
specialized implementation of yield.
For this tutorial, our design of the llmtools() handler injects resident-tool call results into the existing Ollama stream originating from the device. To use ndjson_map_sse would require each resident-tool call to form a separate stream to be merged with the existing end-to-end stream, a potentially complex operation. It would likely also involve extra PostgreSQL database retrievals. We will continue to study this alternative for future use.
We convert each NDJSON line to a language-level type, OllamaResponse in this case, with semantically meaningful structure and fields that we can more easily manipulate than a linear byte stream or string. If the conversion is unsuccessful, leaving the model property of the type empty, we return an SSE error event and move on to the next NDJSON line. Otherwise, we append the content of this OllamaResponse.message to the full_response accumulator.
Replace // handle Ollama response with:
let byte_stream = response.bytes_stream();
pin!(byte_stream); // Pin the stream so we can loop over it
let mut tool_calls = String::new();
let mut tool_result = String::new();
while let Some(chunk) = byte_stream.next().await {
    match chunk {
        Err(err) => {
            // error from Ollama
            yield Ok(Event::default().event("error")
                .data(json!({ "error": err.to_string() }).to_string()))
        }
        Ok(bytes) => {
            let line = String::from_utf8_lossy(&bytes).replace('\n', "");
            // deserialize each line into OllamaResponse
            let ollamaResponse: OllamaResponse = from_slice(&bytes).unwrap_or_default();
            if ollamaResponse.model.is_empty() {
                // didn't receive an OllamaResponse, report to client as error
                yield Ok(Event::default().event("error")
                    .data(line.replace("\\\"", "'")));
                continue
            }
            // append response token to full assistant message
            full_response.push_str(&ollamaResponse.message.content);
            // check for tool call
        } // Ok(bytes)
    } // match chunk
} // byte_stream.next()

// insert full response into db
The tool call field in OllamaResponse is an array, even though it looks like Qwen3 on Ollama is presently limited to making only one tool call per HTTP round. We loop through the array and, for each tool call, try to invoke its function by calling toolInvoke() from our toolbox. If there is no tool call, we simply encode the full NDJSON line into an SSE message event, yield it as an element of the SSE stream, and move on to the next NDJSON line, as we do in llmchat. Replace // check for tool call with:
// is there a tool call?
if let Some(toolCalls) = ollamaResponse.message.toolCalls {
    // convert toolCalls to JSON string (tool_calls) to be saved to db
    tool_calls = to_string(&toolCalls).unwrap_or_default();
    for toolCall in toolCalls {
        if toolCall.function.name.is_empty() {
            continue // LLM miscalled
        }
        let toolResult = toolInvoke(&appState.client, toolCall.function).await;
        // handle tool result
    } // for toolCall
} else {
    // no tool call, send NDJSON line as SSE data line
    yield Ok(Event::default().data(&line));
}
If the tool is resident, toolInvoke() returns the result of the tool call. There are three possible outcomes from the call to toolInvoke():
- the tool is resident but the call was unsuccessful and returns an error,
- the tool is resident and the call was successful, or
- the tool is non-resident.

If the result indicates that an error has occurred, we are dealing with the first outcome above. We simply report the error to the client and move on to the next NDJSON line. If there’s no error but toolInvoke() returns a null result, this indicates that the tool is non-resident. We forward the tool call to the client as a tool_calls SSE event. Otherwise, we prepare the result to be saved to PostgreSQL and return the result to Ollama.
Replace // handle tool result with:
match toolResult {
    Err(err) => {
        // outcome 1: tool resident but had error
        yield Ok(Event::default().event("error")
            .data(json!({ "error": err }).to_string()))
    },
    Ok(result) => {
        if let Some(res) = result {
            // outcome 2: tool call is resident and no error
            // convert toolResult to JSON string (tool_result)
            // to be saved to db
            if !tool_result.is_empty() {
                tool_result.push(' ');
                tool_result.push_str(res.as_str());
            } else {
                tool_result = res.clone();
            }
            // create new OllamaMessage with tool result
            // to be sent back to Ollama
            let toolresultMsg = OllamaMessage {
                role: "tool".to_string(),
                content: res,
                toolCalls: None,
            };
            ollamaRequest.messages.push(toolresultMsg);
            // send result back to Ollama
            sendNewPrompt = true;
        } else {
            // outcome 3: tool non resident, forward
            // to device as 'tool_calls' SSE event
            yield Ok(Event::default().event("tool_calls")
                .data(&line))
        }
    } // Ok(result)
} // match toolResult
When we reach the end of the NDJSON stream, we insert the full Ollama response and any resident tool calls and their results into the PostgreSQL database as the assistant’s reply. Any error in the insertion yields an SSE error event sent to the client. Replace // insert full response into db with:
let wsRegex = Regex::new(r"[\s]+").unwrap();
match pgpool.get().await {
    Err(err) => {
        yield Ok(Event::default().event("error")
            .data(json!({ "error": err.to_string() }).to_string()));
    },
    Ok(chatterDB) => {
        // save full response, including tool call(s), to db,
        // to form part of next prompt's history
        if let Err(err) = chatterDB.execute(
            "INSERT INTO chatts (username, message, id, appid, toolcalls) \
             VALUES ('assistant', $1, gen_random_uuid(), $2, $3)",
            &[&wsRegex.replace_all(&*full_response, " "), &ollamaRequest.appID, &tool_calls],
        ).await {
            yield Ok(Event::default().event("error")
                .data(json!({ "error": err.to_string() }).to_string()))
        }
        // if there were resident tool call(s), save result(s)
        if sendNewPrompt && let Err(err) = chatterDB.execute(
            "INSERT INTO chatts (username, message, id, appid) \
             VALUES ('tool', $1, gen_random_uuid(), $2)",
            &[&tool_result, &ollamaRequest.appID],
        ).await {
            yield Ok(Event::default().event("error")
                .data(json!({ "error": err.to_string() }).to_string()))
        }
    } // Ok(chatterDB)
} // match pgpool
We’re done with handlers.rs! Save and exit the file.
main.rs package
Edit src/main.rs:
server$ vi src/main.rs
Find mod handlers; and add below it:
mod toolbox;
Find the router = Router::new() instantiation statement and add these routes right
after the route for /llmprompt:
.route("/llmtools", post(handlers::llmtools))
.route("/weather", get(handlers::weather))
We’re done with main.rs. Save and exit the file.
Build and test run
To build your server:
server$ cargo build --release
As before, it will take some time to download and build all the 3rd-party crates. Be patient.
Linking error with cargo build?
When running cargo build --release, if you see:
error: linking with cc failed: exit status: 1
note: collect2: fatal error: ld terminated with signal 9 [Killed]
below a long list of object files, try running cargo build --release again. It usually works the second time around, when it will have less remaining linking to do. If the error persists, please talk to the teaching staff.
Rust is a compiled language, like C/C++ and unlike Python, which is an interpreted language. This means you must run cargo build each and every time you make changes to your code, for the changes to show up in your executable.
To run your server:
server$ sudo ./chatterd
# Hit ^C to end the test
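While the server is running, you can optionally sanity-check the /weather route from another terminal with curl. The URL below is only a placeholder: substitute the scheme, host, and port you normally use to reach your chatterd, and any coordinates you like:

server$ curl -k -X GET https://localhost/weather \
    -H "Content-Type: application/json" \
    -d '{ "lat": "42.29", "lon": "-83.71" }'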
Return to the Testing your /llmtools API section.
| Prepared by Xin Jie ‘Joyce’ Liu, Chenglin Li, and Sugih Jamin | Last updated August 26th, 2025 |