Project 3: llmAction Back End
Cover Page
DUE Mon, 12/08, 6 pm
For the back end, regardless of your stack of choice, you will be using functions
from both the llmTools and Signin tutorials.
Toolbox
To add ollama_cli as a tool, first add a global variable OLLAMA_TOOL, whose definition
follows its counterpart WEATHER_TOOL from the llmTools tutorial. The current
implementation of toolInvoke() uses positional parameters in calling the tool function,
i.e., arguments are matched to their intended parameters by their positions in the list
of arguments. Unfortunately, when making tool call, the model qwen3 seems to list
arguments in alphanumeric order, instead of the provided parameter order. For example,
parameters “cmd” and “arg” will be listed with the parameter “arg” first, followed by “cmd”,
which is incompatible with our use of positional parameter. We will update toolInvoke()
to use key-value pairs to pass along a dictionary/map of parameters along with their values
in future iteration, for now, please name your tool parameters in alphanumeric order
according to the parameter order intended. For example, “arg1” would carry the authentication
token, chatterID, “arg2” would carry the ollama command “ls”, “pull”, “rm”, etc.
and an optional “arg3” would carry the model name when the command is “pull” or “rm”.
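For reference, here is a minimal Python sketch of what OLLAMA_TOOL might look like, assuming your WEATHER_TOOL follows the JSON-schema-style tool definition accepted by Ollama's chat API; the description strings are placeholders and should be adapted to match your WEATHER_TOOL's format.

```python
# Sketch only: field layout assumed to mirror WEATHER_TOOL from the llmTools tutorial.
OLLAMA_TOOL = {
    "type": "function",
    "function": {
        "name": "ollama_cli",
        "description": "Run an ollama CLI command (ls, pull, rm, ...) on the server",
        "parameters": {
            "type": "object",
            "properties": {
                # parameter names chosen so their alphanumeric order matches
                # the intended positional order (see above)
                "arg1": {"type": "string", "description": "authentication token (chatterID)"},
                "arg2": {"type": "string", "description": "ollama command: ls, pull, or rm"},
                "arg3": {"type": "string", "description": "model name, needed for pull and rm"},
            },
            "required": ["arg1", "arg2"],
        },
    },
}
```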
We will add the ollamaCli() function in the next subsection. Assuming we already have
ollamaCli() defined, add an entry to the TOOLBOX dictionary/map to register the
ollama_cli tool, again along the lines of the get_weather entry for TOOLBOX in
the llmTools tutorial. Note that the Go implementation additionally requires that the
tool’s argument order be listed in the TOOLBOX.
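A minimal sketch of the registration in Python, assuming TOOLBOX simply maps each tool's name to its function as the get_weather entry does in the llmTools tutorial (the Go version would additionally list the argument order):

```python
# get_weather comes from the llmTools tutorial; ollamaCli() is added in the next subsection.
TOOLBOX = {
    "get_weather": get_weather,
    "ollama_cli": ollamaCli,
}
```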
ollamaCli() function
We provide an implementation of ollamaCli() below that you can copy and paste into your
toolbox source file. Our ollamaCli() calls checkAuth() to verify the validity of
the chatterID passed to it. If validation succeeds, checkAuth() returns a “no error”
indication. If validation fails, checkAuth() returns an error containing the message
“401 Unauthorized: chatterID verification failed, probably expired and token expunged”.
If the validation process itself fails due to any other kind of error, checkAuth()
returns that error. Note that this is not a very secure validation process: the
chatterID itself is not attributed to any user, for example. In the attached code,
we provide the function signature of checkAuth() that ollamaCli() expects. You
can consult and adapt the postauth() function from the Signin tutorial to
implement checkAuth().
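As one possible shape, here is a hedged Python sketch of checkAuth(); the database library, connection string, and table/column names are assumptions, and the actual signature and return convention should follow the attached code and your Signin schema.

```python
import psycopg  # assumption: the Signin database is accessed via psycopg

def checkAuth(chatterID: str) -> str | None:
    """Return None if chatterID is valid, otherwise an error message."""
    try:
        # table and column names are hypothetical; adapt from postauth()
        with psycopg.connect("dbname=chatterdb") as conn, conn.cursor() as cur:
            cur.execute("SELECT chatterid FROM chatter WHERE chatterid = %s", (chatterID,))
            if cur.fetchone() is None:
                return ("401 Unauthorized: chatterID verification failed, "
                        "probably expired and token expunged")
            return None
    except Exception as err:
        # any error in the validation process itself is returned to the caller
        return str(err)
```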
Once the chatterID is verified, ollamaCli() forks a process to run the ollama
command. If the return code from the ollama command indicates no error (return code 0),
we send back the stdout output of the command. However, if the stdout is empty, we
construct and return a string informing the model that the command has succeeded. The
model needs a more explicit and verbose confirmation message than an empty string. If an
error has occurred and an error message is output on stderr, we return the stderr message
to the model as a normal output message. The model will parse the message and recognize it
as an error on its own. Sometimes a command prints progress notifications on stderr. These
messages may contain words that the model could misinterpret as an indication of failure. Hence,
unless the command’s return code is non-zero, we do not forward any stderr messages to the model.
An implementation of ollamaCli() that you can add to your toolbox source file:
| Go | Python | Rust | TypeScript |
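The provided implementations are in the tabs above; purely for orientation, the following Python sketch mirrors the behavior just described. The positional-argument signature and checkAuth()'s return-None-on-success convention are assumptions.

```python
import subprocess

def ollamaCli(*args: str) -> str:
    # args[0]: chatterID, args[1]: ollama command, args[2] (optional): model name
    err = checkAuth(args[0])            # verify the chatterID first
    if err is not None:
        return err

    # fork a process to run the ollama command
    proc = subprocess.run(["ollama", args[1], *args[2:]],
                          capture_output=True, text=True)
    if proc.returncode == 0:
        # on success return stdout; if empty, give the model an explicit,
        # verbose confirmation instead of an empty string
        return proc.stdout if proc.stdout.strip() else f"ollama {args[1]} completed successfully"
    # on failure, return the stderr message as a normal output message;
    # the model will recognize it as an error on its own
    return proc.stderr
```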
llmTools
The llmTools() function as implemented in the llmTools tutorial has two shortcomings:
- line reading: in both the Rust and TypeScript versions the stream reader may return more than one line of NDJSON at a time, and it may return a partial last line. The Go version has neither issue, but it is limited to reading 64 KB at a time. This is not a problem for us right now, but could become an issue if we have to handle large image and video data. Python’s line iterator has neither of these issues.
- tool call context: in returning a tool call result to the model, the model must see its own tool call in the context leading up to the tool result. Otherwise the model may not be able to tell consistently that the result is for the tool call. We observed this even after tagging the JSON message carrying the tool result with a tool_name field. When tools are called one after another, all interaction history leading up to the current tool call, including prior tool call(s) and their corresponding tool result(s), must be included in the prompt context (see the sketch after this list); otherwise, the model may lose its place in its thinking process. Returning the thinking process in the context also helps the model orient itself in its plan, as opposed to having to reconstruct the plan all over again, although it is not obvious whether the model can more efficiently re-position itself given its plan in context or reconstruct the plan from the original prompt and intermediate tool call results.
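To make the context requirement concrete, here is a sketch of the message-history shape, with field names along the lines of Ollama's chat API; the exact structure used by the tutorial code may differ.

```python
# the assistant's own tool call must precede the tool result in the history
messages = [
    {"role": "system", "content": "You can manage models with the ollama_cli tool ..."},
    {"role": "user", "content": "List the installed models."},
    {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"name": "ollama_cli",
                      "arguments": {"arg1": "chatterID", "arg2": "ls"}}},
    ]},
    # the result is tagged with tool_name and follows the call that produced it
    {"role": "tool", "tool_name": "ollama_cli", "content": "NAME  ID  SIZE  MODIFIED ..."},
]
```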
We provide an updated implementation of llmTools() below that you can use to replace the function
of the same name from the llmTools tutorial. The updated llmTools() addresses both of the
above issues. We also add a “system” prompt to instruct the model about a few of Ollama’s commands.
| Go | Python | Rust | TypeScript |
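For intuition on the line-reading fix (this is not the provided code above), a generic buffering approach accumulates chunks, emits only complete NDJSON lines, and carries any partial trailing line over to the next chunk; a Python sketch:

```python
import json

def ndjson_lines(chunks):
    """Yield parsed JSON objects from an iterable of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # a chunk may contain several lines and/or end mid-line
        *complete, buffer = buffer.split(b"\n")
        for line in complete:
            if line.strip():
                yield json.loads(line)
    if buffer.strip():      # trailing line without a final newline
        yield json.loads(buffer)
```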
Testing
As with the llmTools tutorial, you can test your implementation of ollamaCli(),
without the HITL guardrail, by adding an API endpoint that calls the tool
directly. A full test of the tool with HITL would have to wait until your frontend is implemented.
As usual, you can use either a graphical tool such as Postman or a CLI tool such as curl to test.
ollama API
In your main source file, add an /ollama HTTP POST API endpoint with ollama() as its handler.
Then in the handlers source file, create an OllamaCmd struct/class with three properties: arg1,
arg2, and arg3. The first two properties should be of type string and the last one an
optional string. The ollama() handler would deserialize the body of the HTTP request it
is given into an instance of OllamaCmd, assemble the three properties of OllamaCmd into
an array of strings, use the array to call ollamaCli(), and return the result of the call
as an HTTP response. Except in Go, your handler would likely have to explicitly import
ollamaCli() from your toolbox. Mock the checkAuth() in your toolbox to always return no error.
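A minimal sketch of the endpoint and handler, assuming a FastAPI-based Python server and a toolbox module named toolbox; your stack and module names from the tutorials may differ. For this direct test, the mocked checkAuth() in your toolbox can simply return the no-error indication unconditionally.

```python
from typing import Optional
from fastapi import FastAPI
from pydantic import BaseModel
from toolbox import ollamaCli   # explicit import assumed; not needed in Go

app = FastAPI()

class OllamaCmd(BaseModel):
    arg1: str                   # chatterID
    arg2: str                   # ollama command
    arg3: Optional[str] = None  # model name, when needed

@app.post("/ollama")
async def ollama(cmd: OllamaCmd):
    # assemble the properties into an array of strings and hand it to ollamaCli()
    args = [cmd.arg1, cmd.arg2] + ([cmd.arg3] if cmd.arg3 else [])
    return ollamaCli(*args)
```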
Then in Postman/curl, send an HTTP POST to your /ollama API endpoint with the following JSON body:
{
"arg1": "chatterID",
"arg2": "ls"
}
which should return a table listing the models available on Ollama.
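With curl, the request might look like the following, where YOUR_SERVER is a placeholder for your server's address and chatterID should be replaced with a valid token (or anything, if checkAuth() is mocked):

```
curl -X POST https://YOUR_SERVER/ollama \
     -H "Content-Type: application/json" \
     -d '{ "arg1": "chatterID", "arg2": "ls" }'
```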
Once your frontend is completed, you can do end-to-end testing; see the End-to-end Testing section of the spec.
And with that, you’re done with your backend. Congrats!
Back-end submission guidelines
As usual, git commit your changes to your chatterd source files with the
commit message, "pa3 back end", and push your changes to your git repo.
WARNING: You will not get full credit if your front end is not set
up to work with your back end!
Every time you rebuild your Go or Rust server or make changes to either of your
JavaScript or Python files, you need to restart chatterd:
server$ sudo systemctl restart chatterd
Leave your chatterd running until you have received your assignment grade.
TIP:
server$ sudo systemctl status chatterd
is your BEST FRIEND in debugging your server. If you get an HTTP error code 500 Internal Server Error or if you just don’t know whether your HTTP request has made it to the server, the first thing to do is run sudo systemctl status chatterd on your server and study its output.
If you’re running a Python server, it also shows error messages from your Python code, including any debug printouts from your code. The command systemctl status chatterd is by far the most useful go-to tool to diagnose your back-end server problem.
| Prepared by Xin Jie ‘Joyce’ Liu, Chenglin Li, Sugih Jamin | Last updated: November 22nd, 2025 |