Project 3: llmAction Back End

Cover Page

DUE Mon, 12/08, 6 pm

For the back end, regardless of your stack of choice, you will be using functions from both the llmTools and Signin tutorials.

Toolbox

To add ollama_cli as a tool, first add a global variable OLLAMA_TOOL, whose definition follows its counterpart WEATHER_TOOL from the llmTools tutorial. The current implementation of toolInvoke() uses positional parameters when calling the tool function, i.e., arguments are matched to their intended parameters by their positions in the list of arguments. Unfortunately, when making a tool call, the model qwen3 seems to list arguments in alphanumeric order instead of the provided parameter order. For example, the parameters “cmd” and “arg” would be listed with “arg” first, followed by “cmd”, which is incompatible with our use of positional parameters. We will update toolInvoke() to pass parameters as a dictionary/map of key-value pairs in a future iteration; for now, please name your tool parameters so that their alphanumeric order matches the intended parameter order. For example, “arg1” would carry the authentication token, chatterID, “arg2” would carry the ollama command (“ls”, “pull”, “rm”, etc.), and an optional “arg3” would carry the model name when the command is “pull” or “rm”.
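
For reference, the sketch below shows one possible shape of OLLAMA_TOOL in Python. It assumes WEATHER_TOOL follows the standard JSON-schema tool format that Ollama’s chat API accepts; mirror whatever structure your WEATHER_TOOL from the llmTools tutorial actually uses.

OLLAMA_TOOL = {
    "type": "function",
    "function": {
        "name": "ollama_cli",
        "description": "Run an ollama CLI command (ls, pull, rm, ...) on the server",
        "parameters": {
            "type": "object",
            "properties": {
                # parameters are named arg1..arg3 so that alphanumeric order matches the intended positional order
                "arg1": {"type": "string", "description": "authentication token (chatterID)"},
                "arg2": {"type": "string", "description": "ollama command: ls, pull, or rm"},
                "arg3": {"type": "string", "description": "model name; required when the command is pull or rm"},
            },
            "required": ["arg1", "arg2"],
        },
    },
}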

We will add the ollamaCli() function in the next subsection. Assuming we already have ollamaCli() defined, add an entry to the TOOLBOX dictionary/map to register the ollama_cli tool, again along the lines of the get_weather entry for TOOLBOX in the llmTools tutorial. Note that the Go implementation requires that the tool’s argument order also be listed in the TOOLBOX.
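
For example, if your Python TOOLBOX is a dictionary mapping tool names to their functions (mirror whatever shape the get_weather entry has in your llmTools code), registering the new tool could look like the sketch below; the function names are assumptions.

TOOLBOX = {
    "get_weather": getWeather,  # existing entry from the llmTools tutorial
    "ollama_cli": ollamaCli,    # new entry; ollamaCli() is defined in the next subsection
}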

ollamaCli() function

We provide an implementation of ollamaCli() below that you can copy and paste into your toolbox source file. Our ollamaCli() calls checkAuth() to verify the validity of the chatterID passed to it. If validation succeeds, checkAuth() returns a “no error” indication. If validation fails, checkAuth() returns an error containing the message “401 Unauthorized: chatterID verification failed, probably expired and token expunged”. If the validation process itself fails due to any kind of error, checkAuth() returns that error. Note that this is not a very secure validation process: the chatterID itself is not attributed to any user, for example. In the attached code, we provide the function signature of checkAuth() that ollamaCli() expects. You can consult and adapt the postauth() function from the Signin tutorial to implement checkAuth().
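
The attached code defines the exact checkAuth() signature; the sketch below is only a hypothetical Python illustration of the behavior described above, with an in-memory set standing in for wherever your Signin code actually records issued chatterIDs.

# Hypothetical stand-in for wherever your Signin code stores issued chatterIDs.
VALID_CHATTER_IDS: set[str] = set()

def checkAuth(chatterID: str) -> str | None:
    """Sketch only: return None if chatterID is valid, an error string otherwise."""
    try:
        # replace this membership test with your real lookup, e.g., a database query
        valid = chatterID in VALID_CHATTER_IDS
    except Exception as err:
        return str(err)  # the validation process itself failed; return that error
    if not valid:
        return ("401 Unauthorized: chatterID verification failed, "
                "probably expired and token expunged")
    return None  # "no error" indication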

Once the chatterID is verified, ollamaCli() forks a process to run the ollama command. If the return code from the ollama command indicates no error (return code 0), we send back the stdout output of the command. However, if stdout is empty, we construct and return a string informing the model that the command has succeeded; the model needs a more explicit and verbose confirmation than an empty string. If an error has occurred and an error message is output on stderr, we return the stderr message to the model as a normal output message; the model will parse the message and recognize it as an error on its own. Sometimes a command prints progress notifications on stderr, and these messages may contain words that the model could misinterpret as an indication of failure. Hence, unless the command’s return code is non-zero, we do not forward any stderr messages to the model.

Implementation of ollamaCli() you can add to your toolbox source file: | Go | Python | Rust | TypeScript |
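
For orientation only (use the provided implementations above), here is a minimal Python sketch of the behavior just described, assuming checkAuth() returns None on success and an error string otherwise, and that ollamaCli() receives its arguments as an array of strings as described in the ollama API section below.

import subprocess

def ollamaCli(args: list[str]) -> str:
    """Sketch only: args = [chatterID, ollama command, optional model name]."""
    err = checkAuth(args[0])  # checkAuth() as sketched above
    if err is not None:
        return err
    # fork a process to run, e.g., "ollama ls" or "ollama pull <model>"
    proc = subprocess.run(["ollama"] + args[1:], capture_output=True, text=True)
    if proc.returncode != 0:
        # on failure, forward stderr to the model as a normal output message
        return proc.stderr
    if proc.stdout.strip():
        return proc.stdout
    # empty stdout: give the model an explicit, verbose confirmation instead
    return f"ollama {' '.join(args[1:])} completed successfully"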

llmTools

The llmTools() function as implemented in the llmTools tutorial has two shortcomings:

  1. line reading: in both the Rust and TypeScript versions, the stream reader may return more than one line of NDJSON at a time, and it may return a partial last line. The Go version has neither issue, but it is limited to reading 64 KB at a time. This is not a problem for us right now, but could become an issue if we have to handle large image and video data. Python’s line iterator has neither of these issues.
  2. tool call context: in returning a tool call result to the model, the model must see its own tool call in the context leading up to the tool result; otherwise the model may not be able to tell consistently that the result is for that tool call. We observed this even after tagging the JSON message carrying the tool result with a tool_name field. When tools are called one after another, all interaction history, including prior tool call(s) and their corresponding tool result(s) leading up to the current one, must be included in the prompt context; otherwise the model may lose its place in its thinking process (see the sketch after this list). Returning the thinking process in the context also helps the model orient itself in its plan, as opposed to having to reconstruct the plan all over again, although it is not obvious whether the model can more efficiently re-position itself given its plan in context or reconstruct the plan from the original prompt and intermediate tool call results.
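
As a concrete, hypothetical illustration of point 2, the context sent back along with a tool result needs to include the model’s own tool call, and any earlier tool call/result pairs, in order. The sketch below uses Ollama’s chat message format; the prompt text, chatterID, and model name are placeholders.

# placeholders for illustration
SYSTEM_PROMPT = "You can manage models on the server with the ollama_cli tool."
chatterID = "chatterID"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Pull the llama3.2 model"},
    # the model's own tool call must remain in the context ...
    {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"name": "ollama_cli",
                      "arguments": {"arg1": chatterID, "arg2": "pull", "arg3": "llama3.2"}}},
    ]},
    # ... immediately followed by its result, tagged with the tool's name
    {"role": "tool", "tool_name": "ollama_cli",
     "content": "ollama pull llama3.2 completed successfully"},
    # earlier tool call/result pairs (and any returned thinking) stay in this list
    # as further tool calls are chained onto the conversation
]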

We provide an updated implementation of llmTools() below that you can use to replace the function of the same name from the llmTools tutorial. The updated llmTools() addresses both of the above issues. We also add a “system” prompt to instruct the model about a few of Ollama’s commands. | Go | Python | Rust | TypeScript |

Testing

As with the llmTools tutorial, you can test your implementation of ollamaCli(), without the HITL guardrail, by adding an API endpoint that calls the tool directly. A full test of the tool with HITL will have to wait until your frontend is implemented.

As usual, you can use either a graphical tool such as Postman or a CLI tool such as curl to test.

ollama API

In your main source file, add an /ollama HTTP POST API endpoint with ollama() as its handler. Then in the handlers source file, create an OllamaCmd struct/class with three properties: arg1, arg2, and arg3. The first two properties should be of type string and the last one an optional string. The ollama() handler would deserialize the body of the HTTP request it is given into an instance of OllamaCmd, assemble the three properties of OllamaCmd into an array of strings, use the array to call ollamaCli(), and return the result of the call as an HTTP response. Except in Go, your handler would likely have to explicitly import ollamaCli() from your toolbox. Mock the checkAuth() in your toolbox to always return no error.
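
The sketch below shows one way this could look in Python, using FastAPI and pydantic as stand-ins for whatever framework your tutorials use; the import path for ollamaCli() is an assumption, so adapt the routing and imports to your own stack.

from fastapi import FastAPI
from pydantic import BaseModel

from toolbox import ollamaCli  # hypothetical import path for your toolbox source file

app = FastAPI()

class OllamaCmd(BaseModel):
    arg1: str
    arg2: str
    arg3: str | None = None  # optional: model name for pull/rm

@app.post("/ollama")
async def ollama(cmd: OllamaCmd):
    # assemble the properties into an array of strings, dropping arg3 if absent
    args = [cmd.arg1, cmd.arg2] + ([cmd.arg3] if cmd.arg3 is not None else [])
    return ollamaCli(args)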

Then in Postman/curl, send an HTTP POST to your /ollama API endpoint with the following JSON body:

{
    "arg1": "chatterID",
    "arg2": "ls"
}

which should return a table listing the models available on your Ollama server.
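
With curl, for example, the equivalent request could look like the following, where YOUR_SERVER is a placeholder for your server’s host name and the URL should match however your chatterd is served:

curl -X POST https://YOUR_SERVER/ollama -H "Content-Type: application/json" -d '{ "arg1": "chatterID", "arg2": "ls" }'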

Once your frontend is completed, you can do end-to-end testing; see the End-to-end Testing section of the spec.

And with that, you’re done with your backend. Congrats!

Back-end submission guidelines

As usual, git commit your changes to your chatterd source files with the commit message, "pa3 back end", and push your changes to your git repo.

:point_right:WARNING: You will not get full credit if your front end is not set up to work with your back end!

Every time you rebuild your Go or Rust server or make changes to either of your JavaScript or Python files, you need to restart chatterd:

server$ sudo systemctl restart chatterd

:warning:Leave your chatterd running until you have received your assignment grade.

:point_right:TIP:

server$ sudo systemctl status chatterd

is your BEST FRIEND in debugging your server. If you get an HTTP error code 500 Internal Server Error or if you just don’t know whether your HTTP request has made it to the server, first thing you do is run sudo systemctl status chatterd on your server and study its output.

If you’re running a Python server, it also shows error messages from your Python code, including any debug printouts. The command systemctl status chatterd is by far the most useful go-to tool for diagnosing back-end server problems.


Prepared by Xin Jie ‘Joyce’ Liu, Chenglin Li, Sugih Jamin Last updated: November 22nd, 2025