Tutorial: llmTools

Course Schedule

LLM tool use, or equivalently “function calling”, is the main enabler of agentic AI, allowing LLMs to accomplish tasks such as looking up information on the web, making a reservation, or buying tickets. In this llmTools tutorial and the subsequent llmAction project, we give LLMs access to web and cloud services, to UNIX command-line interface (CLI) tools on the back end, and to front-end tools such as reading GPS location and performing a biometric check.

The most widely known tool-use infrastructure is the open-standard Model Context Protocol (MCP). However, a full MCP tool definition must satisfy a good number of constraints and requirements, demanding a substantial amount of boilerplate code. There is also growing recognition that tools defined using MCP consume a large number of tokens just to be loaded onto LLMs. This has led to the development of agent frameworks, such as OpenClaw, designed to use mainly CLI tools instead of MCP.

Our tutorial and project were designed from the start to use only CLI tools.

About the tutorial

This tutorial may be completed individually or in teams of at most 2. You can partner differently for each tutorial.

Treat your messages sent to chatterd and Ollama as public utterances with no reasonable expectation of privacy, and know that they are recorded for the purpose of carrying out a contextual interaction with Ollama.

Expected behavior

Our tutorial app does only one thing: answer the question, “What is the weather at my location?” It does this using two tools: get_location(), which queries your device for its GPS lat/lon coordinates, and get_weather(), which takes the lat/lon of a location as arguments and queries the open-source weather service Open Meteo, which has a free-to-use API.

DISCLAIMER: the video demo shows you one aspect of the app’s behavior. It is not a substitute for the spec. If there are any discrepancies between the demo and the spec, please follow the spec. The spec is the single source of truth. If the spec is ambiguous, please consult the teaching staff for clarification.

Objectives

Front end:

Back end:

API and protocol handshakes

We add one new “production” API to Chatter:

And one “testing” API for use with development tools like Postman (but not with your front end):

Using this syntax:

url-endpoint
-> request: data sent to Server
<- response: data sent to Client

The protocol handshakes consist of:

/llmtools
-> HTTP POST { appID, model, messages, stream, tools }
<- { SSE event-stream lines } 200 OK

where each element in the messages array can carry zero or more tool calls.
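
As a sketch of the client side of this handshake, here is how the POST body could be assembled and the streamed reply consumed. The `data: ` line framing for the SSE stream is an assumption carried over from typical SSE usage; match it to your llmChat implementation.

```python
import json

def build_llmtools_request(app_id, model, prompt, tools):
    """Assemble the JSON body for HTTP POST /llmtools."""
    return {
        "appID": app_id,
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        "tools": tools,
    }

def parse_sse_lines(lines):
    """Collect JSON fragments from SSE lines of the form 'data: {json}'.
    (The 'data: ' prefix is an assumption; check your SSE framing.)"""
    chunks = []
    for line in lines:
        if line.startswith("data: "):
            chunks.append(json.loads(line[len("data: "):]))
    return chunks
```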

/weather
-> HTTP GET { lat, lon }
<- { "Weather at lat: 42.29272, lon: -83.71627 is 56.5ºF" } 200 OK
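
The /weather request is a plain GET with query parameters, so building its URL can be sketched as follows (the host name is a placeholder; the actual request would then be issued with, e.g., urllib.request.urlopen):

```python
from urllib.parse import urlencode

def weather_url(host, lat, lon):
    """Construct the /weather testing URL with lat/lon query parameters."""
    return f"https://{host}/weather?" + urlencode({"lat": lat, "lon": lon})
```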

Data formats

To post a prompt to Ollama with the llmtools API, the front-end client sends a JSON Object consisting of: an appID field, to uniquely identify this client device for PostgreSQL database sharing, where the user’s prompt context is stored; a model field, naming the LLM model we want Ollama to use; a messages field, carrying the prompt itself (more details below); a stream field, indicating whether we want Ollama to stream its response or to batch and send it in one message; and a tools field, explained below.

Tool definition JSON

A request from the client to Ollama can carry a tools field, which is a JSON Array of tool signatures encoded as JSON Objects. For example, here’s a request with the signature of a single tool, get_location:

{
  "appID": "edu.umich.reactive.DUMMY_UNIQNAME",
  "model": "qwen3",
  "messages": [
      { "role": "user", "content": "What is the lat/lon at my location?" }
  ],
  "stream": true,
  "tools": [
      {
          "type": "function",
          "function": {
              "name": "get_location",
              "description": "Get current location",
              "parameters": null
          }
      }
  ]
}

The messages field is mostly the same as in the llmChat tutorial: a JSON Array with one or more JSON Objects as its elements. In the example above, there is only one JSON Object in the messages array. Each element of messages consists of a role field, which can be "system", to give instructions (“prompt engineering”) to the model; "user", to carry the user’s prompt; "assistant", to indicate a reply (“prompt completion”) from the model; etc. The content field holds the actual instruction, prompt, or reply from the entity listed in role.
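
For illustration, here is a messages array exercising the three roles listed above (the content strings are made up for this example):

```python
# An illustrative messages array: system instructions, the user's
# prompt, and a prior reply from the model.
messages = [
    {"role": "system", "content": "Answer concisely."},
    {"role": "user", "content": "What is the lat/lon at my location?"},
    {"role": "assistant", "content": "Your location is lat 42.29, lon -83.72."},
]
```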

Let’s save that definition for the get_location tool for later use:
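
One way to write the definition out is sketched below; the path tools/get_location.json follows the git instructions that come next, and the tool signature is the one from the request example above.

```python
import json
from pathlib import Path

# The get_location tool signature from the request example above.
get_location_tool = {
    "type": "function",
    "function": {
        "name": "get_location",
        "description": "Get current location",
        "parameters": None,  # serialized as JSON null
    },
}

# Save it as tools/get_location.json for later use.
Path("tools").mkdir(exist_ok=True)
Path("tools/get_location.json").write_text(
    json.dumps(get_location_tool, indent=2))
```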

Git add your tools folder and the get_location.json file, commit your changes, and push to your git repo.

Tool call JSON

A response from the model in Ollama can carry a tool_calls field in the JSON Object of messages. A tool_calls field is a JSON Array listing the tools the model wants to call. It appears that the Ollama API surface does not yet support parallel tool calls, even though some models, such as Qwen3, do. In this tutorial we will explore only serial tool calls. Each tool call is specified as a JSON Object. For example, here’s a prompt completion from qwen3 calling the get_location tool:

{
    "model": "qwen3",
    "created_at": "2025-10-20T18:13:28.011173Z",
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "function": {
                    "name": "get_location",
                    "arguments": {}
                }
            }
        ]
    },
    "done": false
}

There is no special format for sending tool call results back to Ollama. Briefly, each tool call arrives as the completion of an HTTP POST request sent to Ollama’s /chat API. Results of the tool call are sent back as a new HTTP POST request to Ollama’s /chat API, stored in the content field of a new message. Each HTTP POST request with an appID is always prepended with all the exchanges between the client and Ollama under the same appID, up to and including the last tool call. This gives the LLM the context it needs to interpret the new message as the result of its last tool call. We will elaborate on this exchange in the specifications for the back end.
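
A minimal sketch of assembling the messages array for that follow-up POST: the full prior context under this appID, the assistant message that carried tool_calls, then the result as a new message. The "tool" role used here is an assumption based on the Ollama chat API convention; verify it against your llmChat back end.

```python
def append_tool_result(context, tool_call_msg, result):
    """Build the messages array for the follow-up POST /chat request:
    prior exchanges, the assistant's tool_calls message, then the tool
    result carried in the content field of a new message."""
    # "tool" role is an assumption from the Ollama chat API convention.
    return context + [tool_call_msg, {"role": "tool", "content": result}]
```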

Specifications

As in previous tutorials, you only need to build one front end, AND one of the alternative back-end stacks. To receive full credit, your front end MUST work with your own back end and with mada.eecs.umich.edu (see below).

You should start this tutorial by working on the back end:

before tackling the front end:

End-to-end Testing

You will need a working front end to fully test your back end’s handling of tool calls. Once you have your front end implemented, first test it against the provided back end on mada.eecs.umich.edu: change the serverUrl property in your ChattStore to mada.eecs.umich.edu.

With qwen3 specified as the model in your ChattViewModel, send a request to mada with the prompt, “What’s the weather at my location?” After a long <think></think> process, you should see Qwen3 reporting the temperature at your current lat/lon position.

Of all the models available on Ollama, only the qwen3:8b (= qwen3) model works well for tool use. Even gpt-oss:20b, which is supposed to support tool use, doesn’t work reliably. Unfortunately, you cannot run qwen3:8b on a *-micro instance: you’d need more than 6 GB of memory to run qwen3:8b comfortably. To be graded, your front end must work with the Ollama accessible from mada.

Since mada is a shared resource and Ollama is single-tasking, you will have to wait your turn if others are using mada. If your laptop has the necessary amount of memory, you can use it to test the tutorial locally before testing on mada. In any case, don’t wait until the deadline to test your code, only to get stuck behind a long line of classmates trying to access mada.

Limited end-to-end back end testing

Due to the limited resources of *-micro instances, please pull model qwen3:0.6b to your Ollama instance. This model works ok for tool calling, as long as you don’t ask it to chain tool calls. For example, it won’t be able to reason that it must first call get_location to obtain lat/lon, and then use these to call get_weather.

Instead, you have to do the dependency resolution for it. With qwen3:0.6b specified as the model in ChattViewModel:

  1. to test the front-end tool, get_location, send the prompt, “Get my location using the get_location tool.”
  2. once the location is shown on screen, to test the back-end tool, get_weather, send the next prompt, “What’s the weather at my location?” The model should recognize that there’s a tool reply with lat/lon in the context sent with the second prompt and be able to use them to make the second tool call.
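
The serial dispatch the two prompts exercise can be sketched as a small lookup from tool name to handler. The handler bodies below are hypothetical stand-ins: a real front end reads the device GPS for get_location, and the back end queries Open Meteo for get_weather.

```python
# Hypothetical stand-in handlers for the two tools.
def get_location():
    return {"lat": 42.29272, "lon": -83.71627}

def get_weather(lat, lon):
    return f"Weather at lat: {lat}, lon: {lon} is 56.5ºF"

HANDLERS = {"get_location": get_location, "get_weather": get_weather}

def dispatch(tool_call):
    """Invoke the named tool with the arguments the model supplied."""
    fn = tool_call["function"]
    return HANDLERS[fn["name"]](**fn["arguments"])
```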

To get full credit, your back-end implementation running on a *-micro instance must pass this test. When submitting your front end, make sure your serverUrl is set to YOUR_SERVER_IP so that we know what your server IP is. You will not get full credit if your front end is not set up to work with your back end!


Prepared by Sugih Jamin Last updated: March 7th, 2026