Python with Starlette

We assume that your chatterd code base has accumulated code up to at least the llmChat back end.

toolbox

Let us start by creating a toolbox to hold our tools. Change to your chatterd folder and create a new Python file, name it toolbox.py:

server$ cd ~/reactive/chatterd
server$ vi toolbox.py

Put the following imports at the top of the file:

from collections.abc import Awaitable, Callable
from dataclasses import dataclass, field
from dataclasses_json import dataclass_json, config
from http import HTTPStatus
import httpx
import pkgutil

The contents of this file serve three purposes: tool/function definition, the toolbox itself, and tool use (or function calling).

Tool/function definition

Ollama tool schema: at the top of Ollama’s JSON tool definition is a JSON Object representing a tool schema. The tool schema is defined using nested JSON Objects and JSON Arrays. Add the full nested definitions of Ollama’s tool schema to your file:

@dataclass_json
@dataclass
class OllamaParamProp:
    type:        str
    description: str
    enum:        list[str] | None = None

@dataclass_json
@dataclass
class OllamaFunctionParams:
    type:       str
    properties: dict[str, OllamaParamProp]
    required:   list[str] | None = None     # parameters MUST be in function-signature order

@dataclass_json
@dataclass
class OllamaToolFunction:
    name:        str
    description: str
    parameters:  OllamaFunctionParams | None = None

@dataclass_json
@dataclass
class OllamaToolSchema:
    type: str
    function: OllamaToolFunction

Weather tool schema: in this tutorial, we have only one tool resident on the back end, get_weather. Instead of manually instantiating an OllamaToolSchema for each tool, we read the JSON schema file into a string using Python’s pkgutil module. Add the following line at the top level of your toolbox.py file to read in the get_weather.json tool schema:

WEATHER_JSON = pkgutil.get_data(__name__, "tools/get_weather.json").decode("utf-8")
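The contents of tools/get_weather.json are not reproduced here. For orientation, a schema file along the following lines would match the nesting of the OllamaToolSchema classes above — the exact field names and descriptions are an assumption, and we parse a sample with the standard json module only to show the structure:

```python
import json

# A plausible tools/get_weather.json (hypothetical contents), matching
# the OllamaToolSchema -> OllamaToolFunction -> OllamaFunctionParams nesting:
SAMPLE_WEATHER_JSON = """
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the current temperature at a latitude/longitude",
    "parameters": {
      "type": "object",
      "properties": {
        "latitude":  {"type": "string", "description": "latitude of the location"},
        "longitude": {"type": "string", "description": "longitude of the location"}
      },
      "required": ["latitude", "longitude"]
    }
  }
}
"""

schema = json.loads(SAMPLE_WEATHER_JSON)
print(schema["function"]["name"])                    # get_weather
print(schema["function"]["parameters"]["required"])  # ['latitude', 'longitude']
```

Note that "required" lists the parameters in function-signature order, as the comment in OllamaFunctionParams demands; toolInvoke() (defined below) relies on this ordering.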

Weather tool function: we implement the get_weather tool as a getWeather() function that makes an API call to the free Open Meteo weather service. Add the following class definitions to hold Open Meteo’s return result. For this tutorial, we’re only interested in the latitude, longitude, and temperature returned by Open Meteo:

@dataclass_json
@dataclass
class Current:
    temp: float = field(
        default = 0.0,
        metadata = config(field_name="temperature_2m")
    )

@dataclass_json
@dataclass
class OMeteoResponse:
    latitude: float
    longitude: float
    current: Current

Here’s the definition of the getWeather() function:

async def getWeather(argv: list[str]) -> tuple[str | None, str | None]:
    # Open-Meteo API doc: https://open-meteo.com/en/docs#api_documentation
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                url=f"https://api.open-meteo.com/v1/forecast?latitude={argv[0]}&longitude={argv[1]}&current=temperature_2m&temperature_unit=fahrenheit",
            )
            if response.status_code != HTTPStatus.OK:
                return None, f"Open-meteo response: {response.status_code}"

            ometeoResponse = OMeteoResponse.from_json(response.content)
            return f"Weather at lat: {ometeoResponse.latitude}, lon: {ometeoResponse.longitude} is {ometeoResponse.current.temp}ºF", None
    except Exception as err:
        return None, f"Cannot connect to Open Meteo: {err}"
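To see what getWeather() is parsing, here is a trimmed sample of an Open-Meteo forecast response (illustrative values; real responses carry additional fields such as elevation and unit metadata), handled with the standard json module instead of dataclasses_json:

```python
import json

# Trimmed, illustrative Open-Meteo response body:
sample = """
{
  "latitude": 42.28,
  "longitude": -83.74,
  "current": {"temperature_2m": 48.7}
}
"""

data = json.loads(sample)
# dataclasses_json's config(field_name="temperature_2m") performs this
# key-to-attribute mapping for us in the Current dataclass above
temp = data["current"]["temperature_2m"]
print(f"Weather at lat: {data['latitude']}, lon: {data['longitude']} is {temp}ºF")
```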

The toolbox

Even though we have only one resident tool in this tutorial, we want a generalized architecture that can hold multiple tools and invoke the right tool dynamically. To that end, we’ve chosen a switch table (or jump table or, more fancily, service locator registry) as the data structure for our toolbox. We implement the switch table as a dictionary. The “keys” in the dictionary are the names of the tools/functions. Each “value” is a record containing the tool’s definition/schema and a pointer to the function implementing the tool. To send a tool as part of a request to Ollama, we look up its schema in the switch table and copy it to the request. To invoke a tool called by Ollama in its response, we look up the tool’s function in the switch table and invoke the function.

Add the following type for an async tool function and the record type containing a tool definition and the async tool function:

type ToolFunction = Callable[[list[str]], Awaitable[tuple[str | None, str | None]]]

@dataclass
class Tool:
    schema: OllamaToolSchema
    function: ToolFunction

Now create a switch-table toolbox and populate it with the weather tool by feeding the WEATHER_JSON schema string to from_json() from dataclasses_json to be deserialized into an instance of OllamaToolSchema (the type of the schema property):

TOOLBOX: dict[str, Tool] = {
    "get_weather": Tool(OllamaToolSchema.from_json(WEATHER_JSON), getWeather),
}

Tool use or function calling

Ollama tool call: Ollama’s JSON tool call comprises a JSON Object containing a nested JSON Object carrying the name of the function and the arguments to pass to it. Add these class definitions representing Ollama’s tool-call JSON to your toolbox.py file:

@dataclass_json
@dataclass
class OllamaFunctionCall:
    name:      str
    arguments: dict[str, str]

@dataclass_json
@dataclass
class OllamaToolCall:
    function: OllamaFunctionCall

Tool invocation: finally, here’s the tool invocation function. We call this function to execute any tool call we receive in an Ollama response. It looks up the tool name in the toolbox. If the tool is resident, it runs the tool and returns the result; otherwise it returns None for both the result and the error.

async def toolInvoke(function: OllamaFunctionCall) -> tuple[str | None, str | None]:
    tool = TOOLBOX.get(function.name)
    if tool:
        # get arguments in order, they may arrive out of order from Ollama
        argv = [function.arguments[prop] for prop in tool.schema.function.parameters.required]
        return await tool.function(argv)
    return None, None
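The list comprehension in toolInvoke() is what enforces the function-signature ordering noted in the OllamaFunctionParams.required comment: Ollama returns arguments as a JSON object, whose key order is not guaranteed. A small illustration, with hypothetical argument values:

```python
# Arguments may arrive in any order from Ollama's JSON...
arguments = {"longitude": "-83.74", "latitude": "42.28"}
# ...but the schema's "required" list records function-signature order:
required = ["latitude", "longitude"]

# Re-order into a positional argument vector, as toolInvoke() does:
argv = [arguments[prop] for prop in required]
print(argv)   # ['42.28', '-83.74']
```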

That concludes our toolbox definition. Save and exit the file.

handlers

Edit handlers.py:

server$ vi handlers.py

imports

Add the following imports:
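The import list itself appears to be missing from this copy of the section. Based on the code added later in this section, a plausible set would be the following sketch — treat every line as an assumption and adjust to your own code base:

```python
import json
import re

from toolbox import (
    TOOLBOX,
    OllamaToolCall,
    OllamaToolSchema,
    getWeather,
    toolInvoke,
)
```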

class

Next update the following classes:

For the /weather testing API, also add the following class:

@dataclass
class Location:
    lat: str
    lon: str

weather

Let’s implement the handler for the /weather API that we can use to test our getWeather() function later:

async def weather(request):
    try:
        loc = Location(**(await request.json()))
    except Exception as err:
        print(f'{err=}')
        return JSONResponse(f'Unprocessable entity: {str(err)}',
                            status_code = HTTPStatus.UNPROCESSABLE_ENTITY)

    temp, err = await getWeather([loc.lat, loc.lon])
    return JSONResponse({"error": f'Internal server error: {str(err)}'},
        status_code = HTTPStatus.INTERNAL_SERVER_ERROR) if err else JSONResponse(temp) 

llmtools

The underlying request/response handling of llmtools() is basically that of llmchat(), plus the modifications needed to support tool calling. We will name variables according to this scheme: camelCase for deserialized dataclass instances (e.g., ollamaRequest) and snake_case for their serialized JSON-string counterparts (e.g., client_tools, tool_calls).

Make a copy of your llmchat() function and rename it llmtools(). In your newly renamed llmtools() function, after deserializing request.body() to OllamaRequest, serialize any tools present in the OllamaRequest so that we can save them to the PostgreSQL database:

    #1 tab
    # convert tools from client as JSON string (client_tools) to be saved to db
    client_tools = ""
    if ollamaRequest.tools:
        try:
            client_tools = json.dumps([tool.to_dict() for tool in ollamaRequest.tools])
        except Exception as err:
            return JSONResponse(
                { "error": f"Serializing request tools: {type(err).__name__}: {str(err)}" },
                status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
            )

Next, when inserting each message into the database, also store the client’s tools; but if there is more than one message in the messages array, store the tools only once, with the first message. Replace the await cur.execute("INSERT...) call in the for msg in ollamaRequest.messages block with the following:

                    await cur.execute(
                        "INSERT INTO chatts (name, message, id, appid, toolschemas) VALUES (%s, %s, gen_random_uuid(), %s, %s);",
                        (msg.role, msg.content, ollamaRequest.appID, client_tools),
                    )
                    # store client_tools only once:
                    # reset it after the first message
                    client_tools = None

The llmchat() code next reconstructs ollamaRequest to be sent to Ollama by retrieving from the PostgreSQL database all prior exchanges between the client and Ollama using the client’s appID. In llmtools(), we first populate ollamaRequest.tools with the tools resident on the chatterd back end before reconstructing the ollamaRequest. Add before the # reconstruct ollamaRequest to be sent to Ollama: comment:

            #3 tabs
            # append all of chatterd's resident tools to ollamaRequest.tools;
            # front-end tools will be added back later, as part of reconstructing
            # the appID's context from the db (see OllamaMessage.fromRow())
            ollamaRequest.tools = [tool.schema for tool in TOOLBOX.values()]

As the comments above indicated, we will need a fromRow() method added to your OllamaMessage class. Add the following inside your OllamaMessage class:

    #1 tab
    @staticmethod
    def fromRow(row, ollamaRequest):
        try:
            toolcalls = []
            if row[2]:
                # must deserialize to type to append toolcalls
                toolcalls = [OllamaToolCall.from_dict(tool_call) for tool_call in json.loads(row[2])]

            ollamaRequest.messages.append(OllamaMessage(role=row[0], content=row[1], toolCalls=toolcalls))

            if row[3]:
                # has front-end device tools
                # must deserialize to type to append device tools to ollamaRequest.tools
                ollamaRequest.tools.extend([OllamaToolSchema.from_dict(tool) for tool in json.loads(row[3])])
        except Exception:
            raise  # propagate to the caller's error handling

The method fromRow() appends a previous exchange between the client and Ollama stored in the given row, including any tool calls Ollama has made, into the OllamaRequest.messages array. Then it appends any tools the front-end provided to the OllamaRequest.tools array. This array has previously been populated with available resident back-end tools.

Back in llmtools(), replace the following lines:

                await cur.execute("SELECT name, message FROM chatts WHERE appid = %s ORDER BY time ASC;",
                                (ollamaRequest.appID,))
                rows = await cur.fetchall()
                ollamaRequest.messages = [OllamaMessage(role=row[0], content=row[1]) for row in rows]

with (at the same indentation level, inside the try-except block):

                await cur.execute(
                    "SELECT name, message, toolcalls, toolschemas FROM chatts WHERE appID = %s ORDER BY time ASC;",
                    (ollamaRequest.appID,),
                )
                rows = await cur.fetchall()
                for row in rows:
                    OllamaMessage.fromRow(row, ollamaRequest)

ndjson_yield_sse

To accommodate resident-tool calls, we use a flag, sendNewPrompt, to indicate to our stream generator whether we have a prompt to send to Ollama. Initially, sendNewPrompt is set to True so that we always send the prompt from the front end. Subsequently, if Ollama makes a call for a tool resident on the back end, we will send the result of the tool call to Ollama as a new prompt. At the top of your async def ndjson_yield_sse() definition, add:

        #2 tabs
        sendNewPrompt = True
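The control flow this flag produces can be sketched in isolation — a toy generator with canned replies standing in for the Ollama stream (all names here are hypothetical):

```python
# Canned "LLM replies": first a tool call, then a final answer.
responses = iter([
    {"tool_call": "get_weather"},   # first reply: LLM wants a resident tool
    {"content": "It is 48.7ºF"},    # second reply: final answer
])

def chat():
    sendNewPrompt = True
    while sendNewPrompt:
        sendNewPrompt = False        # assume no resident-tool call
        reply = next(responses)      # stands in for the Ollama stream
        if "tool_call" in reply:
            # run the resident tool, append its result to the context,
            # and loop to send it back to the LLM as a new prompt
            sendNewPrompt = True
        else:
            yield reply["content"]   # stream final answer to the client

out = list(chat())
print(out)   # ['It is 48.7ºF']
```

The first iteration consumes the tool call without yielding anything to the client; only the second iteration’s answer is streamed out.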

After creating the context manager for conn and cur, put the existing try-except blocks inside this while loop:

                #4 tabs
                while sendNewPrompt:
                    sendNewPrompt = False  # assume no resident-tool call

                    # leave existing try: block
                    # and except: block here

Whereas previously in llmchat() we simply yielded each data line after appending it to the tokens array, we now must check whether there’s a tool call and yield the data line only if there was no tool call. Replace the following lines:

                                    # send NDJSON line as SSE data line
                                    yield {
                                        "data": line
                                    }

with:

                                    # is there a tool call?
                                    if ollamaResponse.message.toolCalls:
                                        # handle tool calls

                                    else:
                                        # no tool call, send NDJSON line as SSE data line
                                        yield {"data": line}

In handling tool calls, we first serialize the tool call back into a JSON string to be saved into the database. Replace the comment # handle tool calls with:

                                        # convert toolCalls to JSON string (tool_calls) to be saved to db
                                        tool_calls = json.dumps([toolCall.to_dict() for toolCall in ollamaResponse.message.toolCalls])

                                            # assuming one tool call per response
                                            for toolCall in ollamaResponse.message.toolCalls:
                                                if not toolCall.function.name:
                                                    continue  # LLM miscalled

                                            # save full response, including tool call(s), to db,
                                            # to form part of next prompt's history
                                            await cur.execute(
                                                "INSERT INTO chatts (name, message, id, appID, toolcalls) \
                                                VALUES (%s, %s, gen_random_uuid(), %s, %s);",
                                                ("assistant", "".join(tokens), ollamaRequest.appID, tool_calls,),
                                            )
                                            
                                            # clear tokens and tool_calls, we already stored them
                                            tokens.clear()
                                            tool_calls = ""
                                            
                                            # make the tool call
                                            

We call toolInvoke() with the tool’s signature and process the result. There are three possible outcomes from the call to toolInvoke():

  1. the tool is resident but the call was unsuccessful and returned an error,
  2. the tool is resident and the call was successful, or
  3. the tool is non-resident.

If the tool call resulted in an error, we store the error as the tool result. We add the tool call and its result to the OllamaRequest message and set the flag (sendNewPrompt) to send the OllamaRequest back to Ollama. We also store both the tool call and its result to the database, to form part of this appID’s context. If the tool call resulted in neither an error nor any returned result, we interpret that as the tool being non-resident on the back end and forward the tool call to the front end as an SSE tool_calls event. Replace the comment # make the tool call with:

                                            tool_result, tool_err = await toolInvoke(
                                                toolCall.function
                                            )
                                            if tool_err:
                                                # outcome 1: tool resident but had error
                                                # send error back to LLM, don't report to frontend
                                                tool_result = tool_err

                                            if tool_result:
                                                # outcomes 1 & 2 (tool call is resident and no error)
                                                # reuse OllamaMessage to carry tool result
                                                # to be sent back to Ollama
                                                # first append the tool call itself
                                                ollamaRequest.messages.append(
                                                    ollamaResponse.message
                                                )
                                                # then append the result
                                                ollamaRequest.messages.append(
                                                    OllamaMessage(
                                                        role="tool",
                                                        content=tool_result,
                                                    )
                                                )
                                                
                                                # don't send tools multiple times
                                                ollamaRequest.tools = None
                                                # loop to send tool result back to Ollama
                                                sendNewPrompt = True

                                                # save resident tool call result or error message
                                                await cur.execute(
                                                    "INSERT INTO chatts (name, message, id, appID)\
                                                     VALUES (%s, %s, gen_random_uuid(), %s);",
                                                    ("tool", re.sub(r"\s+", " ", tool_result), ollamaRequest.appID,),
                                                )
                                            else:
                                                # outcome 3: tool non-resident, forward to
                                                # front end as 'tool_calls' SSE event
                                                yield {
                                                    "event": "tool_calls",
                                                    "data": line,
                                                }

We keep the rest of the llmchat() code without further changes and we’re done with handlers.py! Save and exit the file.

main.py

Edit main.py:

server$ vi main.py

Find the routes array and add these routes right after the route for /llmchat:

    Route('/llmtools/', handlers.llmtools, methods=['POST']),
    Route('/weather/', handlers.weather, methods=['GET']),

We’re done with main.py. Save and exit the file.

Test run

To test run your server, launch it from the command line:

server$ sudo su
# You are now root, note the command-line prompt changed from '$' or '%' to '#'.
# You can do a lot of harm with all of root's privileges, so be very careful what you do.
server# source .venv/bin/activate
(chatterd) ubuntu@server:/home/ubuntu/reactive/chatterd# granian --host 0.0.0.0 --port 443 --interface asgi --ssl-certificate /home/ubuntu/reactive/chatterd.crt --ssl-keyfile /home/ubuntu/reactive/chatterd.key --access-log --workers-kill-timeout 1 main:server
# Hit ^C to end the test
(chatterd) ubuntu@server:/home/ubuntu/reactive/chatterd# exit
# So that you're no longer root.
server$

You can test your implementation following the instructions in the Testing llmTools APIs section.


Prepared by Xin Jie ‘Joyce’ Liu, Chenglin Li, and Sugih Jamin. Last updated March 8th, 2026.