Python with Starlette

Add dependencies

Change to your chatterd folder and add:

server$ cd ~/reactive/chatterd
server$ uv add dataclasses_json sse_starlette

handlers.py

Edit handlers.py:

server$ vi handlers.py

First add the following imports at the top of the file:

from dataclasses_json import dataclass_json
import json
import re
from sse_starlette.sse import EventSourceResponse

Then replace your from typing line with:

from typing import List, Optional    

Next define these three classes to help llmchat() deserialize JSON received from clients. They use the @dataclass decorator, so make sure from dataclasses import dataclass is also among your imports if it isn’t already. Add these lines right below the import block:

@dataclass_json
@dataclass
class OllamaMessage:
    role: str
    content: str

@dataclass_json
@dataclass
class OllamaRequest:
    appID: str
    model: str
    messages: List[OllamaMessage]
    stream: bool

@dataclass_json
@dataclass
class OllamaResponse:
    model: str
    created_at: str
    message: OllamaMessage

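For example, a client request body like the one below (the appID and model values are only placeholders) deserializes into an OllamaRequest, with each element of messages becoming an OllamaMessage:

request_json = '{ "appID": "6A24C2A1-E318-4DF2-8317-2DBF4A51B783", "model": "llama3.2", "messages": [ { "role": "user", "content": "Hello!" } ], "stream": true }'
ollama_request = OllamaRequest.from_json(request_json)
print(ollama_request.model)                # llama3.2
print(ollama_request.messages[0].content)  # Hello!
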
To store the client’s conversation context/history with Ollama in the PostgreSQL database, llmchat() first confirms that the client has sent an appID that can be used to tag its entries in the database. Here’s the beginning of llmchat(); the from_json deserialization will throw an exception, which the client sees as an HTTP error, if appID is absent:

async def llmchat(request):
    try:
        ollama_url = f"{OLLAMA_BASE_URL}/chat"
        method = request.method
        ollama_request = OllamaRequest.from_json(await request.body())

        # insert into DB

Once we confirm that the client has an appID, we insert its current prompt into the database, adding to its conversation history with Ollama. Replace the comment # insert into DB with the following code:

        if ollama_request.messages:
            async with main.server.pool.connection() as conn:
                async with conn.cursor() as cur:
                    for msg in ollama_request.messages:
                        await cur.execute(
                            'INSERT INTO chatts (username, message, id, appID) VALUES (%s, %s, gen_random_uuid(), %s);',
                            (msg.role, msg.content, ollama_request.appID) # preserve prompt formatting
                        )

        # retrieve history

Then we retrieve the client’s conversation history, with the just-inserted current prompt as its last entry, and put it in the JSON format expected by Ollama’s chat API. Replace # retrieve history with:

        async with main.server.pool.connection() as conn:
            async with conn.cursor() as cur:
                await cur.execute('SELECT username, message FROM chatts WHERE appID = %s ORDER BY time ASC;', 
                (ollama_request.appID, ))
                rows = await cur.fetchall()
                history = [OllamaMessage(role=row[0], content=row[1]) for row in rows]

        ollama_request.messages = history
        payload = ollama_request.to_json().encode("utf-8")

        # send request to Ollama

In an event_generator, we first declare an accumulator variable, full_response, to assemble the reply tokens Ollama streams back, and then send the request we constructed earlier to Ollama. As we saw in the first tutorial, llmPrompt, Ollama streams its replies back as an NDJSON stream; we transform this NDJSON stream into a stream of SSE events. Replace # send request to Ollama with:

        async def event_generator():
            full_response = []
            try:
                async with asyncClient.stream(
                    method=method,
                    url=ollama_url,
                    content=payload,
                ) as response:
                    # process NDJSON elements and yield Python dictionary

For each incoming NDJSON element, we convert it into an OllamaResponse, append its message content to the full_response accumulator, and yield the full NDJSON line in a Python dictionary with data as the key. EventSourceResponse() will convert this dictionary into an SSE event and send it to the client. Replace # process NDJSON elements and yield Python dictionary with:

                    async for line in response.aiter_lines():
                        try:
                            if line:
                                # line breaks ('\n') not included in resulting line
                                ollama_response = OllamaResponse.from_json(line)
                                content = ollama_response.message.content
                                full_response.append(content)

                                yield {
                                    "data": line
                                }
                        except Exception as err:
                            yield {
                                "event": "error",
                                "data": f'{{"error": {json.dumps(str(err))}}}'
                            }

                # insert full response into database

When we reach the end of the NDJSON stream, we insert the full Ollama response into the PostgreSQL database as the assistant’s reply. Replace # insert full response into database with:

                assistant_response = "".join(full_response)

                # insert ollama response to database to form part of next prompt's history
                async with main.server.pool.connection() as conn:
                    async with conn.cursor() as cur:
                        await cur.execute(
                            'INSERT INTO chatts (username, message, id, appID) VALUES (%s, %s, gen_random_uuid(), %s);', 
                            ("assistant", re.sub(r"\s+", " ", assistant_response), ollama_request.appID) 
                            # replace 'assistant' with None to test SSE error event
                            # replace all multiple whitespaces with single whitespace
                        )
                        
            except Exception as err:
                yield {
                    "event": "error",
                    "data": f'{{"error": {json.dumps(str(err))}}}'
                }

        # use event_generator

The assistant_response line should line up with the call to async with asyncClient.stream() above; that is, we insert into the database only upon receiving the full NDJSON stream. If we encounter any error in the insertion, we yield an SSE error event.
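
For reference, here is the skeleton of event_generator() with the bodies elided, showing where assistant_response and the database insert sit relative to the asyncClient.stream() call:

        async def event_generator():
            full_response = []
            try:
                async with asyncClient.stream(method=method, url=ollama_url, content=payload) as response:
                    async for line in response.aiter_lines():
                        ...   # per-line NDJSON processing, as above

                # lines up with the async with above: runs only after
                # the full NDJSON stream has been consumed
                assistant_response = "".join(full_response)
                async with main.server.pool.connection() as conn:
                    ...   # insert assistant_response into the database

            except Exception as err:
                ...   # yield an SSE error event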

Finally, we use the event_generator defined above by passing it to EventSourceResponse(), which converts each generated Python dictionary into an SSE event and sends it to the client. Replace the # use event_generator comment with:

        return EventSourceResponse(event_generator())

    except Exception as err:
        return JSONResponse({"error": f'{type(err).__name__}: {str(err)}'}, status_code=500)

The final exception handler covers the whole body of the llmchat() function.
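
Under the hood, EventSourceResponse() serializes each yielded dictionary into an SSE frame terminated by a blank line. A normal token event therefore reaches the client looking roughly like this (the field values shown are illustrative; the actual payload is whatever NDJSON line Ollama sent):

data: {"model":"llama3.2","created_at":"2025-08-10T17:25:43Z","message":{"role":"assistant","content":"Hello"},"done":false}

while an error event looks like:

event: error
data: {"error": "..."}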

We’re done with handlers.py. Save and exit the file.

main.py

Edit main.py:

server$ vi main.py

Find the routes array and add this route right after the route for /llmprompt:

    Route('/llmchat', handlers.llmchat, methods=['POST']),

We’re done with main.py. Save and exit the file.

Test run

To test run your server, launch it from the command line:

server$ sudo su
# You are now root, note the command-line prompt changed from '$' or '%' to '#'.
# You can do a lot of harm with all of root's privileges, so be very careful what you do.
server# source .venv/bin/activate
(chatterd) ubuntu@server:/home/ubuntu/reactive/chatterd# granian --host 0.0.0.0 --port 443 --interface asgi --ssl-certificate /home/ubuntu/reactive/chatterd.crt --ssl-keyfile /home/ubuntu/reactive/chatterd.key --access-log --workers-kill-timeout 1 main:server
# Hit ^C to end the test
(chatterd) ubuntu@server:/home/ubuntu/reactive/chatterd# exit
# So that you're no longer root.
server$

The cover backend spec provides instructions on Testing llmChat API and SSE error handling.
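
If you want a quick sanity check from the command line first, something along the lines of the following curl invocation should stream SSE events back; the host, model, and appID below are placeholders for your own values, -k skips certificate verification for a self-signed certificate, and -N turns off curl’s output buffering so events print as they arrive:

laptop$ curl -N -k https://YOUR_SERVER_IP/llmchat \
    -H 'Content-Type: application/json' \
    -d '{ "appID": "6A24C2A1-E318-4DF2-8317-2DBF4A51B783", "model": "llama3.2", "messages": [ { "role": "user", "content": "Hello!" } ], "stream": true }'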

Prepared by Chenglin Li, Xin Jie ‘Joyce’ Liu, and Sugih Jamin. Last updated August 10th, 2025.