Tutorial: llmChat
DUE Wed, 10/1, 2 pm
This tutorial may be completed individually or in teams of at most 2. You can partner differently for each tutorial.
Treat the messages you send to chatterd and Ollama as public utterances
with no reasonable expectation of privacy, and know that they
are recorded for the purpose of carrying out a contextual interaction
with Ollama.
Objectives
Front end:
- To handle an SSE stream
Back end:
- To store and forward user interactions with the LLM
- To convert NDJSON stream to SSE stream
Prompt context and server-sent event streaming
When interacting with a user, LLMs are usually memoryless: each prompt is standalone. If you want to keep up a continuing back-and-forth “conversation” with an LLM, you must send a transcript (“context”) of your ongoing “conversation” up to the current point with every prompt. In this tutorial, we store each prompt from the user and the reply (“completion”) from the LLM (“assistant”) in our backend PostgreSQL database. With each new prompt from the user, we send the full context, in chronological order, over to the LLM.
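As a concrete illustration (not the required design), the back end might keep one row per message and rebuild the chronological context on every request. In the sketch below, the chats table, its columns, and the psycopg calls are assumptions made for this example only:

# Sketch: persist each turn and rebuild the chronological context for one appID.
# Table name, columns, and the psycopg driver are illustrative assumptions.
import psycopg

def save_message(conn, app_id, role, content):
    # Store one turn: the user's prompt or the assistant's completion.
    conn.execute(
        "INSERT INTO chats (app_id, role, content, posted_at) VALUES (%s, %s, %s, now())",
        (app_id, role, content),
    )
    conn.commit()

def build_context(conn, app_id):
    # Return every stored turn for this device, oldest first, in the messages format Ollama expects.
    rows = conn.execute(
        "SELECT role, content FROM chats WHERE app_id = %s ORDER BY posted_at",
        (app_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]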
While Ollama uses NDJSON to stream its reply, commercial LLMs, such as ChatGPT, and tool-use protocols, such as MCP, often use server-sent events (SSE) to stream replies. The streaming part of MCP’s new “Streamable HTTP” transport also uses SSE. The advantage of SSE over NDJSON is that multiple streams can be interleaved on one connection. In this tutorial, we see how error messages can be sent alongside normal messages, and tagged as such, so that the front end can identify and handle them separately. We will have another use of stream interleaving in Project 2 and subsequently.
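To make the conversion concrete, here is a minimal sketch of relaying Ollama’s NDJSON stream as SSE, with errors sent on the same connection as a separately named event. The use of httpx is an assumption for illustration, not a prescribed implementation:

# Sketch: forward Ollama's NDJSON reply as SSE "data:" lines; report failures
# on the same connection as a named "error" event.
import json
import httpx

def ndjson_to_sse(payload):
    try:
        with httpx.stream("POST", "http://localhost:11434/api/chat",
                          json=payload, timeout=None) as resp:
            resp.raise_for_status()
            for line in resp.iter_lines():       # one NDJSON object per line
                if line:
                    yield f"data: {line}\n\n"    # unnamed (default) SSE event
    except Exception as exc:
        yield "event: error\n"
        yield f"data: {json.dumps({'error': str(exc)})}\n\n"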

chatterd serving as SSE proxy and providing context for Ollama
API and protocol handshakes
In this tutorial, we add one new API to Chatter:
llmchat: uses HTTP POST to post the user’s prompt, destined for Ollama’s chat API, as a JSON Object and to receive Ollama’s response as an HTTP SSE stream.
Using this syntax:
url-endpoint
-> request: data sent to Server
<- response: data sent to Client
The protocol handshakes consist of:
/llmchat
-> HTTP POST { appID, model, messages, stream }
<- { SSE event-stream lines } 200 OK
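To see this handshake on the wire before you build your front end, you can drive the endpoint with a short script. The sketch below uses Python’s requests library and a placeholder server URL; both are assumptions of this example:

# Sketch: POST a prompt to /llmchat and print the raw SSE lines as they arrive.
# Replace the URL with your own chatterd instance (or the course test server).
import requests

body = {
    "appID": "edu.umich.reactive.postman",
    "model": "tinyllama",
    "messages": [{"role": "user", "content": "I live in Ann Arbor"}],
    "stream": True,
}
with requests.post("https://YOUR-SERVER/llmchat", json=body, stream=True) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if line:            # blank lines separate SSE events
            print(line)     # e.g., 'data: {...}' or 'event: error'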
Data formats
To post a prompt to Ollama with the llmchat API, the front-end client sends
a JSON Object consisting of an appID field, to uniquely identify this
client device for PostgreSQL database sharing, to store the user’s prompt context;
a model field, for the LLM model we want Ollama to use, a messages field for the
prompt itself (more details below), and a stream field to indicate
whether we want Ollama to stream its response or to batch and send it in one
message. For example:
{
  "appID": "edu.umich.reactive.postman",
  "model": "tinyllama",
  "messages": [
    { "role": "user", "content": "I live in Ann Arbor" }
  ],
  "stream": true
}
The messages field is a JSON Array with one or more JSON Objects as its
elements. In the example above there is only one JSON Object in the messages array. Each
element of messages consists of a role field, which can be "system", to
give instructions (“prompt engineering”) to the model, "user", to carry the user’s
prompt, or "assistant", to indicate a reply (“prompt completion”) from the model, etc.
The content field holds the actual instruction, prompt, or reply from the entity
listed in role.
Remember to separate the elements with commas if you have more than one element in
the array. An example of a multi-element array follows.
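Once a conversation is under way, the messages array carries the whole transcript so far. The particular contents below, including the system instruction, are made up for illustration:

"messages": [
  { "role": "system", "content": "You are a helpful, concise assistant." },
  { "role": "user", "content": "I live in Ann Arbor" },
  { "role": "assistant", "content": "Absolutely! Ann Arbor is a lovely city." },
  { "role": "user", "content": "What is winter like there?" }
]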
The SSE stream returned by chatterd will look something like this if you use curl:
data: {"model":"tinyllama","created_at":"2025-08-08T20:35:52.586157Z","message":{"role":"assistant","content":"Ab"},"done":false}
data: {"model":"tinyllama","created_at":"2025-08-08T20:35:52.603345Z","message":{"role":"assistant","content":"sol"},"done":false}
data: {"model":"tinyllama","created_at":"2025-08-08T20:35:52.620774Z","message":{"role":"assistant","content":"utely"},"done":false}
[. . . .]
data: {"model":"tinyllama","created_at":"2025-08-08T20:35:53.814197Z","message":{"role":"assistant","content":"!"},"done":false}
data: {"model":"tinyllama","created_at":"2025-08-08T20:35:53.832383Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":1554222000,"load_duration":24007209,"prompt_eval_count":581,"prompt_eval_duration":272983000,"eval_count":71,"eval_duration":1247892000}
On Postman, the data tag will not be shown, only the data lines:

On error, an error event will be sent:
event: error
data: {"error": "<error message>"}
Specifications
As in previous tutorials, you only need to build one front end, AND one of the alternative back-end stacks. Until you have a working backend of your own, you can use mada.eecs.umich.edu to test your front end. To receive full credit, your front end MUST work with your own back end.
The balance of work in this tutorial is more heavily weighted towards the back end.
Prepared by Sugih Jamin | Last updated: August 25, 2025