Tutorial: llmChat Back End
You will need the HTTPS infrastructure from the llmPrompt tutorial
and the PostgreSQL database set up in the Chatter tutorial. If
you don’t have those set up, please follow the links and complete them first. You
will also need to install your self-signed certificate on your front-end platform
following the instructions in the llmPrompt tutorial for
Android or
iOS.
Install updates
Remember to install updates available to your Ubuntu back end. When you ssh to your back-end server, if the N in the notice
N updates can be applied immediately.
is not 0, run the following:
server$ sudo apt update
server$ sudo apt upgrade
Failure to update your packages could cause your solution to stop working entirely, with no warning that missing updates are the cause, and it also leaves you vulnerable to security exploits.
Any time you see *** System restart required *** when you ssh to your server, immediately run:
server$ sync
server$ sudo reboot
Your ssh session will be terminated at the server. Wait a few minutes for the system to reboot before you ssh to your server again.
appID
When a message is inserted into the chatts table, we store the role associated with it in the name
column of the table. Since you will be sharing the PostgreSQL database with the rest of the class, we
need to identify your entries so that we forward only your entries to Ollama during your “conversation”.
For that purpose, add a new column, appID, of type varchar(64) to your chatts table:
appID plays a similar role to the session ID used in Model Context Protocol (MCP)’s JSON-RPC 2.0 messages.
- Log into an interactive PostgreSQL (psql) session as user postgres
- Connect to the chatterdb database
- Clear your chatts table of all old chatts, using the SQL command:
TRUNCATE TABLE chatts;
- Add a new column to chatts to store your appID string:
ALTER TABLE chatts ADD COLUMN appID VARCHAR(64);
- To verify that you’ve added the new column to the chatts table, enter:
SELECT * FROM chatts;
Make sure you get back the following result (though perhaps more stretched out):
 name | message | id | time | appID
------+---------+----+------+-------
(0 rows)
If so, congratulations! You have successfully added the new column!
- Exit PostgreSQL
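Put together, one way the psql session might look is shown below; sudo -u postgres psql and the \c and \q meta-commands are the standard psql way to log in as the postgres user, connect to a database, and quit, but use whichever method you set up in the Chatter tutorial:
server$ sudo -u postgres psql
postgres=# \c chatterdb
chatterdb=# TRUNCATE TABLE chatts;
chatterdb=# ALTER TABLE chatts ADD COLUMN appID VARCHAR(64);
chatterdb=# SELECT * FROM chatts;
chatterdb=# \q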
Drop or rename column
If you want to remove a column called “appName” from your chatts table:
ALTER TABLE chatts DROP COLUMN appName;
If you want to change a column name, for example from “appName” to “appID”:
ALTER TABLE chatts RENAME COLUMN appName to appID;
Note:
- To avoid crowding the PostgreSQL database on mada.eecs.umich.edu, we may periodically empty the database.
- Ollama handles only one connection at a time, putting all other connections “on hold”. Try to limit your “conversations” with Ollama to simple tests such as “Where is Tokyo?”, followed by “And London?”, just to see that the LLM can relate the second prompt to the first. If you experience long wait times trying to interact with Ollama through mada.eecs.umich.edu, it could be because other classmates are trying to access it at the same time.
/llmchat API
What the /llmchat API does is very simple:
- insert (append) the new prompt into the database with the prompt’s appID,
- retrieve all database entries matching the appID, in chronological order,
- send the retrieved entries to Ollama,
- accumulate Ollama’s completion tokens and forward them to the client as an SSE stream,
- once the completion is complete, insert the accumulated completion into the database with the prompt’s appID.
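Before setting this up in your chosen framework, it may help to see the whole flow in one place. The following is a minimal, illustrative sketch in Python, assuming FastAPI, asyncpg, and httpx, a local Ollama at its default port, and a simple data: payload per token; the table and column names follow the chatts table above, but everything else (library choices, credentials, SSE payload shape) is an assumption, not what the framework-specific pages below prescribe:
# Minimal sketch of the /llmchat flow, using FastAPI, asyncpg, and httpx
# purely for illustration; the Ollama endpoint, credentials, and SSE payload
# shape below are assumptions.
import json

import asyncpg
import httpx
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()
OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed local Ollama, default port
PG_DSN = "postgresql://chatter:chattword@localhost/chatterdb"  # hypothetical credentials

@app.post("/llmchat")
async def llmchat(request: Request):
    req = await request.json()
    app_id = req["appID"]
    prompt = req["messages"][-1]  # assume the client sends only the new user prompt

    conn = await asyncpg.connect(PG_DSN)
    try:
        # 1. insert (append) the new prompt with its appID
        await conn.execute(
            "INSERT INTO chatts (name, message, appID) VALUES ($1, $2, $3)",
            prompt["role"], prompt["content"], app_id)
        # 2. retrieve all entries matching the appID, in chronological order
        rows = await conn.fetch(
            "SELECT name, message FROM chatts WHERE appID = $1 ORDER BY time",
            app_id)
    finally:
        await conn.close()

    context = [{"role": r["name"], "content": r["message"]} for r in rows]

    async def sse():
        completion = []
        # 3. send the retrieved entries to Ollama, streaming its NDJSON reply
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream("POST", OLLAMA_URL, json={
                    "model": req["model"], "messages": context,
                    "stream": True}) as resp:
                async for line in resp.aiter_lines():
                    if not line:
                        continue
                    chunk = json.loads(line)
                    token = chunk.get("message", {}).get("content", "")
                    completion.append(token)
                    # 4. forward each completion token to the client as an SSE data line
                    yield f"data: {json.dumps({'content': token})}\n\n"
                    if chunk.get("done"):
                        break
        # 5. once the completion is complete, insert it back with the same appID
        # (error-event handling for a failed INSERT, used in the SSE error test
        # below, is omitted from this sketch)
        conn2 = await asyncpg.connect(PG_DSN)
        try:
            await conn2.execute(
                "INSERT INTO chatts (name, message, appID) VALUES ($1, $2, $3)",
                "assistant", "".join(completion), app_id)
        finally:
            await conn2.close()

    return StreamingResponse(sse(), media_type="text/event-stream")
The key points are the order of the database operations around the stream and that the accumulated completion is only written back to the database once Ollama reports it is done.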
Please click the relevant link to set up the /llmchat API using the web framework of your choice:
| Go | Python | Rust | TypeScript |
and return here to resume the server setup once you have your web framework set up.
Testing /llmchat and SSE error handling
As usual, you can test your llmchat API using either a graphical tool such as
Postman or a CLI tool such as curl. In Postman, point your POST request to
https://YOUR_SERVER_IP/llmchat and provide, as Body > raw JSON content, the same
JSON payload shown in the curl version of the example:
laptop$ curl -X POST -H "Content-Type: application/json" -H "Accept: event/stream" -d '{ "appID": "edu.umich.reactive.postman.llmChat", "model": "tinyllama", "messages": [ { "role": "user", "content": "Where is Tokyo?" } ], "stream": true }' -kL https://YOUR_SERVER_IP/llmchat
To test llmChat’s ability to provide context, ask a follow-up question
related to your earlier prompt. For example, if you prompt, “Where is Tokyo?”,
you could prompt next, “And London?”. Even gemma3:270m should be able to
reason that the second prompt asks for the location of London (UK).
If you would like to test your /llmchat API and have made the call to the /llmprep API on your
front end, but have not implemented the /llmprep API itself on your back end (see below), you
can comment out the call to the /llmprep API on your front end and rebuild your app.
Testing SSE error handling
On your back end, in the handlers source file, in the llmchat() function, search for the
statement to insert the assistant_response into the PostgreSQL database. In the SQL statement
to INSERT the assistant_response, replace 'assistant' with NULL as the name.
(Rebuild and) restart your back end.
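In SQL terms, the change amounts to turning the first INSERT below into the second; the exact statement and parameter placeholders in your handler depend on your framework, and <completion> and <appID> here are just stand-ins:
INSERT INTO chatts (name, message, appID) VALUES ('assistant', <completion>, <appID>);  -- original
INSERT INTO chatts (name, message, appID) VALUES (NULL, <completion>, <appID>);         -- modified for the test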
With your front end set up to communicate with this modified back end at YOUR_SERVER_IP,
not mada.eecs.umich.edu, submit a prompt to Ollama. The SQL operation to insert the assistant’s
completion/reply should now fail, and an SSE error event, along with its data line, should
be generated and sent to your front end. On the front end, two things should happen:
- an alert dialog box pops up with the error message, and
- the error message, prepended with **llmChat Error**, is inserted into the assistant’s text bubble.
If you see both, congratulations! Your SSE event generation and handling are working!
Again, if you have made a call to /llmprep on your front end but have not implemented it on your back end (see below), you can comment out the call to /llmprep and rebuild your front end.
TODO: /llmprep API
To give instructions to the LLM, we can simply prepend the instructions to a user prompt. With Ollama’s
chat API, we can alternatively provide such instructions to the LLM as "system" prompts. A messages
array element carrying such instructions will have its "role" set to "system", with the instructions
stored in the corresponding "content" field. Create a new API called /llmprep that allows the client to
send "system" prompts to our chatterd back end. Add this new API, allowing only the HTTP POST method, to
your routing table. We can call the handler for this new API llmprep().
Other than the function name, the function signature for llmprep() is the same as that of
llmchat()—except for the Rust version. For Rust, use the following function signature:
pub async fn llmprep(
State(appState): State<AppState>,
ConnectInfo(clientIP): ConnectInfo<SocketAddr>,
Json(ollamaRequest): Json<OllamaRequest>,
) -> Result<Json<Value>, (StatusCode, String)> { }
We assume that the /llmprep API is called only once at the start of a new conversation. To maintain a
manageable context window at the LLM, let the llmprep() handler first delete all existing entries for
the given appID in the PostgreSQL database. To delete all rows of the chatts table where the
appID column is of a given value (e.g., YOUR_APPID), you can use the following SQL command (note
the single quotes around YOUR_APPID):
DELETE FROM chatts WHERE appID = 'YOUR_APPID';
Then insert the system prompt into the database. You can consult the llmchat() handler if you’re
not sure how to do this.
Unlike in llmchat(), we don’t immediately send the system prompt to Ollama in llmprep().
Instead, we only insert the system prompt, with its associated appID, into the database. When
the user posts a subsequent user prompt using the /llmchat API, the llmchat() handler will
retrieve all database entries with the user’s appID to construct the context for the new prompt.
The system prompt inserted with /llmprep will be the first entry in the constructed context.
After you have inserted the system prompt in the database, simply log the /llmprep API call and
send back an empty JSON with HTTP status code 200 (Ok). You can consult the postchat() handler if
you’re not sure how to do this.
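As an illustration only, continuing the assumed Python stack (FastAPI and asyncpg) from the /llmchat sketch earlier, llmprep() might look roughly like the following; adapt the same logic to the framework you actually chose:
# Minimal sketch of llmprep(), using the same illustrative Python stack as the
# /llmchat sketch above; the credentials and payload handling are assumptions.
import asyncpg
from fastapi import FastAPI, Request

app = FastAPI()
PG_DSN = "postgresql://chatter:chattword@localhost/chatterdb"  # hypothetical credentials

@app.post("/llmprep")
async def llmprep(request: Request):
    req = await request.json()
    app_id = req["appID"]
    system = req["messages"][0]  # the "system" prompt sent by the client

    conn = await asyncpg.connect(PG_DSN)
    try:
        # start the conversation fresh: delete all existing entries for this appID
        await conn.execute("DELETE FROM chatts WHERE appID = $1", app_id)
        # store the system prompt so llmchat() places it first when it
        # reconstructs the context for subsequent user prompts
        await conn.execute(
            "INSERT INTO chatts (name, message, appID) VALUES ($1, $2, $3)",
            system["role"], system["content"], app_id)
    finally:
        await conn.close()

    print(f"/llmprep called for appID {app_id}")  # log the API call
    return {}  # empty JSON, sent with HTTP status code 200 (Ok)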
To test your /llmprep API, run your front end, with its call to /llmprep re-enabled,
against your own back end now implementing llmprep(). Assuming the system prompt is, “Start every assistant reply with GO
BLUE!!!”, you should see “GO BLUE!!!” prepended to all responses from your back end.
That’s all we need to do to prepare the back end. Before you return to work on your front end, wrap up your work here by submitting your files to GitHub.
TIPS:
Every time you rebuild your Go or Rust server or make changes to either of your JavaScript or Python files, you need to restart chatterd:
server$ sudo systemctl restart chatterd
If you get an HTTP error code 500 Internal Server Error, or if you just don’t know whether your HTTP request has made it to the server, the first thing to do is run sudo systemctl status chatterd on your server and study its output, including any error messages and debug printouts from your server.
server$ sudo systemctl status chatterd
is your BEST FRIEND in debugging your server.
Submitting your back end
We will only grade files committed to the main branch. If you use multiple branches, please merge
them all to the main branch for submission.
Navigate to your reactive folder:
server$ cd ~/reactive/
Commit changes to the local repo:
server$ git commit -am "llmchat back end"
and push your chatterd folder to the remote GitHub repo:
server$ git push
If git push failed due to changes made to the remote repo by your tutorial partner, you must run
git pull first. Then you may have to resolve any conflicts before you can git push again.
Go to the GitHub website to confirm that your back-end files have been uploaded to your GitHub repo.
WARNING: You will not get full credit if your front end is not set up to work with
your back end!
Leave your chatterd running until you have received your tutorial grade.
References
- Stream updates with server-sent events
- Server-Sent Events: A Comprehensive Guide
- How do server-sent events actually work?
- Server-sent events: the spec.
- “Connection: Keep-Alive” prohibited in HTTP/2 and HTTP/3
- Understanding HTTP Streaming: A Practical Guide
- From JSON to Streaming: Building an OpenAI-Compatible Proxy for Ollama with .NET
- MCP: Streamable HTTP
| Prepared by Chenglin Li, Xin Jie ‘Joyce’ Liu, and Sugih Jamin | Last updated January 17th, 2026 |