TypeScript with Express
handlers.ts
Change to your chatterd directory and edit handlers.ts:
server$ cd ~/reactive/chatterd
server$ vi handlers.ts
Add the following import to the top of the file:
import { Readable } from 'stream'
Define these three types to help llmchat() deserialize the JSON it receives from the client and from Ollama. Add these lines right below the import block:
type OllamaMessage = {
    role: string;
    content: string;
}

type OllamaRequest = {
    appID: string;
    model: string;
    messages: OllamaMessage[];
    stream: boolean;
}

type OllamaResponse = {
    model: string;
    message: OllamaMessage;
}
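For orientation only (not code to add to handlers.ts), here is a hypothetical client request body that would deserialize into an OllamaRequest; the appID value and model name below are made up for illustration:
// illustration only: a client request body matching OllamaRequest
const exampleRequest: OllamaRequest = {
    appID: "11111111-1111-1111-1111-111111111111", // made-up, client-chosen ID
    model: "llama3.2",                             // use whatever model your Ollama hosts
    messages: [{ role: "user", content: "What are server-sent events?" }],
    stream: true,                                  // ask Ollama to stream tokens back
}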
To store the client’s conversation context/history with Ollama in the PostgreSQL database, llmchat() first confirms that the client has sent an appID that can be used to tag its entries in the database. Here’s the signature of llmchat() along with its check for the client’s appID:
export async function llmchat(req: Request, res: Response) {
    let ollamaRequest: OllamaRequest = req.body
    if (!ollamaRequest.appID) {
        logClientErr(res, HttpStatus.UNPROCESSABLE_ENTITY, `Invalid appID: ${ollamaRequest.appID}`)
        return
    }
    // insert into DB
}
Once we confirm that the client has an appID, we insert its current prompt
into the database, adding to its conversation history with Ollama. Replace the
comment // insert into DB with the following code:
try {
    // insert each message into the database
    for (const msg of ollamaRequest.messages) {
        await chatterDB`INSERT INTO chatts (name, message, id, appid) \
            VALUES (${msg.role}, ${msg.content}, gen_random_uuid(), ${ollamaRequest.appID})`
    }
} catch (error) {
    logServerErr(res, `${error as PostgresError}`)
    return
}

// retrieve history
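For reference only (nothing to run here): the INSERT above and the SELECT in the next step assume the chatts table created in an earlier spec has roughly the following shape, with a time column that defaults to the insertion timestamp; the earlier spec’s actual definition is authoritative:
-- sketch of the assumed chatts schema; column types are a guess
CREATE TABLE chatts (
    name    TEXT,
    message TEXT,
    id      UUID,
    appid   TEXT,
    time    TIMESTAMPTZ DEFAULT now()
);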
Then we retrieve the client’s conversation history in chronological order by timestamp, including the just-inserted current prompt, and put it in the JSON format expected by Ollama’s chat API. Replace // retrieve history with:
// reconstruct ollamaRequest to be sent to Ollama:
// - add context: retrieve all past messages by appID,
//   incl. the one just received,
// - convert each back to OllamaMessage, and
// - insert it into ollamaRequest
try {
    ollamaRequest.messages = (await chatterDB`SELECT name, message FROM chatts WHERE appid = ${ollamaRequest.appID} ORDER BY time ASC`)
        .map(row => ({
            role: row.name,
            content: row.message
        }))
} catch (error) {
    logServerErr(res, `${error as PostgresError}`)
    return
}

// send request to Ollama
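For illustration only (nothing to add to handlers.ts): after this step, ollamaRequest.messages holds the entire conversation recorded for this appID, oldest message first. The contents below are made up:
// illustration: a possible value of ollamaRequest.messages after the SELECT
const exampleHistory: OllamaMessage[] = [
    { role: "user", content: "What are server-sent events?" },
    { role: "assistant", content: "Server-Sent Events (SSE) let a server push ..." },
    { role: "user", content: "How do they differ from WebSockets?" },  // the prompt just inserted
]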
We create an HTTP request with ollamaRequest above as its payload and send it to Ollama. Then we declare an accumulator variable, completion, to collect the reply tokens Ollama streams back. We also prime the response stream to the client by writing and flushing the SSE response headers ahead of the SSE events to follow. Replace // send request to Ollama with:
let response = await fetch(OLLAMA_BASE_URL+"/chat", {
    method: req.method,
    body: JSON.stringify(ollamaRequest),
})
if (!response.body) {
    logServerErr(res, "llmChat: Empty response body from Ollama")
    return
}

let completion = ''

res.writeHead(HttpStatus.OK, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
}).flushHeaders()

// create a stream driven by Ollama prompt completion
We next define a generator function, ndjson_yield_sse, that is fed by Ollama’s prompt completion. The fetch() API has no built-in method to read a response body line by line, so the last line in a chunk read from the response body may be a partial line. We push any such partial line back into the read buffer (the chunk variable) to be merged with the next chunk read. For each complete NDJSON line received from Ollama, we do two things: (1) we extract the Ollama response tokens and accumulate them in the completion variable above, and (2) we transform the line into an SSE line, which we yield to Node.js’s streaming library to be forwarded to the client. With the generator defined, we wrap it in a Node.js Readable stream using Readable.from() and hand that stream to the same pipeline module we used in llmprompt(). Replace // create a stream driven by Ollama prompt completion with:
async function* ndjson_yield_sse(stream: ReadableStream<Uint8Array<ArrayBufferLike>>): AsyncGenerator<string> {
    const decoder = new TextDecoder()
    const stream_reader = stream.getReader()
    let chunk = ''
    let done = false
    let value: Uint8Array<ArrayBufferLike>
    while (!done) {
        ({ done, value = new Uint8Array(0) } = await stream_reader.read())
        chunk += decoder.decode(value, { stream: !done })
        const lines = chunk.split('\n') // split('\n') discards the separator
        chunk = lines.pop() || ''       // put back any partial line
        // accumulate tokens and yield data lines
    }
    if (chunk.length > 0) {
        // there's an incomplete partial line after the stream ended!
        yield `event: error\ndata: { "error": ${JSON.stringify(`Partial line at end of stream! ${chunk}`)} }\n\n`
        //yield chunk
    }
    // insert full response into database
} // ndjson_yield_sse

// Create the readable stream from the async generator
const readableStream = Readable.from(ndjson_yield_sse(response.body as ReadableStream<Uint8Array<ArrayBufferLike>>))

// Pipe the readable stream to the HTTP response (a writable stream)
res.status(response.status)
await pipeline(readableStream, res)
For each incoming NDJSON line, we convert it into an OllamaResponse. If the conversion fails, we yield an SSE error event and move on to the next NDJSON line. Otherwise, we append the content of the OllamaResponse to the completion variable, after collapsing runs of whitespace into single spaces, and yield the line as an SSE data line. Replace // accumulate tokens and yield data lines with:
for (const line of lines) {
    try {
        // deserialize each line into OllamaResponse
        const ollamaResponse: OllamaResponse = JSON.parse(line)
        // append response token to full assistant message,
        // replacing each run of whitespace with a single space
        completion += ollamaResponse.message.content.replace(/\s+/g, ' ')
        // send NDJSON line as SSE data line
        yield `data: ${line}\n\n`
    } catch (err) {
        // JSON.stringify() the error so the SSE data field stays valid JSON
        yield `event: error\ndata: { "error": ${JSON.stringify(`${err}`)} }\n\n`
    }
}
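To make the transformation concrete, a single NDJSON line streamed by Ollama looks roughly like the first example below (field values are illustrative and the exact set of fields may vary by Ollama version); the loop above forwards it verbatim as the SSE frame that follows. The blank line terminating each frame is why we yield a trailing \n\n.
NDJSON line from Ollama (illustrative):
{"model":"llama3.2","created_at":"2026-01-18T12:00:00Z","message":{"role":"assistant","content":"Hel"},"done":false}
SSE frame yielded to the client:
data: {"model":"llama3.2","created_at":"2026-01-18T12:00:00Z","message":{"role":"assistant","content":"Hel"},"done":false}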
When we reach the end of the NDJSON stream, we insert the full Ollama response into the PostgreSQL database as the assistant’s reply. It will later be sent back to Ollama as part of subsequent prompts’ context. Replace // insert full response into database with:
if (completion) {
    // save full response to db, to form part of next prompt's history
    try {
        await chatterDB`INSERT INTO chatts (name, message, id, appid) \
            VALUES ('assistant', ${completion}, gen_random_uuid(), ${ollamaRequest.appID})`
        // replace 'assistant' with NULL to test error event
    } catch (err) {
        yield `event: error\ndata: { "error": ${JSON.stringify((err as PostgresError).toString())} }\n\n`
    }
} // full response
If we encountered any error in the insertion above, we send an SSE error event to the client.
We’re done with handlers.ts. Save and exit the file.
main.ts
Edit the file main.ts:
server$ vi main.ts
Find the initialization of app and add this route right
after the route for /llmprompt/:
.post('/llmchat/', handlers.llmchat)
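For context, the chained route registration should end up looking roughly like the sketch below; the middleware and the /llmprompt/ route shown are placeholders for whatever your existing main.ts already contains:
// sketch only; keep your existing main.ts and just add the /llmchat/ line
const app = express()
    .use(express.json())
    .post('/llmprompt/', handlers.llmprompt)  // existing route from the earlier spec
    .post('/llmchat/', handlers.llmchat)      // the new route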
We’re done with main.ts. Save and exit the file.
Build and Test run
TypeScript is a compiled language, like C/C++ and unlike JavaScript and Python, which are interpreted languages. This means you must run npx tsgo every time you make changes to your code for the changes to show up when you run node.
To build your server, transpile TypeScript into JavaScript:
server$ npx tsgo
To run your server:
server$ sudo node main.js
# Hit ^C to end the test
The back-end cover spec provides instructions on testing the llmchat API and SSE error handling.
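If you want a quick smoke test from the server’s command line before following those instructions, something along these lines should exercise the route; the URL, appID value, and model name are placeholders, -k is only needed if your server uses a self-signed certificate, and -N keeps curl from buffering the SSE stream:
server$ curl -N -k -H "Content-Type: application/json" \
    -d '{"appID":"11111111-1111-1111-1111-111111111111","model":"llama3.2","messages":[{"role":"user","content":"Hello!"}],"stream":true}' \
    https://localhost/llmchat/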
| Prepared by Chenglin Li, Xin Jie ‘Joyce’ Liu, and Sugih Jamin | Last updated January 18th, 2026 |