TypeScript with Express

handlers.ts

Change to your chatterd directory and edit handlers.ts:

server$ cd ~/reactive/chatterd
server$ vi handlers.ts

Add the following import to the top of the file:

import { Readable } from 'stream'

Define these three types to help llmchat() deserialize JSON received from the client and, later, from Ollama. Add these lines right below the import block:

type OllamaMessage = {
    role: string;
    content: string;
}

type OllamaRequest = {
    appID: string;
    model: string;
    messages: OllamaMessage[];
    stream: boolean;
}

type OllamaResponse = {
    model: string;
    message: OllamaMessage;
}
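
For context, each NDJSON line that Ollama streams back from its chat API looks roughly like the following (the field values are illustrative; extra fields such as created_at and done appear on each line but are simply dropped when we later parse a line into an OllamaResponse):

{"model":"llama3.2","created_at":"2026-01-18T12:00:00Z","message":{"role":"assistant","content":" token"},"done":false}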

To store the client’s conversation context/history with Ollama in the PostgreSQL database, llmchat() first confirms that the client has sent an appID that can be used to tag its entries in the database. Here’s the signature of llmchat() along with its check for the client’s appID:

export async function llmchat(req: Request, res: Response) {
    let ollamaRequest: OllamaRequest = req.body
    if (!ollamaRequest.appID) {
        logClientErr(res, HttpStatus.UNPROCESSABLE_ENTITY, `Invalid appID: ${ollamaRequest.appID}`)
        return
    }

    // insert into DB

}

Once we confirm that the client has an appID, we insert its current prompt into the database, adding to its conversation history with Ollama. Replace the comment // insert into DB with the following code:

    try {
        // insert each message into the database
        for (const msg of ollamaRequest.messages) {
            await chatterDB`INSERT INTO chatts (name, message, id, appid) \
                VALUES (${msg.role}, ${msg.content}, gen_random_uuid(), ${ollamaRequest.appID})`
        }
    } catch (error) {
        logServerErr(res, `${error as PostgresError}`)
        return
    }

    // retrieve history

Then we retrieve the client’s conversation history chronologically by timestamp, including the just-inserted current prompt, and put it in the JSON format expected by Ollama’s chat API. Replace // retrieve history with:

    // reconstruct ollamaRequest to be sent to Ollama:
    // - add context: retrieve all past messages by appID,
    //   incl. the one just received,
    // - convert each back to OllamaMessage, and
    // - insert it into ollamaRequest
    try {
        ollamaRequest.messages = (await chatterDB`SELECT name, message FROM chatts WHERE appid = ${ollamaRequest.appID} ORDER BY time ASC`)
            .map(row => ({
                role: row.name,
                content: row.message
            }))
    } catch (error) {
        logServerErr(res, `${error as PostgresError}`)
        return
    }

    // send request to Ollama

We create an HTTP request with ollamaRequest above as its payload and send it to Ollama. Then we declare an accumulator variable, completion, to accumulate the reply tokens Ollama streams back. We also prime the response stream to the client by sending the SSE response headers up front, before any events are streamed. Replace // send request to Ollama with:

    let response = await fetch(OLLAMA_BASE_URL+"/chat", {
        method: req.method,
        body: JSON.stringify(ollamaRequest),
    })
    if (!response.body) {
        logServerErr(res, "llmChat: Empty response body from Ollama")
        return
    }

    let completion = ''
    
    res.writeHead(HttpStatus.OK, {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
    }).flushHeaders()   

    // create a stream driven by Ollama prompt completion
    

We next create a generator function ndjson_yield_sse that is fed by Ollama’s prompt completion. The fetch() API has no built-in way to read a response body line by line; instead, the last line of each chunk read from the response body may be a partial line. We push such a partial line back into the read buffer (the chunk variable) to be merged with the next chunk read. Upon receiving each complete NDJSON line from Ollama, we do two things: (1) we extract the Ollama response tokens and accumulate them in the completion variable above, and (2) we transform the line into an SSE line, which we yield to Node.js’s streaming library to be forwarded to the client. With the generator defined, we wrap ndjson_yield_sse in a Node.js Readable stream (using Readable.from()) and hand it to the same Node.js pipeline module we used in llmprompt(). Replace // create a stream driven by Ollama prompt completion with:

    async function* ndjson_yield_sse(stream: ReadableStream<Uint8Array<ArrayBufferLike>>): AsyncGenerator<string> {
        const decoder = new TextDecoder()
        const stream_reader = stream.getReader()
        let chunk = ''
        let done = false
        let value: Uint8Array<ArrayBufferLike>
        
        while (!done) {
            ({ done, value = new Uint8Array(0) } = await stream_reader.read())

            chunk += decoder.decode(value, { stream: !done })

            const lines = chunk.split('\n') // split('\n') discards the separator
            chunk = lines.pop() || ''       // put back any partial line

            // accumulate tokens and yield data lines

        }
        if (chunk.length > 0) {
            // there's an incomplete partial line after the stream ended!
            yield `event: error\ndata: { "error": ${JSON.stringify(`Partial line at end of stream! ${chunk}`)} }\n\n`
            //yield chunk
        }

        // insert full response into database
            
    } // ndjson_yield_sse

    // Create the readable stream from the async generator
    const readableStream = Readable.from(ndjson_yield_sse(response.body as ReadableStream<Uint8Array<ArrayBufferLike>>))

    // Pipe the readable stream to the HTTP response (a writable stream)
    res.status(response.status)
    await pipeline(readableStream, res)

For each incoming NDJSON line, we convert it into an OllamaResponse type. If the conversion is unsuccessful, we yield an SSE error event and move on to the next NDJSON line. Otherwise, we append the content of the OllamaResponse to the completion variable, after collapsing any run of whitespace into a single space, and yield the line as an SSE Message data line. Replace // accumulate tokens and yield data lines with:

            for (const line of lines) {
                try {
                    // deserialize each line into OllamaResponse
                    const ollamaResponse: OllamaResponse = JSON.parse(line)

                    // append response token to full assistant message
                    // replace all multiple whitespaces with single whitespace
                    completion += ollamaResponse.message.content.replace(/\s+/g, ' ')

                    // send NDJSON line as SSE data line
                    yield `data: ${line}\n\n`
                } catch (err) {
                    yield `event: error\ndata: { "error": ${JSON.stringify(String(err))} }\n\n`
                }
            }

When we reach the end of the NDJSON stream, we insert the full Ollama response into the PostgreSQL database as the assistant’s reply. It will later be sent back to Ollama as part of subsequent prompts’ context. Replace // insert full response into database with:

        if (completion) {
            // save full response to db, to form part of next prompt's history
            try {
                await chatterDB`INSERT INTO chatts (name, message, id, appid) \
                  VALUES ('assistant', ${completion}, gen_random_uuid(), ${ollamaRequest.appID})`
                // replace 'assistant' with NULL to test error event
            } catch (err) {
                yield `event: error\ndata: { "error": ${JSON.stringify((err as PostgresError).toString())} }\n\n`
            }
        } // full response

If we encounter any error in the insertion above, we send an SSE error event to the client.
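
For reference, here is roughly what the two kinds of SSE events emitted by llmchat() look like on the wire (the payloads are illustrative): a data event carrying one Ollama NDJSON line, and an error event:

data: {"model":"llama3.2","message":{"role":"assistant","content":" token"},"done":false}

event: error
data: { "error": "<description of the error>" }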

We’re done with handlers.ts. Save and exit the file.

main.ts

Edit the file main.ts:

server$ vi main.ts

Find the initialization of app and add this route right after the route for /llmprompt/:

      .post('/llmchat/', handlers.llmchat)
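
For reference, the chained registration might end up looking roughly like the following sketch; the exact middleware and surrounding routes come from the earlier specs and may differ in your main.ts:

const app = express()
    .use(express.json())
    .post('/llmprompt/', handlers.llmprompt)
    .post('/llmchat/', handlers.llmchat)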

We’re done with main.ts. Save and exit the file.

Build and Test run

:point_right:TypeScript is a compiled language, like C/C++ and unlike JavaScript and Python, which are interpreted languages. This means you must run npx tsgo each and every time you make changes to your code, for the changes to show up when you run node.

To build your server, transpile TypeScript into JavaScript:

server$ npx tsgo

To run your server:

server$ sudo node main.js
# Hit ^C to end the test

The cover back-end spec provides instructions on Testing llmChat API and SSE error handling.
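
For a quick smoke test from the command line before wiring up a front end, something along these lines should stream SSE events back to your terminal (the URL, appID, and model below are placeholders; adjust them to your setup):

server$ curl -N -X POST https://localhost/llmchat/ \
         -H "Content-Type: application/json" \
         -d '{ "appID": "00000000-0000-0000-0000-000000000000", "model": "llama3.2", "messages": [ { "role": "user", "content": "Hello!" } ], "stream": true }'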

Prepared by Chenglin Li, Xin Jie ‘Joyce’ Liu, and Sugih Jamin. Last updated January 18th, 2026.