Building an AI chat feature has a deceptively simple shape. There is a box where the user types, and there is an area where answers appear. But anyone who has shipped an AI chat UI past the first demo knows the difficulty lives in the seams: the moment text becomes a request, the moment tokens start streaming back, the moment a user wants to stop a runaway response. This article walks through designing a real chatbot interface in React, from the empty text box to streaming chat, and shows where the input layer and the model runtime should meet.
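It helps to draw a clean line through the system. An AI chat interface is really two cooperating machines: an input layer, where the user composes a message, and a model runtime, which turns that message into a streamed response.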
On the runtime side, the Vercel AI SDK has become the default in the React ecosystem. Its useChat hook manages messages, request state, and the streaming transport so you are not hand-rolling fetch readers and partial-JSON parsing. What it deliberately does not do is take a position on your input. The text box is yours to build, which is exactly where most teams either ship a bare textarea or over-invest in a document editor.
Before a single token streams, the empty composer is doing real work. It sets expectations. A good AI chat input invites the first message, suggests what the tool can do, and quietly supports the power-user features people now expect from an LLM product, such as / commands and @ mentions that pull in context.
This is the layer where Prompt Area fits. It is a production-grade contentEditable input purpose-built for prompt-style and chat composers, with no ProseMirror, Slate, or Lexical underneath, just React. It gives you all of the above out of the box, and crucially, it exposes a typed data model instead of a raw string. Your value is a Segment[] array: plain runs are TextSegments, while resolved mentions and commands are ChipSegments that carry their trigger, value, and any structured data you attached. That distinction is what makes the input layer talk cleanly to the runtime layer.
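To make the data model concrete, here is a minimal sketch of what a Segment[] value might look like. The type shapes and the flattenSegments helper are assumptions written for illustration, not the actual prompt-area exports; in particular, rendering a chip as trigger plus value is a guess at how segmentsToPlainText behaves.

```typescript
// Hypothetical shapes for illustration; the real prompt-area types may differ.
type TextSegment = { type: 'text'; text: string }
type ChipSegment = {
  type: 'chip'
  trigger: string // e.g. '@' or '/'
  value: string // e.g. 'alice' or 'summarize'
  data?: unknown // any structured payload attached to the chip
}
type Segment = TextSegment | ChipSegment

// Stand-in for segmentsToPlainText. Assumption: a chip flattens to trigger + value.
function flattenSegments(segments: Segment[]): string {
  return segments
    .map((s) => (s.type === 'text' ? s.text : `${s.trigger}${s.value}`))
    .join('')
}

const draft: Segment[] = [
  { type: 'chip', trigger: '/', value: 'summarize' },
  { type: 'text', text: ' the notes from ' },
  { type: 'chip', trigger: '@', value: 'alice', data: { userId: 42 } },
]

const flat = flattenSegments(draft)
console.log(flat) // "/summarize the notes from @alice"
```

The flattened string is what the model reads. The chips, including the userId attached to the mention, remain available as data for anything that needs more than the text.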
Here is the part that keeps the architecture honest. Prompt Area owns the input; the AI SDK's useChat owns the stream. They do not need a tangle of shared state between them; they meet at a single line. When the user submits, you flatten the segments to plain text and hand it to sendMessage:
sendMessage({ text: segmentsToPlainText(segments) })
That is the whole contract on the happy path. The composer does not know about streaming; the chat runtime does not know about chips and triggers. Each side stays simple, and the boundary is one function call. Wiring it up looks like this:
    import { useChat } from '@ai-sdk/react'
    import {
      PromptArea,
      usePromptAreaState,
      segmentsToPlainText,
      getChipsByTrigger,
    } from 'prompt-area'

    function Chat() {
      const { messages, sendMessage, status, stop } = useChat()
      const state = usePromptAreaState()

      const handleSubmit = (segments) => {
        const text = segmentsToPlainText(segments).trim()
        if (!text || status === 'streaming') return

        // typed chips become structured context, not parsed strings
        const mentions = getChipsByTrigger(segments, '@').map((c) => c.value)
        const command = getChipsByTrigger(segments, '/')[0]?.value

        sendMessage({ text }, { body: { mentions, command } })
        state.clear()
      }

      return (
        <div>
          <MessageList messages={messages} />
          <PromptArea
            state={state}
            onSubmit={handleSubmit}
            placeholder={['Ask anything…', 'Type / for commands', 'Use @ to add context']}
          />
          <SendButton status={status} onStop={stop} />
        </div>
      )
    }
Notice what happened in that handler. Because chips are typed segments rather than substrings, you do not regex the prompt to find what the user mentioned. You read it directly with getChipsByTrigger() and pass it as structured fields in the request body, whether that is mentions, a selected command, a chosen model, or whatever your app needs, right alongside the prompt text.
This matters more than it first appears. String parsing on the server is where AI chat backends quietly accumulate bugs: a username with a space, a slash inside a code block, an @ in an email address. When the structure travels as data instead of being re-derived from text, that entire category of ambiguity disappears. The prompt stays the prompt; the context stays the context.
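A small sketch shows the failure mode. It compares a naive regex over the flattened prompt with a direct read of the chips. The segment shape and both helper functions are hypothetical stand-ins, not prompt-area APIs.

```typescript
// Hypothetical segment shape for illustration; not the real prompt-area types.
type Segment =
  | { type: 'text'; text: string }
  | { type: 'chip'; trigger: string; value: string }

// Naive server-side approach: re-derive mentions from the flattened prompt.
function mentionsByRegex(prompt: string): string[] {
  const hits = prompt.match(/@\w+/g) ?? []
  return hits.map((hit) => hit.slice(1))
}

// Structured approach: read the chips the composer already resolved.
function mentionsFromChips(segments: Segment[]): string[] {
  const out: string[] = []
  for (const s of segments) {
    if (s.type === 'chip' && s.trigger === '@') out.push(s.value)
  }
  return out
}

const segments: Segment[] = [
  { type: 'text', text: 'Email bob@example.com and loop in ' },
  { type: 'chip', trigger: '@', value: 'Ana Lima' },
]
const prompt = 'Email bob@example.com and loop in @Ana Lima'

const guessed = mentionsByRegex(prompt)
const actual = mentionsFromChips(segments)

console.log(guessed) // [ 'example', 'Ana' ]
console.log(actual) // [ 'Ana Lima' ]
```

The regex both invents a mention out of the email address and truncates the real one at the space. The chip read has neither problem because it never looks at the text.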
On the server, the body is just another input you validate. Parse it with Zod, allowlist your model IDs and commands so a client cannot ask for an arbitrary model, then hand the messages to the SDK:
    import { streamText, convertToModelMessages } from 'ai'
    import { anthropic } from '@ai-sdk/anthropic'
    import { z } from 'zod'

    const bodySchema = z.object({
      messages: z.array(z.any()),
      mentions: z.array(z.string()).optional(),
      command: z.enum(['summarize', 'translate']).optional(),
    })

    export async function POST(req) {
      const parsed = bodySchema.safeParse(await req.json())
      if (!parsed.success) return new Response('Bad request', { status: 400 })

      const result = streamText({
        model: anthropic('claude-sonnet-4-5'),
        messages: convertToModelMessages(parsed.data.messages),
      })

      return result.toUIMessageStreamResponse()
    }
The pipeline reads top to bottom: safeParse the body, convertToModelMessages() to normalize the history, streamText({ model }) to start generation, and toUIMessageStreamResponse() to stream UI-ready chunks back to useChat. Because the model is just an argument, the whole thing is provider-agnostic. Swap @ai-sdk/anthropic for @ai-sdk/openai or @ai-sdk/google without touching the input layer at all. Install the runtime with npm install ai @ai-sdk/react @ai-sdk/anthropic zod.
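The route above hardcodes its model, so the model allowlist mentioned earlier is not shown. As a rough, dependency-free sketch of that idea, assume the client sends an optional model field in the request body. That field, the second model ID, and the pickModel helper are all illustrative assumptions; the same check could equally be expressed as a z.enum in the schema.

```typescript
// Illustrative allowlist; substitute the model IDs your app actually offers.
const ALLOWED_MODELS = ['claude-sonnet-4-5', 'gpt-4o-mini'] as const
type AllowedModel = (typeof ALLOWED_MODELS)[number]

const DEFAULT_MODEL: AllowedModel = 'claude-sonnet-4-5'

// Returns the model to use, or undefined when the client asked for one
// that is not on the list (the caller would answer with a 400).
function pickModel(requested: unknown): AllowedModel | undefined {
  if (requested === undefined) return DEFAULT_MODEL
  return ALLOWED_MODELS.find((id) => id === requested)
}

console.log(pickModel(undefined)) // "claude-sonnet-4-5"
console.log(pickModel('gpt-4o-mini')) // "gpt-4o-mini"
console.log(pickModel('some-unlisted-model')) // undefined
```

Rejecting an unknown ID rather than silently falling back is a design choice: it surfaces client bugs early, at the cost of a stricter contract.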
Streaming changes what the primary button should do. While a response is generating, "send" is the wrong affordance; the user wants "stop." This is where the two layers cooperate most visibly, and the AI SDK gives you exactly the signal you need: useChat exposes a status that moves through ready, submitted, streaming, and error.
Drive the action bar off that single value. While the status is submitted or streaming, the send button becomes a stop button wired to stop(); otherwise it sends. Reflecting submitted and error in the same component gives users an honest read on the system at all times.
    function SendButton({ status, onStop }) {
      const isStreaming = status === 'streaming' || status === 'submitted'
      return isStreaming
        ? <button onClick={onStop} aria-label="Stop generating">■ Stop</button>
        : <button type="submit" aria-label="Send message">↑ Send</button>
    }
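One way to keep the whole action bar consistent is to derive every flag from status in a single pure function and render from the result. The sketch below assumes the four status values listed earlier; the ActionBarState shape and the draftIsEmpty parameter are hypothetical names introduced for illustration.

```typescript
// The four status values described above for useChat.
type ChatStatus = 'ready' | 'submitted' | 'streaming' | 'error'

// Hypothetical view model for the action bar.
type ActionBarState = {
  primary: 'send' | 'stop' // which button to render
  canSend: boolean // whether the send button is enabled
  showError: boolean // whether to surface an error hint
}

function actionBarState(status: ChatStatus, draftIsEmpty: boolean): ActionBarState {
  // A request is in flight from submission until the stream ends.
  const inFlight = status === 'submitted' || status === 'streaming'
  return {
    primary: inFlight ? 'stop' : 'send',
    canSend: !inFlight && !draftIsEmpty,
    showError: status === 'error',
  }
}

console.log(actionBarState('streaming', false))
// { primary: 'stop', canSend: false, showError: false }
console.log(actionBarState('error', false))
// { primary: 'send', canSend: true, showError: true }
```

Because the function has no other inputs, the send, stop, and disabled states cannot drift apart, and the mapping can be unit tested without rendering anything.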
The trick to a calm AI chat UI is letting one source of truth, the SDK's status, drive every piece of the action bar, so send, stop, and disabled states can never disagree with what the runtime is actually doing.

Put the pieces together and the shape of a good chatbot interface emerges: a composer that handles the messy human side of input, a runtime that handles the streaming side of generation, and a one-line seam between them. The input layer never blocks on the network; the runtime never parses prose. Each can evolve independently.
If you would rather start from a finished look than assemble the action bar yourself, Prompt Area publishes ready-made Styles at prompt-area.com/styles. These are copy-paste compositions that recreate familiar agent composers like the ChatGPT pill with its model selector, the Claude card with its coral send button and suggested-prompt chips, and a Claude Code-style composer with repo context and model and reasoning-effort selectors. They are built from the same component plus Action Bar and Status Bar companions, so they slot onto the AI SDK runtime with the same single sendMessage call.
You can also try the whole interaction live before adopting anything. The in-browser demo at prompt-area.com/docs/try-it-live is a full Vite + React app where you can type, fire slash commands, paste a screenshot, and watch the composer behave the way it will in your product.
The lasting lesson of building AI chat UIs is to resist merging the two layers. It is tempting to let the composer reach into chat state, or to let the chat runtime own the text input, because in a quick prototype they feel like one thing. They are not. Keep the input layer focused on producing clean, structured intent, meaning text plus typed chips, and keep the runtime focused on turning that intent into a stream. Connect them with sendMessage({ text: segmentsToPlainText(segments) }), drive the buttons from status, and pass context as a typed body. Do that, and the empty text box and the streaming answer end up feeling like parts of one coherent product rather than two systems duct-taped together.