Agent Completion APIs: Picking the Right Endpoint for the Job
Overview
There are several different ways to ask a ChatBotKit agent to produce a response, and they sit at different points along two axes: how much state the platform keeps for you, and how the response is delivered. The same logical operation - "have the agent answer something" - has a stateless variant and a stateful variant, a streaming variant and a request/response variant, an interactive variant and a background variant.
The result is a small matrix of endpoints that each fit a specific shape of integration. A chat widget on a marketing site has very different requirements from a nightly summarisation job, and both have different requirements from a webhook handler that needs to react to an inbound event in under a second. Picking the right endpoint up front saves you from working around the wrong one later.
This guide walks through the conversation completion endpoints in their five common shapes, then covers the two background-job entry points that sit alongside them - task triggers and trigger integration invokes - and explains where each one fits.
The Two Axes
Before going through the endpoints individually it helps to be precise about the two axes they sit on, because every endpoint name in this guide is a point in this 2D space.
Stateless vs stateful
A stateless completion does not keep any platform-side record of the exchange. You send the full input - backstory, dataset reference, message history if you want one - and the platform returns a response. There is no conversationId. Nothing is stored after the call returns. If you want history, you maintain it yourself and replay it on the next call.
A stateful completion runs against a conversationId. The platform owns the message history, the session state, the contact link, the per-conversation metadata, and the turn-taking. You send the next user message; the platform appends it to the conversation, runs the agent against the accumulated history, appends the assistant's reply, and returns it.
Stateless is the right shape when your application is the source of truth for the conversation. Stateful is the right shape when the platform should be the source of truth - which is almost always the case for production assistants where you want the Inbox, conversation analytics, audit logs, and contact attribution to work without extra glue.
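The difference shows up directly in the request payloads. The following sketch contrasts the two shapes; the field names (backstory, messages, conversationId) are illustrative assumptions, not the platform's exact schema:

```python
# Sketch of the two request shapes. Field names are assumptions for
# illustration; check the API reference for the exact schema.

def build_stateless_request(backstory: str, history: list, text: str) -> dict:
    """Stateless: the client owns the history and replays it on every call."""
    return {
        "backstory": backstory,
        # The full accumulated history travels with every request.
        "messages": history + [{"type": "user", "text": text}],
    }

def build_stateful_request(conversation_id: str, text: str) -> dict:
    """Stateful: the platform owns the history; only the new message travels."""
    return {
        "conversationId": conversation_id,  # platform looks up history by ID
        "text": text,
    }

# With the stateless shape, the client-side history grows turn by turn.
history: list = []
req = build_stateless_request("You are a helpful agent.", history, "Hi!")
```

Note that the stateless payload grows with the conversation, which is one practical reason long-lived assistants tend toward the stateful shape.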
Streaming vs request/response
A streaming call returns tokens, partial messages, function calls, and the final result as they happen, over a long-lived connection. The user sees output appearing word by word; tool calls surface as they execute; the client can render progress.
A request/response call returns one JSON payload at the end. The whole completion runs server-side, and the client gets a single result. Simpler to consume, but no progress signal until it finishes.
The conversation endpoints described in the next section support both modes, with one exception called out below (dispatch, which streams only through a separate channel subscription). The streaming choice is otherwise independent of the stateless/stateful axis - each endpoint can be consumed either way.
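To make the streaming mode concrete, here is a sketch of consuming a streamed response. It parses newline-delimited JSON events from any line iterator; the event shape (a type field plus text) is an assumption for illustration, and the actual wire format may differ:

```python
import json
from typing import Iterable, Iterator

def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Parse newline-delimited JSON events from a streaming body.

    Hypothetical event shape: {"type": "token" | "message", "text": ...}.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# Simulated stream: tokens arrive one by one, then a final message event.
fake_stream = [
    '{"type": "token", "text": "Hel"}',
    '{"type": "token", "text": "lo"}',
    '{"type": "message", "text": "Hello"}',
]

# A client would render the partial text as tokens arrive.
partial = "".join(e["text"] for e in iter_events(fake_stream) if e["type"] == "token")
```

A request/response consumer skips all of this: it reads one JSON body at the end and is done.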
The Five Conversation Endpoints
The /v1/conversation namespace exposes five different methods, each pinned to a specific point on those axes. They are all "completions" in the broad sense, but the surface area they expose is quite different.
Stateless completion
A one-shot stateless completion. You pass the full input - backstory, dataset, messages array, model parameters - and get back the assistant response. Supports both streaming and request/response.
This is the right endpoint when:
- Your application owns the conversation history and you do not want platform-side persistence.
- The interaction is genuinely one-off (a "summarise this email" call, a one-shot classification, a one-time tool invocation).
- You are integrating the platform as a model provider and the rest of your system already does conversation tracking.
What you give up by using it: nothing accumulates on the platform side. The conversation will not appear in the Inbox, will not be linked to a contact, and will not be visible in any of the conversation analytics or audit surfaces. If you later want to ask "what did this user discuss with the agent last Tuesday?" the answer is in your storage, not ours.
Stateful, append a user message
The first half of a stateful completion. It takes a user message, appends it to the conversation identified by conversationId, and returns. It does not run the agent. Supports streaming and request/response (the streaming variant exists mostly for symmetry with receive).
The reason this is split out as its own endpoint is that there are integration shapes where the act of recording a user message is decoupled from the act of producing an agent reply. A typing indicator, a multi-message burst from a user that should be batched before the agent responds, a moderation pass that runs between message storage and agent execution - all of those are easier to build when "store the message" and "produce a response" are separate operations.
Stateful, run the agent
The second half. It runs the agent against the current conversation state and returns the next assistant message. Supports streaming and request/response.
Together with send, this gives you fine-grained control over the turn cycle. You can store a user message, do something else (run a moderation pass, fan out the message to multiple downstream systems, wait for a human approval), and then ask the agent to reply. You can also call receive without a prior send if you want the agent to speak first - this is how you produce a proactive opening message at the start of a conversation.
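The split turn cycle can be sketched as follows. The client object and the moderation rule are hypothetical stand-ins; the point is the control flow - store first, decide, then (maybe) ask the agent to reply:

```python
def moderate(text: str, blocked_words: set) -> bool:
    """Toy moderation pass standing in for a real policy check."""
    return not any(word in text.lower() for word in blocked_words)

def handle_turn(client, conversation_id: str, text: str):
    """Store the user message, moderate it, and only then run the agent."""
    client.send(conversation_id, text)        # first half: record the message
    if not moderate(text, {"spam"}):
        return None                           # message stored, but no agent reply
    return client.receive(conversation_id)    # second half: generate a reply

class FakeClient:
    """Stub with the same two-call shape as the send/receive pair."""
    def __init__(self):
        self.stored = []
    def send(self, cid, text):
        self.stored.append(text)
    def receive(self, cid):
        return f"reply to: {self.stored[-1]}"

client = FakeClient()
reply = handle_turn(client, "conv_1", "Hello there")
```

The same structure accommodates batching or human approval: anything can run between the send and the receive, because the platform has already durably recorded the user message.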
Stateful, send and receive in one call
The combined operation. Equivalent to a send immediately followed by a receive, against the same conversationId, in a single call. Supports streaming and request/response.
This is the endpoint most chat UIs end up using, because the typical chat loop is exactly "user said something, please reply" and there is no useful work to do between the two halves. It saves a round-trip and keeps the streaming connection contiguous over the whole turn.
The split between complete and the send + receive pair is purely about whether you need to interpose logic between the user message landing and the agent running. If you do not, use complete. If you do, split it.
Stateful, fire-and-forget background completion
The same logical operation as complete, but the request returns immediately while the completion runs asynchronously on the platform. The response gives you a channelId you can subscribe to over POST /api/v1/channel/{channelId}/subscribe if you want to stream the result back later. If you do not subscribe, the completion still runs and the result lands in the conversation history; the channel exists purely for live progress.
There is also a stateless variant at POST /api/v1/conversation/dispatch, which has the same shape but runs against a stateless input the way /conversation/complete does.
Dispatch is the right tool when:
- The completion is going to take long enough that holding an HTTP connection open is awkward (multi-step tool use, large dataset retrieval, deep evaluations, anything with a chance of running for tens of seconds or more).
- The client may not be able to hold a connection open at all - a mobile app that backgrounds, a webhook handler that has a tight timeout, a serverless function with a hard execution limit.
- You want to fire the completion off in response to an event and let it run, with the result simply appearing in the conversation history (and in any downstream surfaces, like the Inbox) once it finishes.
The trade-off is that there is no synchronous return value. If you need the assistant's reply in the same HTTP response, dispatch is the wrong endpoint; use complete instead.
Note that dispatch does not support streaming the response back from the original request - the request returns a channelId and that is it. The streaming happens, separately, over the channel subscription.
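A sketch of the fire-and-forget flow, with the HTTP layer stubbed out so the shape is visible. The URL follows the path quoted above; the channelId response field is an assumption:

```python
def dispatch_completion(post, conversation_id: str, text: str) -> str:
    """Fire the completion and return the channelId without waiting.

    `post` is any callable (url, payload) -> dict, so the HTTP client
    can be swapped for a stub in tests.
    """
    resp = post(f"/api/v1/conversation/{conversation_id}/dispatch", {"text": text})
    return resp["channelId"]  # assumed response field

def fake_post(url, payload):
    # Stand-in for a real HTTP client: pretend the platform queued the work.
    assert url.endswith("/dispatch")
    return {"channelId": "chan_123"}

channel_id = dispatch_completion(fake_post, "conv_42", "Summarise yesterday")
# Later (optional): subscribe on the channel for live progress; otherwise the
# result simply appears in the conversation history when the run finishes.
```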
Summary table
| Endpoint | State | Combines send + receive? | Streaming? | Request/response? | Synchronous result? |
|---|---|---|---|---|---|
| /conversation/complete | stateless | n/a (full input each time) | yes | yes | yes |
| /conversation/{id}/send | stateful | no - appends user message only | yes | yes | yes |
| /conversation/{id}/receive | stateful | no - generates assistant message only | yes | yes | yes |
| /conversation/{id}/complete | stateful | yes | yes | yes | yes |
| /conversation/{id}/dispatch | stateful | yes (same input as complete) | via channel subscription | request returns channelId | no |
Decision Tree for Conversation Endpoints
A short version of how to pick:
- Does the platform need to keep the conversation history?
  - No → /conversation/complete. Done.
  - Yes → continue.
- Does the caller need the assistant's reply in the same HTTP response?
  - No, fire-and-forget is fine → /conversation/{id}/dispatch.
  - Yes → continue.
- Is there work to do between recording the user's message and running the agent?
  - No → /conversation/{id}/complete.
  - Yes → /conversation/{id}/send, do the in-between work, then /conversation/{id}/receive.
- Is this a proactive opening message with no user input first?
  - Yes → /conversation/{id}/receive.
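The same decision tree, expressed as a small function for concreteness. The boolean flags mirror the questions above; the returned strings are the endpoint paths:

```python
def pick_conversation_endpoint(
    platform_keeps_history: bool,
    needs_reply_in_response: bool = True,
    work_between_send_and_receive: bool = False,
    proactive_opening: bool = False,
) -> str:
    """Walk the decision tree and return the matching endpoint path."""
    if not platform_keeps_history:
        return "/conversation/complete"
    if not needs_reply_in_response:
        return "/conversation/{id}/dispatch"
    if proactive_opening:
        return "/conversation/{id}/receive"
    if work_between_send_and_receive:
        return "/conversation/{id}/send then /conversation/{id}/receive"
    return "/conversation/{id}/complete"

# The common chat-UI case lands on the combined stateful endpoint.
assert pick_conversation_endpoint(platform_keeps_history=True) == "/conversation/{id}/complete"
```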
This covers most cases. The places where it gets more interesting are when the work the agent does is part of a larger background pipeline - at which point the question stops being "which conversation endpoint" and starts being "should this even be a conversation completion at all". That is what the next section covers.
Background Jobs: Three Different Shapes
/conversation/{id}/dispatch is one way to run a completion in the background, but it is not the only background-execution surface on the platform. There are two others, and they exist because "run something asynchronously" actually covers three meaningfully different patterns:
- Background completion of a conversation. A specific conversation needs to advance, but you do not want to wait. Use /conversation/dispatch (stateless) or /conversation/{id}/dispatch (stateful).
- Run a configured task on demand. A task is a saved, named, schedulable unit of agent work - a configured bot, instructions, contact, session settings - that normally runs on a cron-like schedule. Sometimes you want to run it now. Use /task/{id}/trigger.
- Invoke a configured trigger integration on demand. A trigger integration is a webhook-style integration that is normally driven by external events. Sometimes you want to fire one manually for testing or to force execution outside of the normal event flow. Use /integration/trigger/{id}/invoke.
The endpoints look superficially similar - all three are POST requests that return quickly while real work happens in the background - but the resources behind them are different, and so are the situations they fit.
Triggering a task
A task is a first-class platform resource. It bundles together a bot, an optional contact, instructions, session settings, optional metadata, and a schedule. Tasks are how you run agent work on a regular cadence (every hour, every weekday at 9am, on a cron expression) without writing scheduling infrastructure of your own.
/task/{id}/trigger runs an existing task immediately, outside of its schedule, with all the same configuration the scheduled run would use. The response confirms the task was queued; the actual execution happens in the background and creates a conversation in the platform exactly as a scheduled run would. The next scheduled run is unaffected.
Use this when:
- You have configured work that you want to be able to invoke on demand and on a schedule, with a single resource as the source of truth for what it does.
- You want the audit trail, the conversation history, and the per-task analytics that come with the task resource.
- You want to fire it from a webhook or an internal admin action without re-specifying what it does on every invocation.
The pattern this fits is "saved agent job, run it now". If your operation has a meaningful name, has consistent configuration, and might also need to run on a schedule one day, model it as a task and trigger it.
Invoking a trigger integration
A trigger integration is a different kind of resource. It is an event-driven entry point - a configured bot wired up behind a webhook URL - designed to receive external events (POST /api/v1/integration/trigger/{id}/event) and produce an agent response per event. Trigger integrations are typically the inbound side of an automation: an external system fires an event, the integration runs the bot against it, the result flows wherever the integration is configured to send it.
/integration/trigger/{id}/invoke is the manual counterpart. It executes the trigger integration as if an event had arrived, without an external payload. It is primarily useful for:
- Testing. Verifying that the trigger fires the right bot with the right settings, before wiring up the external event source.
- Forcing execution. Running the integration on demand - for example, from an admin tool - without setting up the external event path.
- Debugging. Reproducing trigger behaviour without external dependencies.
In production, the bulk of trigger integration traffic should go through the /event endpoint with real payloads. invoke is the convenience door for the non-event cases.
How dispatch, task trigger, and trigger invoke compare
All three are background entry points, but they differ in what the durable resource looks like.
| Aspect | /conversation/{id}/dispatch | /task/{id}/trigger | /integration/trigger/{id}/invoke |
|---|---|---|---|
| What is the configured resource? | A conversation | A task (bot + instructions + schedule) | A trigger integration (bot + webhook config) |
| Where does the configuration live? | In the request body | In the task resource | In the trigger integration resource |
| Normal trigger source | Application code, on demand | The task's schedule | External webhook events |
| Manual on-demand invocation | The endpoint is the on-demand path | /task/{id}/trigger (this endpoint) | /integration/trigger/{id}/invoke (this endpoint) |
| Result delivery | Conversation history; optional channel for live progress | Conversation history attached to the task | Conversation history attached to the trigger |
| Best for | A specific conversation that needs to advance asynchronously | Saved, schedulable agent jobs | Event-driven automation (with invoke as the manual hatch) |
The way to read this table is that the three endpoints are not interchangeable - they correspond to three different long-lived resource types, and the question is which resource type fits the work you are modelling.
Picking between them
A short heuristic:
- The work is a one-off completion against a specific conversation. Use /conversation/{id}/dispatch. There is no benefit to creating a task or a trigger integration just to run something once.
- The work has a name, consistent configuration, and might run on a schedule. Model it as a task. Use the schedule for normal runs and /task/{id}/trigger for on-demand runs.
- The work is fundamentally event-driven - an external system needs to call into the platform with a payload. Model it as a trigger integration. Use /event for production traffic and /integration/trigger/{id}/invoke for manual testing.
- You are not sure yet. Start with the lightest option that works. A dispatch is cheaper to retire than a task, and a task is cheaper to retire than a trigger integration with external systems wired into it.
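The heuristic reduces to a small lookup, sketched here. The work_shape labels are informal names for the three patterns above, not platform terminology:

```python
def pick_background_entry_point(work_shape: str) -> str:
    """Map the shape of the work to the matching background endpoint.

    work_shape: "one_off" | "named_scheduled" | "event_driven"
    """
    routes = {
        "one_off": "/conversation/{id}/dispatch",
        "named_scheduled": "/task/{id}/trigger",
        # Production event traffic goes to /event; invoke is for manual runs.
        "event_driven": "/integration/trigger/{id}/event",
    }
    return routes[work_shape]
```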
Worked Examples
Three short examples to illustrate the shape of the choice.
Example 1: a chat widget on a marketing site
A visitor lands on the page, opens a chat widget, types a message, sees the assistant's reply stream in. The conversation should be visible in the Inbox afterwards so the support team can review it.
- Stateful: yes, the platform should own the history.
- Background: no, the user is staring at the widget and expects the reply now.
- Send/receive split: no, every turn is "user said something, agent replies".
Use /conversation/{id}/complete with streaming enabled. On the very first turn, create the conversation; on every subsequent turn, call complete against its ID.
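In code, the widget's turn loop only needs to know whether a conversation exists yet. The client methods here are hypothetical stubs; the shape is "create once, then complete per turn":

```python
class ChatWidgetSession:
    """Tracks one visitor's conversation: create on first turn, then complete."""

    def __init__(self, client):
        self.client = client
        self.conversation_id = None

    def turn(self, text: str) -> str:
        if self.conversation_id is None:
            # First turn: create the stateful conversation once.
            self.conversation_id = self.client.create_conversation()
        # Every turn is "user said something, agent replies" in one call.
        return self.client.complete(self.conversation_id, text)

class FakeClient:
    """Stub standing in for the real API client."""
    def __init__(self):
        self.created = 0
    def create_conversation(self):
        self.created += 1
        return "conv_1"
    def complete(self, cid, text):
        return f"[{cid}] reply to: {text}"

session = ChatWidgetSession(FakeClient())
first = session.turn("Hi")
second = session.turn("Tell me more")
```

Because the conversation is stateful, the support team's Inbox view and contact attribution come for free; the widget never stores history itself.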
Example 2: a nightly summary email
Every weekday at 7am the assistant looks at the previous day's data and emails a summary to a distribution list.
- This has a name. It runs on a schedule. Its configuration (which bot, which instructions, which contact to attribute it to) is consistent across runs.
Model it as a task with a schedule of 0 7 * * 1-5. Let the schedule drive normal runs. Expose a small admin button that calls /task/{id}/trigger for the case where someone needs to regenerate the summary on demand. The endpoint that produces the actual completion is internal to the task workflow; you never call /conversation/dispatch directly for this.
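For reference, 0 7 * * 1-5 reads minute 0, hour 7, any day of month, any month, Monday through Friday. A minimal matcher, assuming the standard cron convention that day-of-week 1 is Monday:

```python
from datetime import datetime

def matches_nightly_summary_schedule(dt: datetime) -> bool:
    """True when dt falls on the `0 7 * * 1-5` schedule (07:00, Mon-Fri)."""
    # isoweekday(): Monday=1 .. Sunday=7, so 1-5 covers the weekdays.
    return dt.minute == 0 and dt.hour == 7 and 1 <= dt.isoweekday() <= 5

assert matches_nightly_summary_schedule(datetime(2024, 1, 1, 7, 0))      # a Monday
assert not matches_nightly_summary_schedule(datetime(2024, 1, 6, 7, 0))  # a Saturday
```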
Example 3: a long-running analysis triggered from a webhook
An external system fires a webhook when a document is uploaded. The platform should run the agent against the document, which can take 30–120 seconds, and post the result back to the originating system.
- Event-driven: yes, the work is initiated by an external event with a payload.
- Long-running: yes, comfortably past most webhook receiver timeouts on the originating side.
Model it as a trigger integration. The external system calls /integration/trigger/{id}/event with the document payload. The trigger runs the bot in the background and the integration's configured outbound delivery posts the result back. During development, use /integration/trigger/{id}/invoke to simulate event arrivals while the external system is being wired up.
If for some reason the trigger integration shape does not fit (for example, the inbound side is not a webhook at all but an in-application action), fall back to /conversation/dispatch from your application code. You will lose the trigger integration's configuration and event surface, but you keep the asynchronous completion semantics.
Common Pitfalls
A few patterns that tend to cause problems.
Using /conversation/complete (stateless) when you actually want the conversation to show up in the Inbox. Stateless completions do not appear in the platform's conversation surfaces. If your support team expects to be able to review what the agent said, you need a stateful conversation, not a stateless one.
Holding a streaming complete connection open for completions that take minutes. Long completions over a streaming HTTP connection are fragile - proxies time out, mobile networks drop, serverless functions hit execution limits. If a completion is reliably going to take more than 20–30 seconds, dispatch it and subscribe to the channel.
Modelling everything as a task. Tasks carry the overhead of a saved resource: they need to be created, managed, and cleaned up. A genuinely one-off completion does not need a task. Reach for /conversation/dispatch first.
Modelling event-driven work as a task that polls. If the trigger is an external event, a trigger integration is the direct fit. A task that polls an external system and runs when something has changed is a workaround for not having modelled the event source as a trigger.
Calling /integration/trigger/{id}/invoke in production traffic. invoke exists for testing and manual execution. Production traffic should go through /event so payloads land where the integration expects them. Using invoke in production loses the event payload entirely.
Forgetting that dispatch does not return the result synchronously. Code that calls dispatch and immediately tries to read a reply from the response will get a channelId and nothing else. If you need the reply in-band, use complete, not dispatch.
Summary
The conversation API exposes five completion shapes that cover the combinations of stateless vs stateful and synchronous vs background. Most chat UIs land on /conversation/{id}/complete. Stateless integrations land on /conversation/complete. The send and receive split is for the cases where you need to interpose logic between recording a user message and producing the agent reply. dispatch is the background variant, with the result delivered later through a channel subscription or through the conversation history.
Alongside those, two other endpoints cover the cases where a completion is part of a larger configured resource: /task/{id}/trigger for saved, schedulable agent jobs, and /integration/trigger/{id}/invoke for event-driven trigger integrations that you want to fire manually. They are not alternatives to dispatch - they are the on-demand entry points for two different kinds of long-lived resource.
The right endpoint is the one whose shape matches the work. A one-off completion is a dispatch or a complete. A named, scheduled, repeatable job is a task. An event-driven webhook handler is a trigger integration. Pick the resource type first, and the endpoint follows.