back to tutorials

How to Safely Deploy an AI Agent Across Your Enterprise

A complete, end-to-end guide to standing up an AI agent for a whole organisation without losing control. Build the agent with guardrails baked in, wrap it in usage policies that cap cost automatically, and hand it out through scoped tokens for developers and portals for non-technical teams.

Building a capable AI agent is the easy part. The hard part is what happens next: ten teams want to use it, half of them can't write a line of code, someone embeds the API key in a mobile app, and three weeks later Finance asks why the model bill quadrupled overnight.

This tutorial walks through a safe, repeatable way to expose a single agent to an entire organisation. It is built in three layers, and each layer closes a specific risk:

  1. Create the agent with guardrails baked into the bot itself
  2. Add cost and usage policies so spend is capped and runaway loops are caught automatically
  3. Control access with scoped tokens for developers and portals for everyone else

By the end you will have an agent that developer and non-developer teams can use freely, while you keep a hard ceiling on cost, a tight boundary on behaviour, and a narrow surface for access.

Note: This tutorial uses the SDK so every step is scriptable and copy-pasteable, but you can build the exact same agent entirely in the dashboard UI. The bot and its guardrails, usage policies, scoped tokens, and the portal all have visual equivalents in the dashboard. Use whichever fits your workflow.

What You'll Learn

  • Create a bot with grounding, moderation, PII protection, and cost-aware model settings
  • Attach usage policies that alert you and automatically pause a bot when it crosses a token, message, or conversation threshold
  • Add a retention policy so conversation data expires on a schedule for compliance
  • Issue scoped API tokens that can only perform one narrow operation
  • Publish a branded portal so non-technical teams can use the agent with no token at all

Prerequisites

  • A ChatBotKit account with dashboard access
  • An API secret for the SDK examples (create one on the Tokens page)
  • Node.js 18+ if you want to follow the SDK snippets

Install the SDK if you plan to script any of this:

Part 1: Create the Agent With Guardrails Baked In

A safe deployment starts with a well-bounded agent. Every guardrail you set here is one you don't have to enforce downstream. You can do all of this in the dashboard under Create Bot, or via the API as shown below.

Set a bounded backstory

The backstory is your primary behavioural guardrail. State what the agent does, and just as importantly, state what it must refuse. A tight scope keeps the agent on-task and reduces the surface for prompt-injection and off-topic abuse.

Ground the agent to reduce hallucination

Attach one or more datasets so the agent answers from your verified content instead of the model's training data. ChatBotKit uses retrieval-augmented generation (RAG) to pull relevant passages into each response, which is the single most effective way to reduce hallucination in an enterprise setting.

Turn on moderation and privacy

Two toggles matter for a broad rollout:

  • Moderation scans both incoming user messages and outgoing bot responses, and automatically blocks harmful or inappropriate content with a refusal.
  • Privacy detects and anonymises personally identifiable information (names, emails, phone numbers, addresses) before it reaches the model, which helps you stay aligned with GDPR and CCPA.

Set cost-aware model settings

Cost control begins at the model, before any policy is involved. Three settings compound:

  • Model choice - a capable model like glm-5.2 for quality, or a mini/nano variant when a task is simple. Smaller models cut per-token cost dramatically.
  • Max context window - cap the tokens the model considers per iteration so a single turn can't balloon.
  • Interaction messages - limit how many prior messages are sent as context. Q&A-style agents work well at 4-10 messages instead of the default 100, which cuts tokens per iteration significantly.

See Cost Optimisation Strategies for Token Consumption for the full breakdown of these settings.

Create the bot

Here is the equivalent bot creation via the SDK, with the guardrails set at creation time:

Keep the bot private for now. You will expose it deliberately in Part 3, not by default.

Part 2: Wrap It in Cost and Usage Policies

The bot settings limit cost per interaction. A usage policy limits cost in aggregate, and it is the guardrail that actually protects the invoice. Each policy watches a single metric against a threshold over a rolling time window, and fires an action the moment the threshold is crossed. Because counting happens at the usage-recording layer, enforcement is immediate.

A usage policy config has four parts:

  • metric - tokens, messages, or conversations
  • threshold - the count that trips the policy
  • windowInSeconds - the rolling window the count is measured over
  • actions - email (notify), block (pause the bot), or both. At least one is required.

A monthly ceiling for billing peace of mind

This policy pauses the bot and emails you if it burns more than 5 million tokens in any rolling 30-day window - a backstop against a bad month, not day-to-day traffic:

A tighter per-hour cap to catch abuse

A slow monthly ceiling won't catch a script gone wild or a stuck loop. Add a second, tighter policy on a short window so sudden spikes are stopped in minutes:

Alerts are deduplicated to once per window, so a sustained breach sends a single heads-up rather than an email on every event.

Global vs. per-bot scope

The examples above pass botId, so they govern one bot. Omit botId and the policy becomes global - it applies to every bot in the account from a single rule. Both a bot's own policies and the account-wide ones are evaluated together on each usage event, so a common pattern is:

  • One global notify-only policy as an account-wide early-warning system
  • A per-bot block policy on each production agent for hard enforcement

A notify-only global policy is just a policy with email and no block:

A retention policy for data lifecycle

Cost isn't the only thing that accumulates - conversation data does too. A retention policy automatically expires idle conversations after a set number of days, which helps with compliance and data minimisation:

You can create and manage all of these in the dashboard under Resources → Policies as well as through the API.

Part 3: Control How the Agent Is Accessed

You now have a bounded agent with hard cost ceilings. The last layer decides who can reach it and how. Enterprises have two very different audiences, and each gets a different mechanism.

Developers and integrations: scoped tokens

A standard API token has unrestricted access to your entire account. Never hand one of those to an application or a team. Instead, issue a scoped token whose allowedRoutes restrict it to exactly the operations it needs.

The most common pattern for exposing an agent is a token that can only create sessions for one specific bot - safe to place in a backend that faces the internet:

A token with this configuration can start conversations with your Operations Assistant and nothing else. It cannot list bots, read other conversations, touch datasets, or change policies. If it is used against any other endpoint, the API rejects it with 403 Forbidden.

Your backend then mints a short-lived session for the browser or app:

Give each consuming team its own scoped token. Tokens are independent, so you can rotate or revoke one team's access without disrupting anyone else. For the full glob syntax and the pre-built token templates, see How to Create Scoped API Tokens for Restricted Access.

Non-technical teams: portals

Most people in an enterprise will never touch a token. For them, publish a portal - a standalone, branded workspace where selected users access selected apps at a URL like {slug}.chatbotkit.agency.

To roll the Operations Assistant out to the whole company:

  1. Go to the Portals page and click Create Portal, or start from the Team Chat Portal template.
  2. Give it a name and a unique slug, e.g. acme-ops.
  3. Enable the Chat app and select your Operations Assistant as the available bot. Add a few starter prompts like "How do I reset my VPN?" to guide first-time users.
  4. Under Users, control access with email matchers - one per line:
    • *@acme.com grants access to everyone with a company email
    • contractor@partner.io adds a specific external collaborator
  5. Under Layout, set your logo, sidebar title, and footer links so it feels like an internal Acme tool.
  6. Save and share the URL.

Anyone whose sign-in email matches a pattern gets in; everyone else is denied. There is no token to leak, no code to write, and the portal only exposes the chat interface for the one bot you selected - not the dashboard, not other bots, not administrative functions.

You can create purpose-built portals per audience: a chat-only portal for general staff, and a separate portal with the Inbox and Usage apps for the team that monitors the agent.

Putting It Together

Each layer defends against a different failure, and together they let you say "yes" to broad access without losing control:

LayerMechanismRisk it closes
AgentBounded backstory, datasets, moderation, privacyOff-topic use, hallucination, harmful content, PII leaks
Model settingsModel choice, context window, message capRunaway cost per interaction
Usage policiesToken/message/conversation thresholds with block + alertRunaway cost in aggregate, abuse, stuck loops
Retention policyAutomatic conversation expiryData piling up, compliance drift
Scoped tokensallowedRoutes restricted to one operationA leaked key exposing the whole account
PortalsEmail-matched access to a single chat appNon-technical staff needing safe, no-code access

Start narrow. Keep the bot private, set conservative thresholds, and expose it to one team through one channel first. Widen access as you watch real usage in the analytics dashboard, tightening or loosening policies as you learn.

Next Steps