
New Models from Arcee AI, OpenAI, Z.AI, and StepFun

ChatBotKit expands its model library with four new AI models - Arcee AI Trinity Large, OpenAI GPT-OSS 120B, Z.AI GLM-4.5 Air, and StepFun Step 3.5 Flash - including free-tier variants to help you build and explore without cost.

We are thrilled to announce four additions to the ChatBotKit model library, one from each of four providers: Arcee AI Trinity Large, OpenAI GPT-OSS 120B, Z.AI GLM-4.5 Air, and StepFun Step 3.5 Flash. All four models are available on the platform now, and each comes with a free-tier variant so you can explore its capabilities without any upfront cost.

Arcee AI Trinity Large

Trinity Large Preview is Arcee AI's flagship open-weight model, built on a 400-billion-parameter sparse Mixture-of-Experts architecture that activates just 13 billion parameters per token by routing each token to 4 of its 256 experts. This design pairs frontier-scale capacity with a much smaller per-token compute cost, making Trinity Large a strong choice for creative writing, storytelling, role-play, conversational AI, and real-time voice assistance. Beyond creative work, it handles complex agentic toolchains and long, constraint-heavy prompts, offering a versatile foundation for both expressive and technical ChatBotKit applications.
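To make the 4-of-256 routing idea concrete, here is a minimal sketch of generic top-k expert routing. It is illustrative only: the random router scores and the `route_token` helper are assumptions made for this example, and Arcee AI's actual router may score, normalise, or balance experts differently.

```python
import math
import random

NUM_EXPERTS = 256  # expert count quoted in the announcement
TOP_K = 4          # experts activated per token, as quoted


def route_token(router_logits, k=TOP_K):
    """Hypothetical helper: pick the k highest-scoring experts for one token.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    # Indices of the k largest router scores.
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    # Softmax over the selected scores only (subtract the max for stability).
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in router scores for a single token (random, not from a real model).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_token(logits)
for expert_id, weight in selected:
    print(f"expert {expert_id:3d}  weight {weight:.3f}")
print(f"experts used: {len(selected)} of {NUM_EXPERTS}")
```

Only the four selected experts would run for this token; the other 252 are skipped, which is where the per-token savings come from. The share of active parameters (13 of 400 billion) is higher than the share of active experts (4 of 256), which is expected in designs like this because components outside the routed experts, such as attention layers and embeddings, typically run for every token.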

OpenAI GPT-OSS 120B

GPT-OSS 120B is OpenAI's open-weight, 117-billion-parameter Mixture-of-Experts language model designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1 billion parameters per forward pass and is optimized for efficient deployment: with native MXFP4 quantization, it can run on a single H100 GPU. The model supports configurable reasoning depth with full chain-of-thought access, native function calling, and structured output generation - making it an excellent choice for building sophisticated reasoning agents and tool-augmented workflows in ChatBotKit Skillsets and Blueprints.

Z.AI GLM-4.5 Air

GLM-4.5 Air is the lightweight variant of Z.AI's latest flagship model family, purpose-built for agent-centric applications. It shares the Mixture-of-Experts architecture of GLM-4.5 in a more compact form and supports two inference modes: a thinking mode for advanced reasoning and complex tool use, and a non-thinking mode for low-latency real-time interaction. This flexibility makes it well suited for ChatBotKit deployments that need to trade response speed against analytical depth from one request to the next.

StepFun Step 3.5 Flash

Step 3.5 Flash is StepFun's most capable open-source foundation model, built on a sparse Mixture-of-Experts architecture with 196 billion total parameters, of which only 11 billion are activated per token. Engineered as a reasoning model that stays fast even at long contexts, Step 3.5 Flash supports a 256,000-token context window alongside function calling and chain-of-thought reasoning. It is a compelling option for applications that require sustained reasoning at scale while keeping latency competitive.
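For a rough sense of how sparse these models are, the short sketch below computes the share of parameters active per token from the total and active counts quoted in this announcement. GLM-4.5 Air is left out because no figures are given for it here, and `active_fraction` is just a helper name chosen for this example.

```python
# (total, active) parameter counts in billions, as quoted in this announcement.
MODELS = {
    "Arcee AI Trinity Large": (400.0, 13.0),
    "OpenAI GPT-OSS 120B": (117.0, 5.1),
    "StepFun Step 3.5 Flash": (196.0, 11.0),
}


def active_fraction(total_b, active_b):
    """Return the share of parameters used per token (a value between 0 and 1)."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {active_b:g}B of {total_b:g}B active ({share:.1%})")
```

The active share is only a rough proxy for per-token compute; real latency and cost also depend on context length, hardware, quantization, and serving setup.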

All four models - along with their free-tier counterparts - are now available in the ChatBotKit model picker. You can select any of them from your dashboard or reference them via the API.