Introducing MiMo-V2 Models from Xiaomi
We are excited to announce that three new models from the Xiaomi MiMo-V2 family are now available in ChatBotKit: MiMo-V2-Pro, MiMo-V2-Omni, and MiMo-V2-Flash. Together they span flagship agentic reasoning, multimodal perception, and high-throughput, cost-efficient inference, so you can match the model to each workload when designing conversational AI workflows.
MiMo-V2-Pro
MiMo-V2-Pro is Xiaomi's flagship foundation model, with over one trillion total parameters and a 1M-token context window, deeply optimized for agentic scenarios. It ranks among the global top tier on standard benchmarks, with perceived performance approaching that of Claude Opus 4.6. Designed to serve as the brain of agent systems, MiMo-V2-Pro excels at orchestrating complex workflows, driving production engineering tasks, and delivering reliable results across extended conversations. It is an excellent choice for ChatBotKit Blueprints and Skillsets that demand deep reasoning and long-context awareness.
MiMo-V2-Omni
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capabilities, including visual grounding, multi-step planning, tool use, and code execution, all within a 256K context window. If your application needs to analyze visual content, process audio, or reason across modalities in a single turn, MiMo-V2-Omni delivers a powerful all-in-one solution.
MiMo-V2-Flash
MiMo-V2-Flash is an open-source Mixture-of-Experts language model with 309 billion total parameters and 15 billion active parameters. It supports a hybrid-thinking toggle and a 256K context window, excelling at reasoning, coding, and agent scenarios. On SWE-bench Verified and SWE-bench Multilingual, MiMo-V2-Flash ranks as the top open-source model globally, delivering performance comparable to Claude Sonnet 4.5 while costing only about 3.5% as much. It is the ideal choice for cost-sensitive deployments that still require strong reasoning and coding ability.
Model Comparison
| Feature | MiMo-V2-Pro | MiMo-V2-Omni | MiMo-V2-Flash |
|---|---|---|---|
| Context Window | 1M tokens | 256K tokens | 256K tokens |
| Input Price (per 1M tokens) | $1.00 | $0.40 | $0.09 |
| Output Price (per 1M tokens) | $3.00 | $2.00 | $0.29 |
| Strengths | Flagship reasoning, agentic orchestration | Multimodal (image, audio, video) | Cost-efficient reasoning and coding |
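
To make the pricing rows concrete, here is a minimal sketch of a cost estimate. It assumes that the listed prices are in USD per one million tokens and that cost scales linearly with token count; actual ChatBotKit billing may differ. The dictionary keys are the display names from the table, not confirmed API identifiers, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the table above.
# Keys are display names from the table, not confirmed API identifiers.
PRICES_PER_MILLION = {
    "MiMo-V2-Pro": (Decimal("1.00"), Decimal("3.00")),
    "MiMo-V2-Omni": (Decimal("0.40"), Decimal("2.00")),
    "MiMo-V2-Flash": (Decimal("0.09"), Decimal("0.29")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload, assuming simple linear pricing."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.3f}")
```

For that hypothetical workload, the estimate comes to $3.50 on MiMo-V2-Pro, $1.80 on MiMo-V2-Omni, and about $0.33 on MiMo-V2-Flash.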
All three models are available now in the ChatBotKit model picker. You can select any of them from your dashboard or reference them via the API.
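
If you want to choose between the three programmatically, one possible approach is to pick the cheapest model that satisfies a request's requirements. The sketch below is illustrative only: `ModelSpec` and `pick_model` are hypothetical names, "1M" and "256K" are read as 1,000,000 and 256,000 tokens, and MiMo-V2-Pro and MiMo-V2-Flash are treated as text-only because the comparison lists multimodal input only for MiMo-V2-Omni. The names are display names, not confirmed API identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str             # display name from the comparison table
    context_tokens: int   # assumed: 1M = 1,000,000 and 256K = 256,000
    multimodal: bool      # assumed: only Omni accepts image/audio/video
    input_price: float    # USD per 1M input tokens


MODELS = [
    ModelSpec("MiMo-V2-Pro", 1_000_000, False, 1.00),
    ModelSpec("MiMo-V2-Omni", 256_000, True, 0.40),
    ModelSpec("MiMo-V2-Flash", 256_000, False, 0.09),
]


def pick_model(prompt_tokens: int, needs_multimodal: bool = False) -> Optional[str]:
    """Return the cheapest model that fits the prompt, or None if none does."""
    candidates = [
        m for m in MODELS
        if m.context_tokens >= prompt_tokens
        and (m.multimodal or not needs_multimodal)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.input_price).name


if __name__ == "__main__":
    print(pick_model(100_000))                         # short text-only prompt
    print(pick_model(600_000))                         # long-context prompt
    print(pick_model(100_000, needs_multimodal=True))  # image/audio/video input
```

Ranking by input price alone is a simplification; for output-heavy workloads you may want to weigh the output price as well.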