
MiniMax · Chat / LLM · 230B Parameters (10B Active) · 200K Context

Streaming · Agentic Coding · Long Context · Code · Tool Use · Polyglot
Overview
MiniMax M2.1 is the flagship open-source coding and agentic model from MiniMax, a Chinese AI research company focused on building large-scale open-source foundation models for coding, reasoning, and agentic workflows. With 230B total parameters and only 10B active per token (a 23:1 sparsity ratio), it achieves 74% on SWE-bench Verified, competitive with Claude Sonnet 4.5, at a fraction of the cost. It delivers best-in-class polyglot coding across Python, Java, Go, Rust, C++, TypeScript, and Kotlin, with a 200K context window and native FP8 quantization for production-grade efficiency. Served instantly via the Qubrid AI Serverless API.

💻 74% SWE-bench Verified. 23:1 sparsity. Claude Sonnet 4.5-level coding at open-source cost. Deploy on Qubrid AI, no multi-GPU cluster required.
Model Specifications
| Field | Details |
|---|---|
| Model ID | MiniMaxAI/MiniMax-M2.1 |
| Provider | MiniMax |
| Kind | Chat / LLM |
| Architecture | Sparse MoE Transformer (230B total / 10B active per token), FP8 quantization |
| Parameters | 230B total (10B active per forward pass) |
| Context Length | 200,000 Tokens |
| MoE | Yes |
| Release Date | December 2025 |
| License | Modified MIT License |
| Training Data | Large-scale multilingual code and instruction datasets across major programming languages |
| Function Calling | Not Supported |
| Image Support | N/A |
| Serverless API | Available |
| Fine-tuning | Coming Soon |
| On-demand | Coming Soon |
| State | 🟢 Ready |
Pricing
💳 Access via the Qubrid AI Serverless API with pay-per-token pricing. No infrastructure management required.
| Token Type | Price per 1M Tokens |
|---|---|
| Input Tokens | $0.30 |
| Input Tokens (Cached) | $0.03 |
| Output Tokens | $1.20 |
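To see how these rates combine in practice, here is a small cost-estimation sketch based only on the per-token prices in the table above (cached tokens are assumed to be billed at the cached rate in place of the full input rate):

```python
# Estimate the cost of a MiniMax M2.1 request from the per-token prices above.
PRICE_PER_M = {"input": 0.30, "cached_input": 0.03, "output": 1.20}  # USD per 1M tokens

def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Cached tokens are billed at the cached rate instead of the full input rate."""
    fresh = input_tokens - cached_tokens
    return (
        fresh * PRICE_PER_M["input"]
        + cached_tokens * PRICE_PER_M["cached_input"]
        + output_tokens * PRICE_PER_M["output"]
    ) / 1_000_000

# Example: 100K input tokens (20K of them cached) plus 10K output tokens.
print(round(request_cost(100_000, 20_000, 10_000), 4))  # 0.0366
```

At these rates a fairly large request (100K in, 10K out) costs under four cents, which is the practical meaning of the "fraction of the cost" claim above.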
Quickstart
Prerequisites
- Create a free account at platform.qubrid.com
- Generate your API key from the API Keys section
- Replace `QUBRID_API_KEY` in the code below with your actual key
💡 Recommended parameters: use `temperature=1.0`, `top_p=0.95`, `top_k=40` for best performance, as specified by MiniMax on the official model card.
Python
JavaScript
Go
cURL
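As a minimal Python sketch of what the tabs above cover: the payload below uses the recommended sampling parameters, while the endpoint URL and header shape in the comments are assumptions, so confirm the exact base URL at docs.platform.qubrid.com.

```python
# Minimal sketch of an OpenAI-compatible chat request to MiniMax M2.1 on Qubrid AI.
# The endpoint URL and auth header below are assumptions; check docs.platform.qubrid.com.

def build_payload(prompt: str, stream: bool = True) -> dict:
    """Chat completion payload using MiniMax's recommended sampling parameters."""
    return {
        "model": "MiniMaxAI/MiniMax-M2.1",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,   # recommended by MiniMax
        "top_p": 0.95,
        "max_tokens": 8192,
        "stream": stream,
    }

# Sending it with the `requests` library (replace QUBRID_API_KEY with your key):
# import requests
# resp = requests.post(
#     "https://platform.qubrid.com/v1/chat/completions",   # assumed endpoint
#     headers={"Authorization": "Bearer QUBRID_API_KEY"},
#     json=build_payload("Write a Go function to check if a number is prime", stream=False),
# )
# print(resp.json()["choices"][0]["message"]["content"])
```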
Live Example
Prompt: Write a type-safe REST API client in TypeScript with error handling and retry logic
Response:
Playground Features
The Qubrid AI Playground lets you interact with MiniMax M2.1 directly in your browser: no setup, no code, no cost to explore.

🧠 System Prompt
Define the model's coding language, style, and workflow constraints before the conversation begins, ideal for polyglot development sessions and long-horizon agentic coding workflows. Set your system prompt once in the Qubrid Playground and it applies across every turn of the conversation.
🎯 Few-Shot Examples
Establish your preferred code style and language conventions with concrete examples, no fine-tuning required.

| User Input | Assistant Response |
|---|---|
| Write a Go function to check if a number is prime | `func isPrime(n int) bool { if n < 2 { return false }; for i := 2; i*i <= n; i++ { if n%i == 0 { return false } }; return true }` |
| Refactor: nested for loops checking duplicates in a list | Use a hash set: `seen := make(map[int]bool); for _, v := range list { if seen[v] { return true }; seen[v] = true }; return false` (O(n) vs O(n²)) |
💡 Stack multiple few-shot examples in the Qubrid Playground to lock in language preference, code style, and output format, no fine-tuning required.
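Few-shot examples like the table above map directly onto the chat messages array: each pair becomes a prior user/assistant turn placed before the real request. A minimal sketch, assuming the standard OpenAI-compatible message format:

```python
# Stack few-shot examples as prior user/assistant turns in the messages array,
# so the model locks onto Go and the terse single-function style shown above.
FEW_SHOT = [
    ("Write a Go function to check if a number is prime",
     "func isPrime(n int) bool { if n < 2 { return false }; "
     "for i := 2; i*i <= n; i++ { if n%i == 0 { return false } }; return true }"),
]

def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Prepend the system prompt and few-shot pairs before the real question."""
    messages = [{"role": "system", "content": system_prompt}]
    for question, answer in FEW_SHOT:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": user_prompt})
    return messages

msgs = build_messages("You are a senior Go engineer. Answer with code only.",
                      "Write a Go function that reverses a string")
print([m["role"] for m in msgs])  # ['system', 'user', 'assistant', 'user']
```

The same messages list can be pasted into the Playground turn by turn, or sent directly as the `messages` field of an API request.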
Inference Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Streaming | boolean | true | Enable streaming responses for real-time output |
| Temperature | number | 1 | Recommended at 1.0 for best performance |
| Max Tokens | number | 8192 | Maximum number of tokens the model can generate |
| Top P | number | 0.95 | Controls nucleus sampling |
| Top K | number | 40 | Limits token sampling to top-k tokens |
Use Cases
- Multilingual software development
- Long-horizon agentic coding
- Code review and optimization
- Full-stack app generation
- Office automation workflows
- Complex multi-step tool use
Strengths & Limitations
| Strengths | Limitations |
|---|---|
| 74% SWE-bench Verified, competitive with Claude Sonnet 4.5 | Less reliable than frontier closed models for deep debugging |
| 230B MoE with only 10B active (23:1 sparsity) for extreme efficiency | Sparse activation may miss niche language idioms |
| 200K context window for full-codebase analysis | Very large model size requires a multi-GPU setup for self-hosting |
| Best-in-class polyglot coding across 7 major languages | Function calling not supported via the API |
| Native FP8 quantization for production-grade efficiency | |
| Open weights, fully available for local and on-premise deployment | |
Why Qubrid AI?
- 🚀 No infrastructure setup: a 230B MoE served serverlessly at just $0.30/1M input tokens
- 🔌 OpenAI-compatible: a drop-in replacement using the same SDK, just swap the base URL
- 💰 Cached input pricing: $0.03/1M for cached tokens, the lowest cached rate across all models on the platform
- 💻 Polyglot by design: MiniMax M2.1's 7-language coding strength pairs with Qubrid's low-latency infrastructure for fast development loops
- 🧪 Built-in Playground: prototype with system prompts and few-shot examples instantly at platform.qubrid.com
- 📊 Full observability: API logs and usage tracking built into the Qubrid dashboard
Resources
| Resource | Link |
|---|---|
| 📘 Qubrid Docs | docs.platform.qubrid.com |
| 🎮 Playground | Try MiniMax M2.1 live |
| 🔑 API Keys | Get your API Key |
| 🤗 Hugging Face | MiniMaxAI/MiniMax-M2.1 |
| 💬 Discord | Join the Qubrid Community |
Built with ❤️ by Qubrid AI
Frontier models. Serverless infrastructure. Zero friction.