About the Provider

Qwen is an AI model family developed by Alibaba Group, a major Chinese technology and cloud computing company. Through its Qwen initiative, Alibaba builds and open-sources advanced language, image, and coding models under permissive licenses to support innovation, developer tooling, and scalable AI integration across applications.

Model Quickstart

This section helps you quickly get started with the Qwen/Qwen3-Coder-30B-A3B-Instruct model on the Qubrid AI inferencing platform. To use this model, you need:
  • A valid Qubrid API key
  • Access to the Qubrid inference API
  • Basic knowledge of making API requests in your preferred language
Once authenticated with your API key, you can send inference requests to the Qwen/Qwen3-Coder-30B-A3B-Instruct model and receive responses based on your input prompts. The example below shows how to call the model from Python; the same request shape carries over to any language or tool that can make HTTPS requests.

import requests
import json
from pprint import pprint

url = "https://platform.qubrid.com/api/v1/qubridai/chat/completions"
headers = {
    "Authorization": "Bearer <Qubrid_API_KEY>",
    "Content-Type": "application/json",
}

data = {
    "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "messages": [
        {
            "role": "user",
            "content": "Write a Python function to calculate fibonacci sequence"
        }
    ],
    "temperature": 0.7,
    "max_tokens": 4096,
    "stream": False,
}

response = requests.post(url, headers=headers, json=data)

# Non-streaming requests return a single JSON body; streaming requests
# return server-sent events (lines prefixed with "data:").
content_type = response.headers.get("Content-Type", "")
if "application/json" in content_type:
    pprint(response.json())
else:
    for line in response.iter_lines(decode_unicode=True):
        if not line:
            continue
        if line.startswith("data:"):
            payload = line.removeprefix("data:").strip()
            if payload == "[DONE]":
                break
            try:
                chunk = json.loads(payload)
                pprint(chunk)
            except json.JSONDecodeError:
                print("Raw chunk:", payload)
This will produce a response similar to the one below:
{
    "content": "```python\ndef fibonacci_dp(n):\n    \"\"\"\n    Calculate the nth Fibonacci number using dynamic programming.\n    \n    The Fibonacci sequence is defined as:\n  F(0) = 0, F(1) = 1, and F(n) = F(n-1) + F(n-2) for n > 1\n    \n    This implementation uses dynamic programming with O(n) time complexity\n    and O(1) space complexity by storing only the last two values.\n  \n  Args:\n   n (int): The position in the Fibonacci sequence (non-negative integer)\n  \n   Returns:\n        int: The nth Fibonacci number\n        \n    Raises:\n        ValueError: If n is negative\n    \n    Examples:\n    >>> fibonacci_dp(0)\n        0\n        >>> fibonacci_dp(1)\n        1\n        >>> fibonacci_dp(5)\n    5\n   >>> fibonacci_dp(10)\n        55\n    \"\"\"\n    # Input validation\n    if n < 0:\n    raise ValueError(\"n must be a non-negative integer\")\n    \n    # Base cases\n    if n == 0:\n        return 0\n    elif n == 1:\n     return 1\n    \n    # Dynamic programming approach\n    # Only store the last two Fibonacci numbers\n    prev_prev = 0  # F(i-2)\n    prev = 1       # F(i-1)\n    \n    # Calculate from F(2) to F(n)\n    for i in range(2, n + 1):\n        current = prev_prev + prev\n        prev_prev = prev\n        prev = current\n  \n return prev\n\n\ndef fibonacci_sequence_dp(n):\n    \"\"\"\n    Generate the first n+1 Fibonacci numbers using dynamic programming.\n    \n    Args:\n        n (int): The upper index of the sequence (non-negative integer)\n  \n    Returns:\n        list: A list containing the first n+1 Fibonacci numbers\n        \n    Examples:\n   >>> fibonacci_sequence_dp(5)\n        [0, 1, 1, 2, 3, 5]\n        >>> fibonacci_sequence_dp(0)\n        [0]\n    \"\"\"\n    if n < 0:\n        raise ValueError(\"n must be a non-negative integer\")\n    \n    if n == 0:\n        return [0]\n    \n    # Initialize the sequence\n    fib_sequence = [0] * (n + 1)\n    fib_sequence[0] = 0\n    \n    if n >= 1:\n        fib_sequence[1] = 1\n    \n    # Fill the sequence using dynamic programming\n    for i in range(2, n + 1):\n        fib_sequence[i] = fib_sequence[i-1] + fib_sequence[i-2]\n    \n    return fib_sequence\n\n\n# Alternative implementation with memoization\ndef fibonacci_memo(n, memo=None):\n    \"\"\"\n    Calculate the nth Fibonacci number using memoization (top-down DP).\n    \n    Args:\n        n (int): The position in the Fibonacci sequence\n  memo (dict): Memoization dictionary (optional)\n        \n    Returns:\n int: The nth Fibonacci number\n    \"\"\"\n    if memo is None:\n        memo = {}\n    \n    if n < 0:\n        raise ValueError(\"n must be a non-negative integer\")\n    \n    if n in memo:\n        return memo[n]\n    \n    if n == 0:\n        return 0\n    elif n == 1:\n  return 1\n    \n    memo[n] = fibonacci_memo(n-1, memo) + fibonacci_memo(n-2, memo)\n    return memo[n]\n\n\n# Example usage and testing\nif __name__ == \"__main__\":\n    # Test the basic function\n    print(\"Fibonacci numbers:\")\n    for i in range(11):\n  print(f\"F({i}) = {fibonacci_dp(i)}\")\n    \n    print(\"\\nFirst 10 Fibonacci numbers as a sequence:\")\n    print(fibonacci_sequence_dp(9))\n    \n    # Performance comparison example\n    import time\n    \n    n = 35\n    \n    # Dynamic programming approach\n    start_time = time.time()\n    result_dp = fibonacci_dp(n)\n    dp_time = time.time() - start_time\n    \n    # Memoization approach\n    start_time = time.time()\n    result_memo = fibonacci_memo(n)\n    memo_time = time.time() - start_time\n  
  \n    print(f\"\\nPerformance comparison for F({n}):\")\n    print(f\"Dynamic Programming: {result_dp} (Time: {dp_time:.6f}s)\")\n    print(f\"Memoization: {result_memo} (Time: {memo_time:.6f}s)\")\n```\n\nThis implementation provides:\n\n1. **Main Function (`fibonacci_dp`)**: \n   - Uses bottom-up dynamic programming\n   - Time complexity: O(n), Space complexity: O(1)\n   - Efficient for calculating single Fibonacci numbers\n\n2. **Sequence Generator (`fibonacci_sequence_dp`)**:\n  - Returns the entire sequence up to index n\n   - Useful when you need multiple Fibonacci numbers\n\n3. **Alternative Implementation (`fibonacci_memo`)**:\n   - Top-down approach with memoization\n   - Good for understanding different DP approaches\n\nKey features:\n- **Input validation** with clear error messages\n- **Comprehensive documentation** with docstrings and examples\n- **Edge case handling** (n=0, n=1)\n- **Efficient space usage** in the main implementation\n- **Well-tested examples** and performance comparison\n\nThe dynamic programming approach avoids the exponential time complexity of naive recursion by storing previously computed values and building up the solution iteratively.",
    "metrics": {
        "input_tokens": 44,
        "output_tokens": 534,
        "total_time": 9.0186,
        "tps": 59.2118
    },
    "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "usage": {
        "completion_tokens": 1065,
        "prompt_tokens": 44,
        "prompt_tokens_details": null,
        "total_tokens": 1109
    }
}
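
For a non-streaming request, the generated text is returned in the top-level "content" field, with token accounting under "usage". A short sketch that continues from the quickstart code above, assuming the JSON response shape shown:

result = response.json()

print(result["content"])                                 # the generated answer
print("Model:", result["model"])
print("Total tokens:", result["usage"]["total_tokens"])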

Model Overview

Qwen3 Coder 30B A3B is a large causal language model designed for code generation and technical reasoning. It belongs to the latest generation of the Qwen model family and supports both thinking mode for complex reasoning and non-thinking mode for efficient general usage within the same model. The model is built using a Mixture-of-Experts (MoE) architecture, activating only a subset of parameters per request to balance performance and efficiency. It is trained through both pretraining and post-training stages and supports long context lengths for complex coding and reasoning workflows.

Model at a Glance

Feature        Details
Model ID       Qwen/Qwen3-Coder-30B-A3B-Instruct
Provider       Qwen
Model Type     Causal Language Model
Architecture   Mixture-of-Experts (MoE) Transformer, 48 layers, GQA attention, 128 experts (8 active per forward pass)
Parameters     30.5B total, 3.3B activated per token
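
The practical effect of the MoE design is that only a small slice of the parameters runs for any given token. A back-of-the-envelope sketch in Python, using only the figures from the table above:

total_experts = 128    # experts per MoE layer (see table)
active_experts = 8     # experts routed per forward pass (see table)
total_params = 30.5e9  # total parameters
active_params = 3.3e9  # parameters activated per token

print(f"Active experts per layer: {active_experts}/{total_experts} "
      f"({active_experts / total_experts:.1%})")
print(f"Active parameters per token: {active_params / total_params:.1%} of total")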

When to use?

You should consider using Qwen3 Coder 30B A3B if:
  • Your application focuses on code generation or technical reasoning
  • You need long context support for large codebases or complex prompts
  • You want a model that can switch between deep reasoning and efficient responses
  • Your workflow includes agent-based tasks with external tools
  • You require multilingual support for technical or coding tasks

Inference Parameters

Parameter Name  Type     Default  Description
Streaming       boolean  true     Enable streaming responses for real-time output.
Temperature     number   0.7      Controls randomness; higher values produce more diverse, less deterministic output.
Max Tokens      number   65536    Maximum number of tokens to generate in the response; suitable for long-form code or large refactors.
Top P           number   0.8      Nucleus sampling parameter controlling token sampling diversity.
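
These parameters map directly onto the request body from the quickstart. Below is a sketch of a streaming request using the documented defaults; the lowercase field names (top_p, max_tokens, stream) follow the quickstart example and common OpenAI-style conventions, so verify them against the API reference if in doubt.

import requests

data = {
    "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "messages": [{"role": "user", "content": "Write unit tests for a binary search function"}],
    "temperature": 0.7,   # documented default
    "top_p": 0.8,         # documented default
    "max_tokens": 65536,  # documented default; room for long-form code
    "stream": True,       # documented default; tokens arrive as server-sent events
}

response = requests.post(
    "https://platform.qubrid.com/api/v1/qubridai/chat/completions",
    headers={
        "Authorization": "Bearer <Qubrid_API_KEY>",
        "Content-Type": "application/json",
    },
    json=data,
    stream=True,  # let requests expose the response incrementally
)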

Key Features

  • Supports thinking mode for complex reasoning, mathematics, and coding
  • Supports non-thinking mode for efficient general-purpose dialogue
  • Strong performance in code generation, technical reasoning, and logical tasks
  • Designed for agent workflows with tool integration
  • Supports multilingual instruction following and translation

Best Practices

Sampling Settings

Thinking Mode

With enable_thinking = true, use:
  • Temperature: 0.6
  • Top-P: 0.95
  • Top-K: 20
  • Min-P: 0
Avoid greedy decoding to prevent repetition and degraded performance.

Non-Thinking Mode

With enable_thinking = false, use:
  • Temperature: 0.7
  • Top-P: 0.8
  • Top-K: 20
  • Min-P: 0
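
To keep both presets straight in client code, they can be stored side by side. A minimal sketch; passing top_k and min_p as request fields is an assumption about the API contract, not something documented above:

SAMPLING_PRESETS = {
    # Thinking mode (enable_thinking = true)
    "thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0},
    # Non-thinking mode (enable_thinking = false)
    "non_thinking": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0},
}

def build_request(prompt: str, thinking: bool) -> dict:
    # Build a request body with the recommended sampling settings.
    # Never use temperature=0 (greedy decoding): it risks repetition.
    preset = SAMPLING_PRESETS["thinking" if thinking else "non_thinking"]
    return {
        "model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
        "messages": [{"role": "user", "content": prompt}],
        **preset,
    }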

Output Length

  • Recommended output length: 32,768 tokens
  • For highly complex math or programming problems: 38,912 tokens
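
Applied to a request, these budgets simply set max_tokens. A small sketch reusing the hypothetical build_request helper from the sampling sketch above:

OUTPUT_TOKENS_DEFAULT = 32768  # recommended for most tasks
OUTPUT_TOKENS_HARD = 38912     # for highly complex math or programming problems

data = build_request("Implement a persistent red-black tree in Python", thinking=True)
data["max_tokens"] = OUTPUT_TOKENS_HARD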

Prompt Standardization

Math Problems

Include the following instruction:
Please reason step by step, and put your final answer within \boxed{}.
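
In practice this just means appending the instruction to the user message, for example:

MATH_INSTRUCTION = "Please reason step by step, and put your final answer within \\boxed{}."

def math_prompt(problem: str) -> str:
    # Standardize every math prompt with the recommended instruction.
    return f"{problem}\n\n{MATH_INSTRUCTION}"

messages = [{"role": "user", "content": math_prompt("What is the sum of the first 50 primes?")}]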

Multi-Turn Conversations

  • Historical responses should include only the final output
  • Thinking content should not be stored in conversation history
  • This behavior is handled automatically in the provided Jinja2 chat template
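
If you assemble conversation history yourself instead of relying on the provided chat template, strip the reasoning before storing each assistant turn. A sketch that assumes thinking content arrives wrapped in <think>...</think> tags, as in other Qwen3 chat templates; confirm the exact delimiter for this deployment:

import re

# Assumed delimiter for thinking content (verify for this deployment).
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def store_assistant_turn(history: list, raw_reply: str) -> None:
    # Append an assistant turn, keeping only the final output in history.
    final_output = THINK_BLOCK.sub("", raw_reply).strip()
    history.append({"role": "assistant", "content": final_output})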

Summary

Qwen3 Coder 30B A3B is a Mixture-of-Experts language model optimized for code generation and technical reasoning. It supports both thinking and non-thinking modes in a single deployment, provides long context support up to 131K tokens with YaRN, and is designed for efficient, multilingual, and agent-based inferencing workflows.