# Qubrid Documentation

## Docs

- [AI/ML Templates](https://docs.platform.qubrid.com/AI Templates.md): Standard AI/ML packages for quick deployment on high-performance GPU virtual machines
- [Bare Metal Servers](https://docs.platform.qubrid.com/Bare Metal.md): High-performance bare metal GPU servers for enterprise AI workloads
- [Choose your Interface](https://docs.platform.qubrid.com/Choose your Interface.md): Control how you want to access your GPU virtual machine
- [Enterprise Account](https://docs.platform.qubrid.com/Enterprise Account.md): Manage your team with an organization account
- [Fine Tuning](https://docs.platform.qubrid.com/Fine Tuning.md): Fine-tune pre-deployed text-generation and code-generation models on the Qubrid AI platform, without writing a single line of code
- [GPU Clusters](https://docs.platform.qubrid.com/GPU Clusters.md): GPU clusters let you scale beyond a single node
- [GPU Instances](https://docs.platform.qubrid.com/GPU Instances.md): Power your AI workloads with high-performance GPUs
- [Manage API Keys](https://docs.platform.qubrid.com/Manage API Keys.md): Create an API key
- [Model Updates](https://docs.platform.qubrid.com/Model Updates.md)
- [Platform Updates](https://docs.platform.qubrid.com/Platform Updates.md)
- [MiniMax M2.1](https://docs.platform.qubrid.com/Qubrid AI/Models/MiniMaxAI/MiniMax M2.1.md)
- [Qwen3 Coder Plus](https://docs.platform.qubrid.com/Qubrid AI/Models/Qwen/Qwen3 Coder Plus.md)
- [Qwen3 Max](https://docs.platform.qubrid.com/Qubrid AI/Models/Qwen/Qwen3 Max.md)
- [Qwen3 Next 80B A3B Thinking](https://docs.platform.qubrid.com/Qubrid AI/Models/Qwen/Qwen3 Next 80B A3B Thinking.md)
- [Qwen3 Plus](https://docs.platform.qubrid.com/Qubrid AI/Models/Qwen/Qwen3 Plus.md)
- [DeepSeek R1 0528](https://docs.platform.qubrid.com/Qubrid AI/Models/deepseek-ai/DeepSeek R1 0528.md)
- [DeepSeek R1 Distill Llama 70B](https://docs.platform.qubrid.com/Qubrid AI/Models/deepseek-ai/DeepSeek R1 Distill Llama 70B.md)
- [DeepSeek V3.2](https://docs.platform.qubrid.com/Qubrid AI/Models/deepseek-ai/DeepSeek V3.2.md)
- [Llama 3.3 70B Instruct](https://docs.platform.qubrid.com/Qubrid AI/Models/meta-llama/Llama 3.3 70B Instruct.md)
- [Fara 7B](https://docs.platform.qubrid.com/Qubrid AI/Models/microsoft/Fara 7B.md)
- [Mistral 7B Instruct v0.3](https://docs.platform.qubrid.com/Qubrid AI/Models/mistralai/Mistral 7B Instruct v0.3.md)
- [Kimi K2 Thinking](https://docs.platform.qubrid.com/Qubrid AI/Models/moonshotai/Kimi K2 Thinking.md)
- [NVIDIA Nemotron 3 Nano 30B A3B BF16](https://docs.platform.qubrid.com/Qubrid AI/Models/nvidia/NVIDIA Nemotron 3 Nano 30B A3B BF16.md)
- [NVIDIA Nemotron 3 Super 120B A12B FP8](https://docs.platform.qubrid.com/Qubrid AI/Models/nvidia/NVIDIA Nemotron 3 Super 120B A12B FP8.md)
- [Orchestrator 8B](https://docs.platform.qubrid.com/Qubrid AI/Models/nvidia/Orchestrator 8B.md)
- [GPT OSS 120B](https://docs.platform.qubrid.com/Qubrid AI/Models/openai/GPT OSS 120B.md)
- [GPT OSS 20B](https://docs.platform.qubrid.com/Qubrid AI/Models/openai/GPT OSS 20B.md)
- [GLM 4.7 FP8](https://docs.platform.qubrid.com/Qubrid AI/Models/zai-org/GLM 4.7 FP8.md)
- [RAG](https://docs.platform.qubrid.com/RAG.md): A custom multimodal RAG pipeline that combines advanced search with the best models to turn your documents, images, and audio into accurate, cited answers; just upload and we handle the rest.
- [Chat Completions](https://docs.platform.qubrid.com/api-reference/chat/create.md): Generate text responses using large language models.
- [Create Image](https://docs.platform.qubrid.com/api-reference/image/create.md): Generate images from text prompts.
- [Extract Text](https://docs.platform.qubrid.com/api-reference/ocr/create.md): Extract text from images using OCR-capable vision models.
- [Create Video](https://docs.platform.qubrid.com/api-reference/video/create.md): Generate videos from text prompts with optional image and audio inputs.
- [Analyze Image](https://docs.platform.qubrid.com/api-reference/vision/create.md): Analyze and understand images using multimodal vision-language models.
- [Convert Text to Speech](https://docs.platform.qubrid.com/api-reference/voice/create.md): Convert text into natural-sounding speech using AI voice models.
- [Convert Speech to Text](https://docs.platform.qubrid.com/api-reference/voice/createstt.md): Convert speech audio into text using AI transcription models.
- [Introduction](https://docs.platform.qubrid.com/index.md): Welcome to Qubrid Platform, the Full Stack for AI.
- [Deploy Hugging Face Model](https://docs.platform.qubrid.com/inferencing/Hugging Face.md): Deploy and interact with models through a direct API or an interactive UI such as the RAG UI, without requiring any AI expertise.
- [Quickstart API](https://docs.platform.qubrid.com/inferencing/Quickstart API.md)
- [Serverless Models](https://docs.platform.qubrid.com/inferencing/Serverless Models.md): Browse all available serverless models on the Qubrid AI platform.
- [Qwen3 TTS Flash](https://docs.platform.qubrid.com/inferencing/models-quickstart/audio/Qwen3 TTS Flash.md): A fast, multilingual text-to-speech model with multiple expressive voices, built for real-time synthesis and interactive experiences.
- [Whisper Large v3](https://docs.platform.qubrid.com/inferencing/models-quickstart/audio/Whisper Large v3.md): A powerful automatic speech recognition (ASR) and speech-to-text translation model.
- [DeepSeek R1 0528](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/DeepSeek R1 0528.md): DeepSeek's frontier reasoning model, matching OpenAI o1 on AIME 2025, with chain-of-thought traces, JSON output, and function calling support.
- [DeepSeek R1 Distill Llama 70B](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/DeepSeek R1 Distill Llama 70B.md): A distilled large language model based on LLaMA 70B, optimized for efficient reasoning-focused inference.
- [DeepSeek V3.2](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/DeepSeek V3.2.md): GPT-5-class open-source model: a 685B MoE with novel Sparse Attention, gold-medal IMO & IOI 2025 results, and integrated reasoning with tool use.
- [Fara 7B](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Fara 7B.md): A computer-use agent model designed to take step-by-step actions on the web using screenshots and text context.
- [GLM 4.7 FP8](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/GLM 4.7 FP8.md): A high-capacity MoE reasoning model optimized for agentic coding, complex analysis, and long-horizon tasks.
- [GPT OSS 120B](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/GPT OSS 120B.md): A high-capacity reasoning model optimized for complex analysis, advanced agent workflows, and large-scale inference.
- [GPT OSS 20B](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/GPT OSS 20B.md): A low-latency reasoning model with adjustable reasoning depth, built for local deployment and agentic workflows.
- [Kimi K2 Thinking](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Kimi K2 Thinking.md): A frontier-scale MoE reasoning model built for long-context analysis, advanced coding, and agentic workflows.
- [Meta Llama 3.3 70B Instruct](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Meta Llama 3.3 70B Instruct.md): An instruction-tuned 70B-parameter language model optimized for high-quality reasoning and task-oriented inference.
- [MiniMax M2.1](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/MiniMax M2.1.md): A high-capacity coding model built for agentic workflows, polyglot development, and long-context tasks.
- [Mistral 7B Instruct v0.3](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Mistral 7B Instruct v0.3.md): A compact and efficient text model designed for fast inference, long-context handling, and easy fine-tuning.
- [Nemotron 3 Nano 30B](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Nemotron 3 Nano 30B.md): NVIDIA's hybrid MoE reasoning model: 3.3× faster inference, 262K context, SOTA on SWE-Bench and AIME 2025.
- [Nemotron 3 Super 120B A12B](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Nemotron 3 Super 120B A12B.md): NVIDIA's agentic LLM: 2.2× throughput over GPT-OSS-120B, 1M token context, 60.47% SWE-Bench Verified.
- [Nvidia Orchestrator 8B](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Nvidia Orchestrator 8B.md): An orchestration-focused model designed to coordinate tools and multiple models for complex, multi-turn agent workflows.
- [Qwen3 Max](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Qwen3 Max.md): A flagship MoE language model optimized for complex reasoning, coding, and multilingual tasks.
- [Qwen3 Next 80B A3B Thinking](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Qwen3 Next 80B A3B Thinking.md): A high-throughput reasoning model with hybrid attention, optimized for long-context analysis and complex multi-step tasks.
- [Qwen3 Plus](https://docs.platform.qubrid.com/inferencing/models-quickstart/chat/Qwen3 Plus.md): Alibaba's balanced general-purpose model for everyday chat and analysis: fast, reliable, and multilingual with up to 1M token context.
- [Qwen3 Coder 30B A3B Instruct](https://docs.platform.qubrid.com/inferencing/models-quickstart/code/Qwen3 Coder 30B A3B Instruct.md): A high-performance coding and reasoning model designed for long-context code generation and technical workflows.
- [Qwen3 Coder 480B A35B Instruct](https://docs.platform.qubrid.com/inferencing/models-quickstart/code/Qwen3 Coder 480B A35B Instruct.md): A flagship open-source coding model optimized for large-scale software engineering, complex refactoring, and agentic development workflows.
- [Qwen3 Coder Flash](https://docs.platform.qubrid.com/inferencing/models-quickstart/code/Qwen3 Coder Flash.md): A lightweight coding model optimized for real-time code completion, quick snippets, and low-latency developer tooling.
- [Qwen3 Coder Next](https://docs.platform.qubrid.com/inferencing/models-quickstart/code/Qwen3 Coder Next.md): An efficient MoE coding model designed for agentic development, repository-scale reasoning, and tool orchestration.
- [Qwen3 Coder Plus](https://docs.platform.qubrid.com/inferencing/models-quickstart/code/Qwen3 Coder Plus.md): A high-performance coding model built for autonomous programming and tool-driven development.
- [Flux 2 Klein 4B](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/Flux 2 Klein 4B.md): A lightweight image generation model optimized for fast text-to-image creation and real-time editing workflows.
- [Flux Dev](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/Flux Dev.md): A high-quality image generation model designed for photorealistic visuals, illustration, and creative workflows.
- [P-Image](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/P-Image.md): A fast, affordable text-to-image model delivering high-quality image generation in under one second.
- [P-Image Edit](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/P-Image Edit.md): A fast, affordable image editing model enabling precise text-guided modifications to existing images with high visual fidelity.
- [P-Image Edit LoRA](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/P-Image Edit LoRA.md): A fast, affordable image editing model with LoRA support for fine-tuned, text-guided modifications to existing images with precise style control.
- [P-Image LoRA](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/P-Image LoRA.md): A fast, affordable text-to-image model with LoRA support for fine-tuned style control and domain-specific image generation.
- [Qwen Image Edit](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/Qwen Image Edit.md): A 20B multimodal diffusion model for precise, text-guided image editing with strong control over both visual semantics and appearance.
- [Stable Diffusion 3.5 Large](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/Stable Diffusion 3.5 Large.md): A powerful text-to-image model designed for high-quality visuals and detailed prompt-driven generation.
- [Z Image Turbo](https://docs.platform.qubrid.com/inferencing/models-quickstart/image/Z Image Turbo.md): A high-speed, production-optimized image generation model delivering photorealistic results with low latency.
- [Hunyuan OCR 1B](https://docs.platform.qubrid.com/inferencing/models-quickstart/ocr/Hunyuan OCR 1B.md): An end-to-end OCR vision-language model designed for multilingual text extraction and document parsing using a single-inference workflow.
- [P-Video](https://docs.platform.qubrid.com/inferencing/models-quickstart/video/P-Video.md): A premium AI video generation model supporting text-to-video, image-to-video, and audio-conditioned workflows with cinematic-quality outputs.
- [Kimi K2.5](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Kimi K2.5.md): A large-scale MoE model designed for multimodal reasoning, visual analysis, and coordinated agent-style workflows.
- [Qwen3 VL 235B A22B Instruct](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3 VL 235B A22B Instruct.md): A large-scale vision-language model designed for advanced visual reasoning, document analysis, and multimodal interaction.
- [Qwen3 VL 235B A22B Thinking](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3 VL 235B A22B Thinking.md): A reasoning-focused vision-language model optimized for deep visual analysis, scientific problem solving, and multimodal agent workflows.
- [Qwen3 VL 30B A3B Instruct](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3 VL 30B A3B Instruct.md): A mid-scale vision-language model built for long-context image understanding, document text extraction, and agent-style multimodal workflows.
- [Qwen3 VL 8B Instruct](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3 VL 8B Instruct.md): A vision-language instruction-tuned model for multimodal understanding, OCR, and visual reasoning.
- [Qwen3 VL Flash](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3 VL Flash.md): A lightweight vision-language model optimized for real-time image analysis, document reading, and low-latency visual applications.
- [Qwen3 VL Plus](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3 VL Plus.md): A production-ready vision-language model designed for image understanding, document text extraction, and visual question answering.
- [Qwen3.5 122B A10B](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3.5 122B A10B.md): A large-scale MoE model with 122B parameters designed for advanced reasoning, multimodal analysis, and tool-integrated agent workflows.
- [Qwen3.5 27B](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3.5 27B.md): A dense multimodal transformer with 27B parameters, designed for coding, long-context reasoning, and native text–image–video interaction.
- [Qwen3.5 35B A3B](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3.5 35B A3B.md): A highly efficient Mixture-of-Experts model with 35B parameters and 3B active per token, optimized for multimodal reasoning and cost-efficient deployment.
- [Qwen3.5 397B A17B](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3.5 397B A17B.md): Alibaba's flagship native multimodal model with 397B parameters, trained on text, image, and video for frontier-level reasoning and multimodal interaction.
- [Qwen3.5 Flash](https://docs.platform.qubrid.com/inferencing/models-quickstart/vision/Qwen3.5 Flash.md): A low-latency hosted multimodal model designed for high-throughput applications, long-context reasoning, and integrated tool usage.
- [Quickstart](https://docs.platform.qubrid.com/quickstart.md): Follow these steps to get started with the Qubrid Platform.

## OpenAPI Specs

- [openapistt](https://docs.platform.qubrid.com/api-reference/voice/openapistt.json)
- [openapi](https://docs.platform.qubrid.com/api-reference/openapi.json)

## Optional

- [Community](https://discord.com/invite/Btsqxa6ZnQ)
- [Blogs](https://qubrid.com/blog-news)