Get started with the Qubrid AI API in under a minute.
Qubrid AI makes it easy to integrate high-performance open-source models, so you can run inference with just a few lines of code.
1. Register for an account
Begin by creating an account to obtain your unique API key.
Once your account is active, configure your environment by exporting your key as an environment variable named QUBRID_API_KEY:
export QUBRID_API_KEY=xxxxx
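Before sending a request, it can help to confirm that the key is actually visible to your Python process. The following is a minimal sketch, not part of the Qubrid SDK: `load_api_key` and `auth_headers` are hypothetical helper names introduced here for illustration, and the only assumption about the API is the Bearer-token header shown in the next step.

```python
import os
from typing import Dict, Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = env.get("QUBRID_API_KEY", "").strip()
    if not key:
        raise RuntimeError("QUBRID_API_KEY is not set; export it before running.")
    return key


def auth_headers(api_key: str) -> Dict[str, str]:
    """Build the request headers used by the chat completions call."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"QUBRID_API_KEY": "xxxxx"}
print(auth_headers(load_api_key(demo_env))["Authorization"])
```

In a real script you would call `load_api_key()` with no arguments so that it reads `os.environ`, and fail early with a readable message if the export step was skipped.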
2. Run your first model inference
Select the model you wish to run. For this demonstration, we will use OpenAI GPT OSS 120B with streaming enabled to show real-time token generation.
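import json
import os
from pprint import pprint

import requests

url = "https://platform.qubrid.com/api/v1/qubridai/chat/completions"
headers = {
    # Read the key exported in step 1 instead of hard-coding it.
    "Authorization": f"Bearer {os.environ['QUBRID_API_KEY']}",
    "Content-Type": "application/json",
}
data = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "user", "content": "Explain quantum computing to a 5 year old."}
    ],
    "temperature": 0.7,
    "max_tokens": 4096,
    "stream": True,  # ask the API to send tokens as they are generated
    "top_p": 0.8,
}
response = requests.post(
    url,
    headers=headers,
    json=data,
    stream=True,  # let requests hand over lines as they arrive
)
content_type = response.headers.get("Content-Type", "")
if "application/json" in content_type:
    # Non-streaming reply: the whole completion arrives as one JSON body.
    pprint(response.json())
else:
    # Streaming reply: server-sent events, one "data: ..." line per chunk.
    response.encoding = "utf-8"  # event streams are UTF-8 encoded
    for line in response.iter_lines(decode_unicode=True):
        if not line:
            continue
        if line.startswith("data:"):
            # Strip only the leading "data:" prefix, not later occurrences.
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                break
            try:
                chunk = json.loads(payload)
                pprint(chunk)
            except json.JSONDecodeError:
                print("Raw chunk:", payload)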
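The loop above prints each raw chunk. If you only want the generated text, the chunks can be reduced to their text fragments. The sketch below assumes the stream follows the OpenAI-style chunk layout, `{"choices": [{"delta": {"content": "..."}}]}`; that layout is an assumption and is not confirmed by this page, so check a real chunk from your own request before relying on it. `iter_text_deltas` is a hypothetical helper name, and the sample lines are hand-written for illustration, not real API output.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from server-sent-event lines.

    Assumes OpenAI-style chunks: {"choices": [{"delta": {"content": "..."}}]}.
    """
    for line in lines:
        if not line or not line.startswith("data:"):
            continue  # skip keep-alive blanks and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore malformed chunks
        if not isinstance(chunk, dict):
            continue
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Offline demonstration with hand-written sample lines.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_text_deltas(sample_lines)))  # prints: Hello, world
```

With a live streaming response, the same function could be fed `response.iter_lines(decode_unicode=True)` and each fragment printed with `print(text, end="", flush=True)` to show tokens as they arrive.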
Congratulations! You have successfully run your first inference request to the Qubrid AI cloud.
Next steps