About the Provider

DeepSeek is a Chinese artificial intelligence company based in Hangzhou, Zhejiang that focuses on research and development of large language models and advanced AI technologies. The firm emphasizes open innovation in AI, publishing models and research under permissive licenses to make powerful language models widely accessible and support collaborative development in the global AI community.

Model Quickstart

This section helps you quickly get started with the deepseek-ai/deepseek-r1-distill-llama-70b model on the Qubrid AI inferencing platform. To use this model, you need:
  • A valid Qubrid API key
  • Access to the Qubrid inference API
  • Basic knowledge of making API requests in your preferred language
Once authenticated with your API key, you can send inference requests to the deepseek-ai/deepseek-r1-distill-llama-70b model and receive responses based on your input prompts. The example below shows how to access the model using the OpenAI-compatible Python client; adapt it to whichever environment best fits your workflow.
from openai import OpenAI

# Initialize the OpenAI client with Qubrid base URL
client = OpenAI(
  base_url="https://platform.qubrid.com/v1",
  api_key="QUBRID_API_KEY",
)

# Create a streaming chat completion
stream = client.chat.completions.create(
  model="deepseek-ai/deepseek-r1-distill-llama-70b",
  messages=[
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms"
    }
  ],
  max_tokens=10000,
  temperature=0.3,
  top_p=1,
  stream=True
)

# Streaming output: iterate over chunks as they arrive (use when stream=True)
for chunk in stream:
  if chunk.choices and chunk.choices[0].delta.content:
      print(chunk.choices[0].delta.content, end="", flush=True)
print("\n")

# Non-streaming output: uncomment when stream=False, in which case the call
# returns a single completion object rather than an iterable stream
# print(stream.choices[0].message.content)
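If you prefer not to use the OpenAI client, the same call can be made over raw HTTP. This is a minimal sketch assuming the standard OpenAI-compatible `/chat/completions` path under the base URL shown above and Bearer-token authentication; confirm both against the Qubrid API reference.

```python
import json
import urllib.request

API_KEY = "QUBRID_API_KEY"  # replace with your Qubrid API key

# Request body mirroring the Python client example above
payload = {
    "model": "deepseek-ai/deepseek-r1-distill-llama-70b",
    "messages": [
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    "max_tokens": 10000,
    "temperature": 0.3,
    "top_p": 1,
    "stream": False,  # non-streaming keeps the response a single JSON object
}

# Endpoint path is assumed from the OpenAI-compatible base URL
request = urllib.request.Request(
    "https://platform.qubrid.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment to send the request:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
#     print(body["choices"][0]["message"]["content"])
```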

Model Overview

DeepSeek R1 Distill Llama 70B is a distilled large language model optimized for efficient, high-level reasoning and conversational intelligence. It is trained by distilling high-quality reasoning outputs from DeepSeek-R1 into a 70B LLaMA-based architecture, delivering near frontier-level analytical performance while running on significantly smaller hardware compared to full-scale models.

Model at a Glance

  • Model ID: deepseek-ai/deepseek-r1-distill-llama-70b
  • Architecture: LLaMA-3.1-70B (distilled)
  • Model Size: 70B parameters
  • Training Data: distilled from high-quality DeepSeek-R1 reasoning outputs onto a LLaMA 70B base
  • Context Length: 64K tokens

When to use?

Use DeepSeek R1 Distill Llama 70B if you need:
  • Strong reasoning and chain-of-thought capabilities for complex tasks
  • Long-context support up to 64K tokens
  • Efficient deployment compared to full, non-distilled frontier models
  • Open-source licensing suitable for on-premise or custom deployments
  • Reliable performance across math, logic, coding, and research workflows
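When relying on the 64K-token context window, it helps to check whether a large input is likely to fit before sending it. The sketch below uses a rough 4-characters-per-token heuristic for English text; it is an estimate only, and the `reserved_for_output` default is a hypothetical budget, not a platform requirement.

```python
# Rough feasibility check for the 64K-token context window.
CONTEXT_LENGTH = 64_000
CHARS_PER_TOKEN = 4  # common heuristic for English text, not an exact count


def fits_in_context(text: str, reserved_for_output: int = 10_000) -> bool:
    """Estimate whether `text` plus an output budget fits the context window."""
    estimated_tokens = len(text) // CHARS_PER_TOKEN
    return estimated_tokens + reserved_for_output <= CONTEXT_LENGTH


print(fits_in_context("Summarize this paragraph."))   # small input fits
print(fits_in_context("word " * 50_000))              # ~62,500 tokens: too large
```

For precise counts, tokenize the input with the model's own tokenizer instead of the character heuristic.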

Inference Parameters

  • Streaming (boolean, default true): Enables streaming responses for real-time output.
  • Temperature (number, default 0.3): Controls creativity and randomness; higher values produce more diverse output.
  • Max Tokens (number, default 10000): Defines the maximum number of tokens the model is allowed to generate.
  • Top P (number, default 1): Nucleus sampling that limits token selection to a subset of top probability mass.
  • Reasoning Effort (select, default medium): Adjusts the depth of reasoning and problem-solving effort; higher values increase response quality at the cost of latency.
  • Reasoning Summary (select, default auto): Controls verbosity of reasoning explanations: auto, concise, or detailed.
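As a sketch of how these parameters map onto a request, the dictionary below mirrors the table's defaults. The `reasoning_effort` and `reasoning_summary` field names are assumptions derived from the table (they are not standard OpenAI client arguments, so they are passed via `extra_body` here); check the Qubrid API reference for the exact keys.

```python
# Hypothetical mapping of the inference parameters above onto a request body
request_kwargs = {
    "model": "deepseek-ai/deepseek-r1-distill-llama-70b",
    "messages": [
        {"role": "user", "content": "Walk through a proof that sqrt(2) is irrational."}
    ],
    "stream": True,       # Streaming, default true
    "temperature": 0.3,   # Temperature, default 0.3
    "max_tokens": 10000,  # Max Tokens, default 10000
    "top_p": 1,           # Top P, default 1
    # Assumed field names for the reasoning controls; not standard OpenAI
    # parameters, so they go in extra_body
    "extra_body": {
        "reasoning_effort": "medium",  # Reasoning Effort, default medium
        "reasoning_summary": "auto",   # Reasoning Summary, default auto
    },
}

# With a configured client, the request would be sent as:
# stream = client.chat.completions.create(**request_kwargs)
```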

Key Features

  1. High-Quality Reasoning: Optimized for strong reasoning and chain-of-thought capabilities, suitable for complex tasks.
  2. Long-Context Support: Can handle up to 64K tokens, enabling processing of very large inputs.
  3. Efficient Deployment: Distilled model runs efficiently compared to full 70B models, reducing hardware requirements.
  4. Configurable Inference: Supports adjustable parameters like temperature, streaming, reasoning effort, and verbosity for flexible and precise outputs.

Summary

DeepSeek R1 Distill Llama 70B brings high-quality reasoning capabilities into a more accessible 70B-parameter distilled model. It supports long-context reasoning, configurable inference settings, and open deployment under an MIT license. The model is well suited for advanced reasoning, technical assistance, and research use cases where efficiency and accessibility are priorities.