
About the Provider

Tencent is a major Chinese technology company and cloud services provider that develops AI models and research technologies through its Hunyuan AI initiative. The company focuses on creating advanced open-source and commercial AI systems—including vision-language, OCR, and foundation models—to support developers, enterprises, and real-world applications across industries.

Model Quickstart

This section helps you quickly get started with the tencent/HunyuanOCR model on the Qubrid AI inference platform. To use this model, you need:
  • A valid Qubrid API key
  • Access to the Qubrid inference API
  • Basic knowledge of making API requests in your preferred language
Once authenticated with your API key, you can send inference requests to the tencent/HunyuanOCR model and receive responses based on your input prompts. The example below shows how to call the model from Python; adapt it to whichever environment best fits your workflow.
import requests
import json
from pprint import pprint

url = "https://platform.qubrid.com/api/v1/qubridai/chat/completions"

headers = {
    # Replace Qubrid_API_KEY with your actual Qubrid API key.
    "Authorization": "Bearer Qubrid_API_KEY",
    "Content-Type": "application/json",
}

data = {
    "model": "tencent/HunyuanOCR",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    # Optional text instruction; the ocr_mode parameter
                    # below selects the OCR task type.
                    "type": "text",
                    "text": "Extract all text from this image.",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                    },
                },
            ],
        }
    ],
    "max_tokens": 4096,
    "temperature": 0,        # keep at 0 for deterministic text extraction
    "language": "auto",      # or a specific language hint, e.g. "en"
    "ocr_mode": "general",
    "stream": False,         # set to True to receive server-sent events
}

# stream=True here only defers downloading the HTTP body, so the same
# code can handle both a single JSON response and a streamed one.
response = requests.post(url, headers=headers, json=data, stream=True)

content_type = response.headers.get("Content-Type", "")
if "application/json" in content_type:
    # Non-streaming: a single JSON response body.
    pprint(response.json())
else:
    # Streaming: parse server-sent events line by line.
    for line in response.iter_lines(decode_unicode=True):
        if not line:
            continue
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                break
            try:
                chunk = json.loads(payload)
                pprint(chunk)
            except json.JSONDecodeError:
                print("Raw chunk:", payload)
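If you only need the recognized text rather than the full response object, you can pull it out of the completion payload. The helper below is a sketch that assumes an OpenAI-style `choices[0].message.content` schema, which is suggested by the endpoint's `chat/completions` path but not confirmed here; the exact response shape on Qubrid may differ.

```python
def extract_text(completion: dict) -> str:
    """Return the model's text output from an OpenAI-style
    chat-completion response (assumed schema: choices[0].message.content)."""
    choices = completion.get("choices", [])
    if not choices:
        return ""
    content = choices[0].get("message", {}).get("content", "")
    # Some APIs return content as a list of typed parts rather than a string.
    if isinstance(content, list):
        content = "".join(
            part.get("text", "") for part in content if part.get("type") == "text"
        )
    return content

# Example with a minimal mocked response:
sample = {"choices": [{"message": {"content": "STATUE OF LIBERTY"}}]}
print(extract_text(sample))  # STATUE OF LIBERTY
```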

Model Overview

Hunyuan OCR (1B) is an end-to-end OCR-focused vision-language model built on Hunyuan’s native multimodal architecture.
  • It is designed to perform text extraction and document understanding tasks using a single instruction and a single inference step.
  • With a lightweight 1B-parameter size, the model supports multilingual document parsing and multiple OCR-related tasks while remaining efficient to deploy on inference platforms.
  • The model is focused purely on OCR workflows and is not intended for general visual question answering.

Model at a Glance

Feature         Details
Model ID        tencent/HunyuanOCR
Provider        Tencent
Parameters      1B
Context Length  16k tokens
Model Type      OCR-focused Vision-Language Model

When to use?

You should consider using Hunyuan OCR (1B) if:
  • You need an OCR-specific model rather than a general-purpose vision model
  • Your application involves document parsing, text spotting, or subtitle extraction
  • You work with multilingual or mixed-language content
  • You prefer an end-to-end OCR model instead of cascading OCR systems
  • You require a lightweight model optimized for efficient inference
Do not use this model for general visual question answering tasks.

Key Features

  • Efficient Lightweight Architecture: Built on Hunyuan’s native multimodal architecture, achieving strong OCR performance with only 1B parameters and reduced deployment cost.
  • Comprehensive OCR Coverage: Supports text detection, text recognition, complex document parsing, open-field information extraction, video subtitle extraction, photo translation, and document QA within a single model.
  • End-to-End Inference Workflow: Designed around a single-instruction, single-inference approach, avoiding multi-stage OCR pipelines and cascade errors.
  • Multilingual Document Support: Provides robust support for over 100 languages, including mixed-language documents and varied document types.
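As a sketch of how these tasks map onto requests, the snippet below builds message payloads for two of the listed tasks (document QA and photo translation) using the content format from the quickstart example. The prompt wordings and the image URL are illustrative assumptions, not documented task triggers.

```python
def build_message(prompt: str, image_url: str) -> list:
    """Pair a single text instruction with an image in the
    chat/completions content format used in the quickstart."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]

IMAGE = "https://example.com/invoice.png"  # hypothetical document image

# Document QA: ask a question about the document's content.
qa_messages = build_message("What is the invoice total?", IMAGE)

# Photo translation: extract and translate the visible text.
translate_messages = build_message("Translate the text in this image to English.", IMAGE)
```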

Inference Parameters

Parameter          Type     Default  Description
Streaming          boolean  true     Enable streaming responses for real-time output.
Language           select   en       Optional language hint to improve OCR accuracy for specific languages.
OCR Mode           select   general  Select an optimized OCR mode based on the image type.
Max Output Tokens  number   4096     Maximum number of tokens for the generated text.
Temperature        number   0        Controls randomness; keep at 0 for accurate text extraction.
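These parameters correspond to fields in the request body shown in the quickstart. The helper below is a minimal sketch of assembling them, using the field names from that example and the defaults from the table above; the set of valid `ocr_mode` values is not listed here, so check the platform for the full list.

```python
def make_request_body(
    stream: bool = True,
    language: str = "en",
    ocr_mode: str = "general",
    max_tokens: int = 4096,
    temperature: float = 0,
) -> dict:
    """Assemble a chat/completions request body with the documented
    inference parameters, defaulted per the parameter table."""
    return {
        "model": "tencent/HunyuanOCR",
        "stream": stream,
        "language": language,
        "ocr_mode": ocr_mode,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

# Non-streaming request keeping the general OCR mode:
body = make_request_body(stream=False)
```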

Performance Characteristics

Strengths

  • Lightweight 1B parameter model with strong OCR accuracy
  • Native handling of high-resolution images and extreme aspect ratios
  • Unified end-to-end architecture without bounding-box error propagation
  • Effective recognition of rotated and vertical text
  • Strong multilingual and mixed-script support

Considerations

  • Designed specifically for OCR, not general visual reasoning
  • May hallucinate on extremely blurred or low-resolution text
  • Throughput depends on visual token density

Summary

Hunyuan OCR (1B) is a lightweight, OCR-focused vision-language model developed by Tencent Hunyuan.
  • It performs end-to-end OCR tasks using a single instruction and single inference step.
  • The model supports multilingual and mixed-language document parsing across images and videos.
  • It is optimized for efficient deployment with its 1B parameter size and fp16 precision.
  • The model is best suited for OCR pipelines rather than general-purpose vision tasks.