Analyze and understand images using multimodal vision-language models.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
The multimodal vision-language model to use.
"Qwen/Qwen3-VL-Plus"
Chat-style input containing text instructions and image content.
Maximum number of tokens to generate in the response.
Sampling temperature controlling response variability.
Nucleus sampling parameter.
Whether to stream output incrementally.
Vision analysis completed successfully