RAG

A custom multimodal RAG pipeline combines advanced search with the best models to turn your documents, images, and audio into accurate, cited answers. Just upload your files and we handle the rest.

RAG is an AI research assistant that combines language models with real-time document retrieval to give context-aware, grounded answers. Imagine chatting with an expert who not only understands your question but also pulls the relevant information from your documents: that's RAG. It is privacy-first, fast, cleanly formatted, and model-flexible, which makes it well suited to exploring and understanding your content interactively.

RAG is still in beta. If you encounter issues, let us know. It is currently free for users to try.

How to use RAG?

Head over to the RAG tab from the left menu

This opens the two-step RAG UI, ready to use.

Select the Files

Click to select files, or drag and drop them into the window, to get started.

Supported file types: .pdf, .txt, .doc, .docx, .md, .png, .jpg, .jpeg, .mp3, .wav. The total size of all uploaded files can be up to 16 MB.
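
The limits above can be checked locally before uploading. The following is a minimal Python sketch, not part of the product: the helper name `check_upload` is hypothetical, and it assumes that 16 MB means 16 × 1024 × 1024 bytes and that extensions are matched case-insensitively.

```python
from pathlib import Path

# Extensions listed in this guide as supported.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".txt", ".doc", ".docx", ".md",
    ".png", ".jpg", ".jpeg", ".mp3", ".wav",
}

# Assumption: "16 MB" is interpreted as 16 * 1024 * 1024 bytes.
MAX_TOTAL_BYTES = 16 * 1024 * 1024


def check_upload(files):
    """Return a list of problems for (name, size_in_bytes) pairs; empty means OK."""
    problems = []
    for name, _size in files:
        ext = Path(name).suffix.lower()
        if ext not in SUPPORTED_EXTENSIONS:
            problems.append(f"{name}: unsupported file type '{ext}'")
    # The limit applies to the combined size of the whole batch.
    total = sum(size for _name, size in files)
    if total > MAX_TOTAL_BYTES:
        problems.append(
            f"total size {total} bytes exceeds the {MAX_TOTAL_BYTES}-byte limit"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical batch: two supported files and one unsupported video file.
    batch = [("report.pdf", 4_000_000), ("notes.md", 12_000), ("clip.mov", 1_000_000)]
    for problem in check_upload(batch):
        print(problem)
```

An empty list from `check_upload` means the batch fits the documented limits. The upload itself may still apply checks that this sketch does not model.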

Click the Upload button

Head over to the Chat UI

Now you can ask any questions you have. Answers are drawn only from the reference documents you uploaded.

To delete files, press the ❌ button beside the documents. To clear the chat and start a new conversation, use the Clear Chat button.

Answers from the RAG Agent also include citations and mention the source PDF, so you can cross-check them if needed.
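
To illustrate what grounded answers with citations mean in general, here is a toy Python sketch of the retrieval step. It is not how the RAG Agent is implemented: it ranks passages by simple word overlap instead of a real search index, the names `Passage`, `tokenize`, and `retrieve` are hypothetical, and the sample passages are made up for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # file the passage came from, used as the citation
    text: str


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, top_k=2):
    """Rank passages by how many words they share with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(p.text)), p) for p in passages]
    # Keep only passages that share at least one word, best match first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [p for _score, p in ranked[:top_k]]


if __name__ == "__main__":
    # Made-up passages standing in for uploaded documents.
    passages = [
        Passage("handbook.pdf", "Employees accrue 20 days of paid leave per year."),
        Passage("faq.txt", "The office is closed on public holidays."),
        Passage("menu.md", "Lunch includes soup and salad."),
    ]
    for p in retrieve("How many days of paid leave do employees get?", passages):
        print(f"{p.text} [source: {p.source}]")
```

Each retrieved passage carries its source file name, which is what lets an answer point back to the document it came from. A passage that shares no words with the question is never returned, which mirrors the idea of answering from the uploaded documents only.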

Why Use Multimodal RAG?

Accuracy Across Formats: Reduces hallucinations by grounding models in trusted enterprise data, not just text.
Custom Context: Aligns responses with your organization’s knowledge base, images, videos, or audio transcripts.
Freshness: Works with frequently updated multimodal datasets (e.g., meeting recordings, surveillance images, reports).
Multimodal Use Cases: Go beyond text-only AI; integrate visual, audio, and structured data reasoning.

Supported Modalities

Text: Documents, FAQs, manuals, knowledge bases.
Images: Diagrams, charts, product images.
PDFs: Reports, invoices, academic papers.
Audio: Meeting recordings, call center conversations.
Video (coming soon): Training videos, tutorials, surveillance streams.
Structured Data (coming soon): CSVs, relational tables, analytics exports.
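
As a rough illustration of how the currently supported modalities relate to file types, the Python sketch below groups file names by modality. The extension-to-modality mapping is an assumption inferred from the lists in this guide, and the helper name `group_by_modality` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Assumed mapping from the file extensions in this guide to the modalities above.
MODALITY_BY_EXTENSION = {
    ".txt": "Text", ".md": "Text", ".doc": "Text", ".docx": "Text",
    ".pdf": "PDFs",
    ".png": "Images", ".jpg": "Images", ".jpeg": "Images",
    ".mp3": "Audio", ".wav": "Audio",
}


def group_by_modality(filenames):
    """Group file names by modality; unknown extensions go under 'Unsupported'."""
    groups = defaultdict(list)
    for name in filenames:
        ext = Path(name).suffix.lower()
        groups[MODALITY_BY_EXTENSION.get(ext, "Unsupported")].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file names for demonstration.
    print(group_by_modality(["spec.pdf", "chart.PNG", "call.wav", "export.csv"]))
```

Video and structured-data files fall under "Unsupported" in this sketch because those modalities are listed as coming soon.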

Example Scenarios

Customer Support: Retrieve text FAQs + product images to answer queries.
Compliance & Legal: Retrieve legal PDFs + annotated charts for audits.
Healthcare: Ground AI in radiology images + doctor’s notes.
Research: Combine academic papers (PDFs) with charts/images from experiments.
Enterprise Training: Use video transcripts + slides to answer employee questions.
