
🧠 Gemini Model Guide

Orchable supports a range of Gemini models, each suited to a different stage of the orchestration pipeline.

🚀 Model Comparison

Model	Primary Use Case	Context Window	Key Features
Gemini 3 Pro	Highest reasoning, complex coding	1M+ tokens	State-of-the-art reasoning and vibe-coding
Gemini 3 Flash	Balanced speed & intelligence	1M+ tokens	Pro-grade reasoning at Flash speed
Gemini 2.5 Pro	Deep STEM, math, long context	1M+ tokens	Optimized for analytical depth
Gemini 2.5 Flash	Production workhorse	1M+ tokens	Best price-performance; ultra-reliable
Gemini 2.5 Flash Lite	High-volume micro-tasks	1M+ tokens	Fastest, most affordable
Gemini 2.0 Flash	Legacy support / stable generation	1M tokens	Mature, stable performance

🛠️ Capability Matrix

Capability	Gemini 3 Series	Gemini 2.5 Series	Gemini 2.0 Series
Structured Output (JSON)	✅	✅	✅
Thinking / CoT	✅ (Thinking levels)	✅ (Thinking budget)	✅ (Experimental)
Context Caching	✅	✅	✅
Code Execution	✅	✅	✅
Multimodal (Audio/Video)	✅	✅	✅
Live API (Streaming)	❌	✅	✅
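
As a rough illustration, the matrix above can be encoded as a lookup table so that a pipeline can check which series supports a required feature before dispatching work. This is a minimal sketch: the names `CAPABILITY_MATRIX` and `series_supporting`, and the capability keys, are hypothetical and not part of Orchable or the Gemini API.

```python
from typing import Dict, List

# Hypothetical encoding of the capability matrix above.
# Keys are illustrative labels, not official API identifiers.
CAPABILITY_MATRIX: Dict[str, Dict[str, bool]] = {
    "Gemini 3": {
        "structured_output": True,
        "thinking": True,
        "context_caching": True,
        "code_execution": True,
        "multimodal": True,
        "live_api": False,
    },
    "Gemini 2.5": {
        "structured_output": True,
        "thinking": True,
        "context_caching": True,
        "code_execution": True,
        "multimodal": True,
        "live_api": True,
    },
    "Gemini 2.0": {
        "structured_output": True,
        "thinking": True,
        "context_caching": True,
        "code_execution": True,
        "multimodal": True,
        "live_api": True,
    },
}


def series_supporting(*required: str) -> List[str]:
    """Return the model series that support every requested capability."""
    known = {cap for caps in CAPABILITY_MATRIX.values() for cap in caps}
    unknown = set(required) - known
    if unknown:
        raise KeyError(f"Unknown capability: {sorted(unknown)}")
    return [
        series
        for series, caps in CAPABILITY_MATRIX.items()
        if all(caps[cap] for cap in required)
    ]


if __name__ == "__main__":
    # Which series can be used when the Live API (streaming) is required?
    print(series_supporting("live_api"))
```

Per the matrix, a stage that needs the Live API would be limited to the 2.5 and 2.0 series, while every other listed capability is available across all three.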

📍 Choosing a Model

Planning Stage: Use Gemini 3 Flash or Gemini 3 Pro for high-quality orchestrations.
Processing Stage: Use Gemini 2.5 Flash for reliable, consistent output.
Filtering Stage: Use Gemini 2.5 Flash Lite to save cost and time.
RAG Stage: Use Gemini Embedding 1 for all vector retrieval tasks.
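
The recommendations above could be captured in a small stage-to-model table. The sketch below is an assumption-laden illustration: `STAGE_MODELS`, `recommended_model`, the stage keys, and the `prefer_quality` flag are hypothetical names, and the strings are the display names from this guide rather than official API model identifiers.

```python
from typing import Dict, List

# Hypothetical mapping from pipeline stage to the models recommended above.
# Values are display names from this guide, not API model IDs.
# Where two models are listed, the higher-reasoning option comes last.
STAGE_MODELS: Dict[str, List[str]] = {
    "planning": ["Gemini 3 Flash", "Gemini 3 Pro"],
    "processing": ["Gemini 2.5 Flash"],
    "filtering": ["Gemini 2.5 Flash Lite"],
    "rag": ["Gemini Embedding 1"],
}


def recommended_model(stage: str, prefer_quality: bool = False) -> str:
    """Return the recommended model name for a pipeline stage.

    With prefer_quality=True, the last (highest-reasoning) candidate is
    chosen; otherwise the first candidate is returned.
    """
    key = stage.strip().lower()
    if key not in STAGE_MODELS:
        raise ValueError(
            f"Unknown stage {stage!r}; expected one of {sorted(STAGE_MODELS)}"
        )
    candidates = STAGE_MODELS[key]
    return candidates[-1] if prefer_quality else candidates[0]


if __name__ == "__main__":
    for stage in STAGE_MODELS:
        print(f"{stage}: {recommended_model(stage)}")
```

Keeping the mapping in one place means a model swap, for example after a new release, touches a single table rather than every stage definition.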

Last Updated: 2026-02-24
