
Gemini 3.5 Flash Is Now Free: What You Get at Google I/O 2026

Google's fastest frontier model just became the default for every free user — here's what changed, what the limits are, and whether it's worth switching from ChatGPT.

By Free AI News Editorial · 8 min read

Quick Answer: Google launched Gemini 3.5 Flash at I/O on May 19, 2026, and made it the free default model inside the Gemini app and Google Search AI Mode globally. No subscription needed. Developers get 60 free requests per minute in Google AI Studio. The model outperforms the previous Gemini 3.1 Pro on coding and agentic tasks, and Google says it runs about four times faster than comparable frontier models.

Every year, Google uses I/O to reset what "free AI" means. In 2026, that reset was Gemini 3.5 Flash — a model that not only replaces the previous generation's flagship on performance benchmarks, but does so at zero cost to everyday users. If you opened the Gemini app or used Search AI Mode after May 19, you were already running it without knowing. For developers, free access via Google AI Studio at 60 requests per minute represents one of the most generous no-credit-card-required API tiers in the market. This article breaks down everything that changed, what you can actually do with it, and where the catches are.

Gemini 3.5 Flash was announced alongside more than 100 product updates at Google I/O 2026, but the free tier rollout is arguably the most impactful change for the average user. This is the first time a "frontier-class" Gemini model — one that competes directly with ChatGPT and Claude on benchmark leaderboards — has been freely available to anyone without a Gemini Advanced subscription.

Gemini 3.5 Flash became Google's free default frontier model at I/O 2026 — available to all users in the Gemini app and Search AI Mode. Photo: Unsplash

What exactly is Gemini 3.5 Flash and why did Google release it for free?

Gemini 3.5 Flash is the first model in Google's new 3.5 family, which the company describes as combining "frontier intelligence with action" — meaning it was purpose-built for agentic tasks, long-horizon reasoning, and coding, not just one-shot question answering. According to Google's official Gemini 3.5 announcement on blog.google, 3.5 Flash is now the default model for both the Gemini app and AI Mode in Search, globally. That single line is what makes this a free-tier story: the default is free, and the default is now frontier-class.

Why give it away? The same competitive logic that pushed OpenAI to put o3-mini on the free tier earlier this year. Google is in a race for daily active users, and models that only paid subscribers can access don't become the default AI for billions of people. By making 3.5 Flash the free standard, Google ensures that when users ask Google Search a complex question, they get a response from a model that genuinely competes with paid alternatives — reinforcing the habit of staying inside Google's ecosystem rather than switching to ChatGPT or Claude. You can learn more about how these competitive dynamics affect free tier availability on our Free Tier Tracker.

What do you actually get on the Gemini 3.5 Flash free tier?

The free tier is split into two distinct contexts: consumer apps and developer API access. Here's what each one offers:

What's not included on the free tier: access to Gemini Omni Flash (the video-generation multimodal sibling model) is currently limited to paid Gemini subscribers and select YouTube tools, with API access rolling out in the coming weeks. Gemini 3.5 Pro — the more powerful sibling — is still in testing as of late May 2026 and will be a paid-tier model when it launches.

How does Gemini 3.5 Flash stack up against Claude Sonnet 4.6 and ChatGPT on real benchmarks? Photo: Unsplash

How does Gemini 3.5 Flash perform compared to ChatGPT and Claude?

This is the question that matters most for anyone deciding whether to switch or supplement. Based on benchmarks circulating post-I/O, here's where Gemini 3.5 Flash lands:

| Benchmark | Gemini 3.5 Flash | Claude Sonnet 4.6 | GPT-5.5 |
|---|---|---|---|
| MCP Atlas (agentic) | 83.6% | 79.1% | |
| Terminal-Bench 2.1 | 76.2% | | |
| SWE-Bench Verified (coding) | | 79.6% | |
| Speed (tokens/sec) | 289 | | |
| API input cost (per 1M tokens) | $1.50 | $3.00 | Higher |

The picture is nuanced. Gemini 3.5 Flash is the clear winner for agentic and long-horizon tasks — the MCP Atlas score is meaningful because it measures multi-step tool use, which is increasingly how people use AI assistants. It also runs at 289 tokens per second, roughly 4x faster than GPT-5.5 or Opus-tier models, which matters a lot when you're building apps or running batch workflows. For pure software engineering, Claude Sonnet 4.6 still holds the SWE-Bench lead. See our full model comparison guide for a deeper breakdown across more categories.

The pricing gap is equally notable. At $1.50 per million input tokens, Gemini 3.5 Flash costs exactly half what Claude Sonnet 4.6 charges. For high-volume API users — think content pipelines, customer service bots, or data processing — that's not a rounding error; it's a budget-halving decision with essentially no performance tradeoff for general workloads. The Verge's I/O 2026 roundup highlighted this pricing pressure as one of the most consequential shifts of the conference.
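To put the price gap in concrete terms, here is a minimal sketch of the arithmetic. It uses only the two input prices quoted above; the model keys are plain labels rather than official API identifiers, the 500M-token monthly workload is a made-up example, and output-token pricing is left out because it is not covered here.

```python
# Input prices in USD per 1M tokens, as quoted in the comparison above.
# The dictionary keys are illustrative labels, not official API model IDs.
PRICE_PER_MILLION_INPUT = {
    "gemini-3.5-flash": 1.50,
    "claude-sonnet-4.6": 3.00,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Return the input-token cost in USD for one model."""
    return PRICE_PER_MILLION_INPUT[model] * input_tokens / 1_000_000


# Hypothetical workload: 500M input tokens per month.
monthly_tokens = 500_000_000
flash = input_cost_usd("gemini-3.5-flash", monthly_tokens)
sonnet = input_cost_usd("claude-sonnet-4.6", monthly_tokens)

print(f"Flash:  ${flash:,.2f}")
print(f"Sonnet: ${sonnet:,.2f}")
print(f"Saved:  ${sonnet - flash:,.2f}")
```

Swap in your own monthly token count to see what the difference looks like for your pipeline. A real bill would also include output tokens, which this sketch ignores.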

Who should actually switch to Gemini 3.5 Flash as their primary free AI tool?

Not everyone. The answer depends heavily on what you're using AI for day-to-day. Here are the clearest use-case fits and non-fits:

Strong fit — switch or add it to your rotation:

Weaker fit — probably stay where you are:

Browse our full directory of free AI tools to compare Gemini 3.5 Flash alongside the other no-cost options available right now.

What are the real limits and catches you need to know about?

Every free tier has a catch. Here's an honest accounting of Gemini 3.5 Flash's:

For real-time updates on when these limits change — or when a new model tier drops — bookmark our AI Free Tier Tracker, which we update whenever a major provider makes a change.

🔑 Key Takeaways

  • Gemini 3.5 Flash launched on May 19, 2026 at Google I/O and is now the free default model in the Gemini app and Google Search AI Mode globally — no subscription required.
  • The model runs at 289 tokens per second and scores 76.2% on Terminal-Bench 2.1, outperforming Gemini 3.1 Pro on coding and agentic tasks while being four times faster than comparable frontier models.
  • Developers get 60 free requests per minute via Google AI Studio with no credit card needed, making it one of the most accessible flagship-class API free tiers from a major US provider in 2026.
  • Gemini 3.5 Flash leads Claude Sonnet 4.6 on agentic benchmarks (MCP Atlas: 83.6% vs 79.1%) and costs half as much via API ($1.50 vs $3.00 per million input tokens), making it the top cost-efficiency pick for general workloads.
  • Key limits to watch: January 2026 knowledge cutoff, 60 req/min API cap on the free tier, Omni Flash video API not yet fully open, and Gemini 3.5 Pro (a paid upgrade) expected in June 2026.

Frequently Asked Questions

Is Gemini 3.5 Flash really free to use?

Yes. As of May 19, 2026, Gemini 3.5 Flash is the default model in the free tier of both the Gemini app and Google Search AI Mode. No subscription is required. Developers also get 60 free requests per minute through Google AI Studio with no credit card needed to get started.

How fast is Gemini 3.5 Flash compared to other models?

Google reports Gemini 3.5 Flash runs at 289 tokens per second and is approximately four times faster than comparable frontier models. It outperforms Gemini 3.1 Pro on coding and agentic benchmarks while running faster and costing less. On Terminal-Bench 2.1, it scores 76.2%, signaling strong capability for tool-use and long-horizon reasoning tasks.

What is the context window for Gemini 3.5 Flash?

Gemini 3.5 Flash supports a 2 million token context window — one of the largest available from any model in 2026. This means you can process entire codebases, long research documents, or hours of transcribed audio in a single prompt without truncation. This 2M context is available on the free tier, not just paid plans.
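If you want a quick sanity check on whether your material fits, the sketch below estimates token counts from character counts. The four-characters-per-token ratio is a rough rule of thumb for English text and an assumption on our part, not Gemini's actual tokenizer; the function names and the 8,192-token reply reserve are likewise hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 2_000_000  # context window figure quoted above
CHARS_PER_TOKEN = 4                # assumed rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(texts: list[str], reserve_for_output: int = 8_192) -> bool:
    """Check whether the combined texts still leave room for a reply."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_for_output <= CONTEXT_WINDOW_TOKENS


# Two placeholder documents totalling about 1M estimated tokens.
docs = ["x" * 1_000_000, "y" * 3_000_000]
print(fits_in_context(docs))

# Adding a third, larger document pushes the estimate past 2M tokens.
print(fits_in_context(docs + ["z" * 5_000_000]))
```

Treat the result as a ballpark only. Code, non-English text, and transcripts can tokenize quite differently from plain English prose.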

How does Gemini 3.5 Flash compare to Claude Sonnet 4.6?

Gemini 3.5 Flash leads Claude Sonnet 4.6 on MCP Atlas (83.6% vs 79.1%) and costs roughly half as much via API ($1.50 vs $3.00 per million input tokens). Claude Sonnet 4.6 leads on SWE-Bench Verified software engineering tasks (79.6%). For general agentic work and cost efficiency, Gemini 3.5 Flash wins; for deep coding tasks, Claude holds an edge.

Does Gemini 3.5 Flash replace Gemini 3.1 Pro?

In practical terms, yes. Gemini 3.5 Flash outperforms Gemini 3.1 Pro on coding and long-horizon agentic benchmarks while being faster and cheaper. Google has positioned it as the new performance baseline for the 3.5 family. Gemini 3.5 Pro is still in testing and expected as a paid-tier model in June 2026 for even more demanding tasks.

What is the API pricing for Gemini 3.5 Flash?

Via the Google AI API, Gemini 3.5 Flash is priced at $1.50 per million input tokens — roughly half the cost of Claude Sonnet 4.6 at $3.00 per million. The free tier in Google AI Studio gives developers 60 requests per minute at no cost, making it one of the most accessible options for building and prototyping AI applications in 2026.
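If you are prototyping against the free tier, it helps to pace requests on your side so you stay under the 60-per-minute cap. Below is a minimal client-side sketch using a sliding window. The `SlidingWindowLimiter` class and the `send_request` function are hypothetical; no real Gemini API call is made, and how Google actually counts requests on its side may differ.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self) -> float:
        """Block until a call is allowed; return the seconds waited."""
        now = self._clock()
        # Drop timestamps that have already left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        waited = 0.0
        if len(self._calls) >= self.max_calls:
            # Wait until the oldest recorded call falls out of the window.
            waited = self.period - (now - self._calls[0])
            self._sleep(waited)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return waited


limiter = SlidingWindowLimiter(max_calls=60, period=60.0)


def send_request(prompt: str) -> str:
    """Hypothetical stand-in for a real API call."""
    limiter.acquire()
    return f"response to: {prompt}"


print(send_request("hello"))
```

In a real client you would replace the body of `send_request` with your actual API call and still handle rate-limit errors from the server, since a local limiter cannot see requests made by other processes or keys.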
