Gemini 2.0 Flash Is Gone: What Free API Users Need to Do Now

If you built something on the Gemini API in the past 18 months, there is a reasonable chance it stopped working yesterday. Google executed a long-announced but frequently ignored deprecation: four Gemini 2.0 Flash model IDs officially reached their end-of-life date on June 1, 2026. The shutdown affects apps built on Google AI Studio's free tier, Firebase AI Logic integrations, and any production system still hardcoding the 2.0 Flash string. This guide covers exactly which models were cut, what the free-tier rate limit changes look like, and how to migrate -- including a heads-up on the next cutover that is already on Google's calendar.

[Image: developer looking at code on a laptop screen. Caption: Apps still calling Gemini 2.0 Flash endpoints after June 1, 2026 receive API errors. Migrating to 2.5 Flash takes minutes -- but another deadline is already set for October. Photo: Unsplash]

What exactly did Google shut down on June 1, 2026?

Google retired four specific model identifiers simultaneously. According to the AI Weekly shutdown alert published June 1, the discontinued IDs are:

gemini-2.0-flash -- the auto-aliased shorthand
gemini-2.0-flash-001 -- the pinned version released February 5, 2025
gemini-2.0-flash-lite -- the lightweight variant shorthand
gemini-2.0-flash-lite-001 -- the pinned Lite version released February 25, 2025

Google's official Cloud documentation now states plainly: "As of June 1, 2026, gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 are discontinued and are no longer available." Firebase AI Logic carried the same warning for several weeks before the shutdown, and it also noted that all Imagen models face a separate shutdown on June 24, 2026.

One nuance worth noting: Google's deprecation page technically stated that June 1 was "the earliest possible date on which a model might be retired," not a guaranteed hard cutover. In practice, the shutdown appears to have been enforced on schedule, and developers reporting broken integrations confirm the errors began June 1. Treat it as hard.

Which migration target should free tier users pick?

The official replacement mapping is straightforward, though some third-party coverage incorrectly pointed developers at Gemini 3.5 Flash. Google's migration guidance as of June 2026:

Deprecated Model ID	Correct Migration Target
`gemini-2.0-flash`	`gemini-2.5-flash`
`gemini-2.0-flash-001`	`gemini-2.5-flash`
`gemini-2.0-flash-lite`	`gemini-2.5-flash-lite`
`gemini-2.0-flash-lite-001`	`gemini-2.5-flash-lite`
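If you cannot update every caller at once, the mapping in the table above can be expressed as a small lookup that rewrites retired IDs before a request goes out. This is a minimal sketch, not part of any Google SDK: `MIGRATION_TARGETS` and `resolve_model_id` are hypothetical names, and the mapping simply restates the table.

```python
# Retired ID -> replacement, copied from the migration table above.
MIGRATION_TARGETS = {
    "gemini-2.0-flash": "gemini-2.5-flash",
    "gemini-2.0-flash-001": "gemini-2.5-flash",
    "gemini-2.0-flash-lite": "gemini-2.5-flash-lite",
    "gemini-2.0-flash-lite-001": "gemini-2.5-flash-lite",
}


def resolve_model_id(requested: str) -> str:
    """Return the replacement for a retired ID, or the ID unchanged."""
    return MIGRATION_TARGETS.get(requested, requested)
```

A shim like this is a stopgap: it keeps old callers working while you replace the strings at the source, and it leaves any ID that is not in the table untouched.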

Do not jump to gemini-3.5-flash yet unless you have specifically benchmarked it for your use case. The 3.5 generation is newer and still evolving. The 2.5 Flash family is Google's current stable recommendation for production Flash workloads and the intended landing zone for 2.0 migrations.

For developers on Google AI Studio's free tier, the migration is a one-line change in most codebases: replace the deprecated model string, test your prompts, and redeploy. Both 2.5 Flash and 2.5 Flash-Lite remain available on the free tier with no credit card required.

Not sure if your app is affected? Search your entire codebase -- including environment variable files, infrastructure configurations, and CI scripts -- for any of the four deprecated strings listed above. They can appear in places that aren't obvious, especially in tools or libraries that cache model names.
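The search described above can be scripted. The sketch below walks a directory tree and reports every line that contains one of the four retired IDs. The function names are hypothetical, and the regular expression is one possible way to match the four strings without also flagging longer, different IDs that merely start with `gemini-2.0-flash`.

```python
import re
from pathlib import Path

# Matches exactly the four retired IDs. The negative lookahead rejects
# longer IDs that only share the prefix (anything continuing with a
# word character or a hyphen).
DEPRECATED_ID = re.compile(r"gemini-2\.0-flash(?:-lite)?(?:-001)?(?![\w-])")


def find_deprecated_ids(text):
    """Return (line_number, model_id) pairs for each retired ID in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in DEPRECATED_ID.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


def scan_tree(root="."):
    """Yield (path, line_number, model_id) for readable text files under root."""
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for line_number, model_id in find_deprecated_ids(text):
            yield path, line_number, model_id
```

Looping over `scan_tree(".")` from the repository root and printing each result gives you a checklist of files to fix. It reads dotfiles such as `.env` too, but it cannot see model names stored in a secrets manager or a remote config service, so check those by hand.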

How do the free tier rate limits compare between 2.0 and 2.5 Flash?

This is where free-tier developers need to pay close attention. The migration is not rate-limit-neutral. According to the PEC Collective Gemini free tier guide, the limits changed significantly between generations:

Model	RPM (free)	TPM (free)	RPD (free)
Gemini 2.0 Flash (retired)	15	1,000,000	1,500
Gemini 2.5 Flash (current)	10	250,000	1,500
Gemini 2.5 Flash-Lite (current)	15	250,000	1,000

The big number: Gemini 2.0 Flash offered 1 million tokens per minute on the free tier. Gemini 2.5 Flash drops that to 250,000 tokens per minute -- a 75% reduction. Its per-minute request cap also falls, from 15 RPM to 10 RPM. For most lightweight projects neither change will matter. For apps doing long-document processing or bulk summarization on the free tier, the TPM cut is a meaningful constraint. The daily request limit (1,500 RPD) stays the same for 2.5 Flash, but 2.5 Flash-Lite is capped at 1,000 RPD.
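To see whether your own traffic fits under the new ceilings, you can plug the figures from the table above into a quick check. This is a rough sketch under stated assumptions: the limits are the ones quoted in this article, the helper names are hypothetical, and TPM usage is approximated as requests per minute times an average token count per request, which may not match how Google meters tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeTierLimits:
    rpm: int  # requests per minute
    tpm: int  # tokens per minute
    rpd: int  # requests per day


# Free-tier figures as quoted in the table above.
LIMITS = {
    "gemini-2.0-flash": FreeTierLimits(15, 1_000_000, 1_500),
    "gemini-2.5-flash": FreeTierLimits(10, 250_000, 1_500),
    "gemini-2.5-flash-lite": FreeTierLimits(15, 250_000, 1_000),
}


def exceeded_limits(model, requests_per_minute, tokens_per_request, requests_per_day):
    """Return the names of the limits an estimated workload would exceed."""
    limits = LIMITS[model]
    exceeded = []
    if requests_per_minute > limits.rpm:
        exceeded.append("RPM")
    # Assumption: every request uses roughly tokens_per_request tokens.
    if requests_per_minute * tokens_per_request > limits.tpm:
        exceeded.append("TPM")
    if requests_per_day > limits.rpd:
        exceeded.append("RPD")
    return exceeded


def tpm_reduction_percent(old_model, new_model):
    """Percentage drop in free-tier TPM when moving between two models."""
    old, new = LIMITS[old_model].tpm, LIMITS[new_model].tpm
    return 100 * (old - new) / old
```

For example, a workload of 8 requests per minute at about 60,000 tokens each (480,000 tokens per minute) fit under the 2.0 Flash ceiling, but `exceeded_limits("gemini-2.5-flash", 8, 60_000, 400)` flags TPM.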

On the performance side, the migration is a genuine upgrade. Artificial Analysis benchmarks show Gemini 2.5 Flash generating output at 224.6 tokens per second through Google's API, well above average for non-reasoning models in its price tier. Gemini 2.5 Flash-Lite is even faster at 392.8 tokens per second with just 0.29 seconds to first token. You are trading some free-tier TPM headroom for a meaningfully better model.

Also worth knowing: the data-sharing clause has not changed. Free-tier usage on Google AI Studio may be used by Google to improve its models. If you are handling sensitive user data, the free tier is not appropriate regardless of which Flash model you use -- consider Vertex AI under Google Cloud's $300 new-account credit, or a paid API tier, instead. You can compare your options on our AI free tier tracker.

[Image: abstract representation of data flowing through a network. Caption: The free-tier TPM drops from 1 million to 250,000 in the 2.0-to-2.5 migration -- check whether your app is TPM-constrained before migrating. Photo: Unsplash]

Why is Google moving so fast on model retirement?

Google's accelerating deprecation cadence is the most important long-term signal in the shutdown story. The Gemini 2.0 Flash models launched in February 2025 and were retired in June 2026 -- a product lifespan of roughly 16 months. The 2.5 Flash models already have a shutdown date stamped on the calendar: October 16, 2026, approximately 4.5 months from now. Their planned replacement is gemini-3.5-flash.

This means developers who migrate to 2.5 Flash today will need to migrate again before the end of 2026. That is not a criticism of Google's model quality -- the newer models are genuinely better -- but it is an architectural reality that teams building on the Gemini API need to plan for.

Best practice going forward: Abstract your model selection behind a configuration layer. Store the model ID in an environment variable or config file, not hardcoded in your API calls. When Google announces a deprecation, you can update one config value across all your environments rather than hunting through your codebase.
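A configuration layer like the one described above can be very small. In this sketch the environment variable name `GEMINI_MODEL` and the helper `get_model_id` are assumptions chosen for illustration, not anything Google defines. The default reflects the migration target recommended in this article.

```python
import os

# Fallback used when no override is configured.
DEFAULT_MODEL = "gemini-2.5-flash"


def get_model_id(env=None):
    """Read the model ID from the environment, falling back to the default.

    `env` is any mapping; it defaults to os.environ and exists mainly so
    the lookup can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    value = env.get("GEMINI_MODEL", "").strip()
    return value or DEFAULT_MODEL
```

Pass `get_model_id()` wherever your client code currently takes a literal model string. At the next cutover, changing `GEMINI_MODEL` in each environment is then enough, and a blank or missing value falls back to the default.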

The pattern here is consistent with what other major providers are doing. AI models are being iterated faster than traditional software dependencies, and the industry has not yet converged on a standard deprecation convention. Google's "earliest possible date" language adds ambiguity, but the practical answer is to treat announced shutdown dates as hard deadlines. For developers tracking these changes, our free AI news section covers deprecation and pricing announcements as they happen.
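One way to treat announced dates as hard deadlines is to keep them in code and surface a warning well before they arrive. The sketch below records only the October 16, 2026 date given above for 2.5 Flash. The function names and the 60-day warning window are hypothetical choices, and you would add further entries as Google announces them.

```python
from datetime import date

# Announced shutdown dates, keyed by model ID. Only the date stated in
# this article is recorded; extend the table as new dates are published.
SHUTDOWN_DATES = {
    "gemini-2.5-flash": date(2026, 10, 16),
}


def days_until_shutdown(model, today):
    """Days remaining before the announced shutdown, or None if unknown."""
    shutdown = SHUTDOWN_DATES.get(model)
    if shutdown is None:
        return None
    return (shutdown - today).days


def needs_migration_warning(model, today, warn_days=60):
    """True once the shutdown date is within warn_days (an arbitrary window)."""
    remaining = days_until_shutdown(model, today)
    return remaining is not None and remaining <= warn_days
```

Called with `date(2026, 6, 2)`, `days_until_shutdown("gemini-2.5-flash", ...)` returns 136, which is the "roughly 4.5 months" quoted above. Running a check like this in CI or at application startup turns a calendar reminder into a visible failure.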

What about the rest of the Gemini 2.0 model family?

The June 1 shutdown specifically targeted the Flash and Flash-Lite variants. Other Gemini 2.0 models are on different schedules: Gemini 2.0 Flash Thinking and the experimental variants had separate timelines. Outside the 2.0 family, Gemini 1.5 Pro remains available on the free tier at 2 requests per minute with a 32,000-token context. If your application relied on Gemini 2.0 Flash specifically for its 1 million tokens per minute of free throughput, the 1.5 Flash model (also still available on the free tier at 15 RPM, 1M TPM, and 1,500 RPD) may be a better interim solution while you evaluate 2.5 Flash's lower TPM ceiling against your actual usage.

The Firebase AI Logic team also flagged a separate deadline: all Imagen models on Firebase are scheduled to shut down on June 24, 2026. If you are using Firebase AI Logic for image generation, that migration is also pending. Check the Firebase AI Logic models page for the current status.

For context on how Google's Gemini free tier compares to the free tiers offered by OpenAI and Anthropic, see our free vs paid AI comparison. Spoiler: Google remains the most generous provider for developers who need a no-cost API key to prototype with, even with the TPM reduction in 2.5 Flash.

🔑 Key Takeaways

Google shut down four Gemini 2.0 Flash and Flash-Lite model IDs on June 1, 2026, so any app still calling them now receives API errors.
The correct migration targets are gemini-2.5-flash and gemini-2.5-flash-lite -- not gemini-3.5-flash, which some coverage incorrectly suggested.
Free-tier users face a significant rate limit change: TPM drops from 1 million (2.0 Flash) to 250,000 (2.5 Flash), though the daily request cap of 1,500 stays the same.
Gemini 2.5 Flash already has its own deprecation date of October 16, 2026, so the current migration is not a final one -- another cutover to gemini-3.5-flash is coming in roughly 4.5 months.
The best long-term defense against repeated migration pain is abstracting your model ID into a config or environment variable so future deprecations require a one-line update, not a codebase search.

Frequently Asked Questions

What happened to Gemini 2.0 Flash?

Google shut down four Gemini 2.0 Flash model IDs on June 1, 2026: gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, and gemini-2.0-flash-lite-001. Any app still hardcoded to these model strings will receive API errors. Google recommends migrating to gemini-2.5-flash or gemini-2.5-flash-lite immediately.

What should I migrate to from Gemini 2.0 Flash?

Google's official recommendation is to migrate gemini-2.0-flash and gemini-2.0-flash-001 to gemini-2.5-flash, and gemini-2.0-flash-lite and gemini-2.0-flash-lite-001 to gemini-2.5-flash-lite. Do not migrate directly to gemini-3.5-flash -- some coverage incorrectly recommended this, but 2.5 Flash is the supported migration target for the June 2026 cutover.

Does Gemini 2.5 Flash still have a free tier?

Yes. Google AI Studio continues to offer Gemini 2.5 Flash on the free tier with no credit card required. Free limits are 10 requests per minute, 250,000 tokens per minute, and 1,500 requests per day. This is a significant drop from Gemini 2.0 Flash's 1 million TPM, so high-volume free-tier apps should test their actual usage before assuming the migration is seamless.

When will Gemini 2.5 Flash be deprecated?

Gemini 2.5 Flash already has a scheduled shutdown date of October 16, 2026 -- just 4.5 months after the June 2026 migration. The planned replacement is gemini-3.5-flash. Developers should plan for a second mandatory migration before the end of 2026 and avoid hardcoding the 2.5 Flash model string.

How do I check if my app is using a deprecated Gemini model?

Search your codebase for any of these strings: gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, gemini-2.0-flash-lite-001. Check source files, environment variable configs, infrastructure definitions, and any CI/CD scripts. Replace each occurrence with gemini-2.5-flash or gemini-2.5-flash-lite depending on which variant you were using.
