
Google’s Android coding tests reveal an unexpected Gemini 3.5 Flash weakness


Android Authority

15 Jun 2026


Affiliate links on Android Authority may earn us a commission. Learn more.

Google has just refreshed its Android Bench rankings, and the results present developers with a puzzling picture. Google’s new Gemini 3.5 Flash is actively falling behind its predecessor while charging you three times the price to use it.

The latest Android coding leaderboard, a benchmark that evaluates how well different AI models can perform Android development tasks, introduced Gemini 3.5 Flash for the first time, but the newcomer didn’t make it into the top five. Topping the list was OpenAI’s GPT 5.5, which scored 74, followed by GPT 5.4 and an older Google model, Gemini 3.1 Pro Preview, both with 72.4. The new Claude Opus models also outperformed the Flash variant.

Gemini 3.5 Flash scored 63.7, placing sixth overall. More surprising, though, was its inefficiency. The model averaged 355.9 total tokens, a big jump compared to other systems, according to Google’s benchmark data. That came to an average cost of $147.1, making it the most expensive model on the entire list despite scoring lower than a number of rivals.

For context, Google’s Flash branding has always been about speed and cheaper prices. At Google I/O 2026, the company announced the most powerful Flash model it had ever built, Gemini 3.5 Flash, which it claimed had more robust coding capabilities and better support for AI agents and complex workflows. Google also said the model outperformed Gemini 3.1 Pro in a number of internal benchmarks and produced output up to four times faster than competing frontier models.

However, the Android benchmark tells a different story. Gemini 3.5 Flash might shine in Google’s broader evaluations of agentic and coding tasks, but its performance on actual Android development tasks is less than stellar. For example, Gemini 3.1 Pro Preview delivered a significantly better score (72.4 versus 63.7) while costing about one-third as much, as noted by 9to5Google.
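To put the gap in perspective, here is a rough back-of-the-envelope sketch of cost per benchmark point, using the figures quoted above. The Gemini 3.1 Pro Preview cost is not a published number here; it is an assumption derived from the "about one-third as much" comparison, so treat the result as an estimate only.

```python
# Figures quoted in the article for Gemini 3.5 Flash and Gemini 3.1 Pro Preview.
FLASH_SCORE = 63.7
FLASH_COST_USD = 147.1
PRO_SCORE = 72.4
# Assumption: "about one-third as much" is taken as exactly one third.
# This is an illustrative estimate, not a figure from the benchmark.
PRO_COST_USD = FLASH_COST_USD / 3


def cost_per_point(cost_usd: float, score: float) -> float:
    """Return the average dollars spent per benchmark point."""
    return cost_usd / score


flash_ratio = cost_per_point(FLASH_COST_USD, FLASH_SCORE)
pro_ratio = cost_per_point(PRO_COST_USD, PRO_SCORE)

print(f"Gemini 3.5 Flash: ${flash_ratio:.2f} per point")
print(f"Gemini 3.1 Pro Preview: ${pro_ratio:.2f} per point (estimated)")
print(f"Flash costs {flash_ratio / pro_ratio:.1f}x more per point (estimated)")
```

Under that assumption, the per-point gap comes out somewhat wider than the headline threefold price difference, because Flash also scores lower.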

The bigger question now is whether Google can improve Gemini 3.5 Flash with updates or whether the upcoming Gemini 3.5 Pro will better deliver on the company’s performance promises. For now, Google’s own numbers suggest that newer isn’t always better.
