Google Gemini 3 Flash Release: Ultra-Low-Cost AI Model, API Now Open for Access
【Bitu】Google’s official blog just announced its new-generation AI model, Gemini 3 Flash, with two main focuses: speed and affordability. This time it’s truly different: the model outperforms the previous 2.5 Pro across multiple dimensions.
Look at the performance numbers: a GPQA Diamond benchmark score of 90.4%, and 33.7% on Humanity’s Last Exam in no-tool mode. More importantly, it is much faster and cheaper, and even at its lowest thinking level it still beats the old version.
Pricing is very attractive: $0.50 per million input tokens and $3 per million output tokens (audio input is $1 per million). And that’s not all: Google also introduced context caching (which can cut costs by up to 90%) and a Batch API (50% cheaper, with higher throughput), and the two can be combined.
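To get a rough sense of scale, the quoted per-token rates can be turned into a per-request cost estimate. The sketch below is back-of-the-envelope arithmetic only; it assumes the 90% caching discount applies to the cached portion of the input tokens, which is a simplification of however Google actually bills cached context.

```python
# Back-of-the-envelope cost estimate for Gemini 3 Flash,
# using the rates quoted above: $0.50/M input, $3/M output tokens.
INPUT_RATE = 0.50 / 1_000_000   # dollars per input token
OUTPUT_RATE = 3.00 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> float:
    """Estimate one request's cost in dollars.

    Assumes cached input tokens get the article's quoted 90% discount,
    i.e. they are billed at 10% of the normal input rate.
    """
    fresh = input_tokens - cached_input_tokens
    cost = fresh * INPUT_RATE
    cost += cached_input_tokens * INPUT_RATE * 0.10  # 90% off cached part
    cost += output_tokens * OUTPUT_RATE
    return cost

# A 100k-token prompt with a 2k-token answer, no caching:
print(f"${request_cost(100_000, 2_000):.4f}")            # $0.0560
# The same request with 90k of the prompt served from cache:
print(f"${request_cost(100_000, 2_000, 90_000):.4f}")    # $0.0155
```

Under these assumptions, caching most of a long prompt cuts the bill by roughly 70% for this shape of request; the savings grow as the prompt-to-answer ratio grows.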
The feature set has been upgraded, too. Visual and spatial reasoning are stronger, and code execution is more stable, which helps with tasks like image scaling, counting, and image editing. Gemini 3 Flash has also been integrated into Google AI Studio, Antigravity, the Gemini CLI, Android Studio, and Vertex AI, so developers can access it right away.
API and Vertex AI are now open for access. If you want to try it, now’s the time.
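For developers who want to poke at the API directly, here is a minimal sketch of what a call looks like. The endpoint follows the standard Gemini `generateContent` REST pattern; note that the model identifier `gemini-3-flash` is an assumption based on the naming in this post and may differ from the actual model ID.

```python
# Minimal sketch of a Gemini generateContent call over REST.
# NOTE: the model id "gemini-3-flash" is assumed from this article's
# naming; check the official model list for the real identifier.
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder
MODEL = "gemini-3-flash"  # assumption, see note above
URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)

def build_payload(prompt: str) -> dict:
    # Standard Gemini REST request body: a list of contents,
    # each holding text parts.
    return {"contents": [{"parts": [{"text": prompt}]}]}

payload = build_payload("Summarize context caching in one sentence.")
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request; the response JSON
# carries the reply under candidates[0].content.parts[0].text.
```

The same request body works through Vertex AI as well, just with a different base URL and auth scheme.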
CryptoComedian
· 4h ago
It's really amazing how cheap it is; Google's pricing strategy is telling the other vendors "we're here to disrupt you."
Costs cut by 90%? I laughed, then cried: now the other models really have to drop prices or get rugged.
Tokens at fifty cents a million, I feel like my API quota suddenly came alive.
Then again, at prices this low, Google will surely find a way to make its money back somewhere else.
One word to describe it: cutthroat.
OnchainGossiper
· 10h ago
Finally, a low-priced AI. Google really nailed it this time.
I need to work out the cost: how much cheaper is it than before?
Caching saves 90%? That sounds exaggerated; I won't believe it until I test it myself.
Everyone is piling in; the price war among large models has only just begun.
BlockBargainHunter
· 12-18 09:39
Fast and cheap, that's all there is to it. Google really went all out this time: $0.50 for a million tokens, hilarious.
---
Caching saves 90%? We'd better start using this quickly, or we'll be leaving money on the table.
---
Surpassing the previous generation yet again; Google has really pushed price-performance through the floor. Nobody can keep up.
---
Wait, a Batch API? So you want us to run our data in batches? And the cost drops even further?
---
A 90.4% score looks good, but who knows how it performs in real use. Benchmark numbers are always the least trustworthy.
---
I just want to know when this will be available domestically. Please don't make it another VPN situation.
CryptoNomics
· 12-18 09:35
ngl the 90% cost reduction via context caching is statistically significant but everyone's sleeping on the tokenomics implications here... if we model this as a stochastic process of ai inference pricing, you're looking at a potential market inefficiency that could take months to correct
FreeRider
· 12-18 09:32
This price is really impressive, much cheaper than before.
Wait, caching can save 90%? Isn't that going to be revolutionary?
The name Flash fits: it really is fast.
I'm a bit worried performance will drop, but the numbers look pretty solid.
Google finally delivered this time.
EternalMiner
· 12-18 09:19
Damn, at this price 2.5 Pro can retire.
So cheap it's ridiculous; no wonder another round of competition is starting.
Wait, can context caching really save 90%? How cutthroat is this going to get?
Fast and cheap: who would still use anything else...
Now I have to rework my prompt engineering again.