Google is rolling out a new feature in its Gemini API that the company claims will make its latest AI models cheaper for third-party developers. Google calls the feature “implicit caching” and says it can deliver 75% savings on “repetitive context” passed to models via the Gemini API. It supports Google’s Gemini 2.5 Pro and 2.5 Flash models.
Caching, one of the most common practices in the AI sector, reduces computing demands and costs by reusing frequently accessed or pre-processed model data. For example, a cache can store responses to commonly asked questions, sparing the model from generating the same answers repeatedly.
Previously, Google offered only explicit prompt caching, which required developers to manually define their highest-frequency prompts. While it promised cost savings, it also demanded significant manual effort.
Some developers were unhappy with how explicit caching worked for Gemini 2.5 Pro, saying it led to unexpectedly high API bills. The criticism intensified over the past week, prompting the Gemini team to apologize and commit to making improvements.
Unlike explicit caching, implicit caching happens automatically. It is enabled by default for Gemini 2.5 models and passes on cost savings when a Gemini API request to a model hits a cache.
“[W]hen you send a request to one of the Gemini 2.5 models, if the request shares a common prefix as one of the previous requests, then it’s eligible for a cache hit,” wrote Google in a post. “We will dynamically pass cost savings back to you.”
It is also worth noting that, per Google’s developer documentation, the minimum prompt token count for implicit caching is 1,024 for 2.5 Flash and 2,048 for 2.5 Pro. That is not an enormous amount, meaning it should not take much to trigger these automatic savings. Tokens are the raw bits of data models work with; a thousand tokens is equivalent to roughly 750 words.
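To illustrate how a developer might take advantage of this, here is a minimal sketch using the `google-genai` Python SDK. Since implicit caching keys on a shared request prefix, the idea is to put large, repetitive context at the start of every prompt and keep the varying part at the end. The file name, system context, and questions below are illustrative, and the actual cache-hit decision is made server-side by Google.

```python
from google import genai

# Hypothetical API key; replace with your own.
client = genai.Client(api_key="YOUR_API_KEY")

# Large, repetitive context goes at the START of the request. Implicit
# caching is keyed on a shared prefix, so requests that begin with the
# same tokens are eligible for a cache hit. (Illustrative file.)
SHARED_CONTEXT = open("product_manual.txt").read()

def ask(question: str) -> str:
    # The shared prefix comes first; only the question varies at the end.
    # Per Google's documentation, prompts need at least 1,024 tokens on
    # 2.5 Flash (2,048 on 2.5 Pro) to be eligible for implicit caching.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=SHARED_CONTEXT + "\n\nQuestion: " + question,
    )
    return response.text

# Repeated calls share the same long prefix, so later ones may hit the
# cache and have the discounted rate passed back automatically.
print(ask("How do I reset the device?"))
print(ask("What does error code 42 mean?"))
```

Keeping the variable portion of each prompt at the end maximizes the length of the shared prefix, which is what makes subsequent requests eligible for a hit.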