The Hidden Environmental Cost of Using AI Tools and What to Do About It

The energy cost of AI inference is rarely discussed in business AI adoption conversations, but it is material and growing. Training large AI models makes headlines for its energy consumption, but inference (running the model to generate each response, billions of times per day across all users) is an ongoing cost and, in aggregate, likely the larger and more persistent environmental impact. For businesses making AI adoption decisions, understanding this cost does not mean avoiding AI. It means making choices that acknowledge the environmental trade-offs rather than pretending they do not exist.

Why AI Uses So Much Energy

AI language models perform billions of mathematical operations, mainly matrix multiplications across billions of parameters, to generate each token of output. A single response from a large model such as GPT-4o can therefore involve trillions of floating-point operations, run on specialised accelerators (GPUs and TPUs) that are among the most power-hungry computing components manufactured. Data centres running AI models at scale draw significant electricity, and many rely on water-based cooling, which consumes water as well.

Estimates of the energy cost per AI query vary widely and are hard to pin down because providers do not disclose this data. One widely cited estimate puts a simple ChatGPT query at approximately ten times the energy of a Google search. At billions of daily queries across the industry, the cumulative impact is significant.

AI vs Traditional Computing: Approximate Energy Comparison

Action	Approx. Energy Use
Sending an email	~0.00004 kWh
Google search	~0.0003 kWh
Simple ChatGPT query	~0.001–0.01 kWh (est.)
AI image generation	~0.01–0.05 kWh (est.)

Note: Estimates vary significantly; exact figures are not publicly disclosed by providers.

What Businesses Can Do

Right-size model selection. Using a smaller, more efficient model for tasks where it performs adequately is both a cost and an environmental optimisation. A task handled by GPT-4o mini instead of GPT-4o uses significantly less compute and energy, and where the smaller model meets your quality bar, nothing is lost. The same discipline that reduces API costs also reduces environmental impact.

Eliminate wasteful queries. Debugging workflows that generate multiple AI calls where one would suffice, removing redundant AI processing steps, and reducing unnecessary regeneration of outputs that could be cached — all of these reduce compute consumption and environmental impact alongside API costs.

Choose providers with renewable energy commitments. Major AI providers differ in their renewable energy commitments for their data centres. Microsoft, Google, and Anthropic have all made public sustainability commitments; for businesses with sustainability goals, weighing providers partly on those commitments is a reasonable criterion.

Keeping It in Perspective

The environmental impact of business AI use is real but should be kept in proportion. For most small businesses, the energy footprint of AI tool use is far smaller than that of business travel, office heating and cooling, or commuting. The goal is not to avoid AI because of environmental concerns, since the productivity benefits are real and significant, but to make efficient choices that capture those benefits without unnecessary waste.

Measuring Your Business’s AI Carbon Footprint

Measuring AI’s contribution to your business’s carbon footprint is genuinely difficult because providers do not publish granular energy consumption data per API call. What is available: rough estimates from academic research, provider sustainability reports that cover their data centres at an aggregate level, and third-party calculators that apply published estimates to your API usage data. These tools produce approximate numbers rather than precise measurements, but approximate numbers are sufficient to identify your highest-impact AI uses and make proportionate optimisation decisions.
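The estimation approach described above can be sketched in a few lines of Python. This is a minimal illustration, not a calculator anyone publishes: the `Workflow` class, the workflow names, and every per-call figure are hypothetical placeholders to be replaced with your own usage data and a published estimate you trust. It carries a low and a high bound through the arithmetic so the output stays an honest range.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    monthly_calls: int
    low_wh_per_call: float   # lower bound of a published estimate (placeholder)
    high_wh_per_call: float  # upper bound of a published estimate (placeholder)

    def monthly_kwh_range(self) -> tuple[float, float]:
        """Return the (low, high) monthly energy estimate in kWh."""
        return (
            self.monthly_calls * self.low_wh_per_call / 1000,
            self.monthly_calls * self.high_wh_per_call / 1000,
        )


def rank_by_upper_bound(workflows: list[Workflow]) -> list[Workflow]:
    """Order workflows from highest to lowest upper-bound energy estimate."""
    return sorted(workflows, key=lambda w: w.monthly_kwh_range()[1], reverse=True)


# Hypothetical workflows with placeholder figures, not measurements.
workflows = [
    Workflow("support-chat", 300_000, 0.5, 5.0),
    Workflow("document-summaries", 20_000, 2.0, 20.0),
    Workflow("ticket-classification", 900_000, 0.1, 1.0),
]

for w in rank_by_upper_bound(workflows):
    low, high = w.monthly_kwh_range()
    print(f"{w.name}: {low:.0f} to {high:.0f} kWh per month")
```

Ranking by the upper bound is a deliberately cautious choice: it points optimisation effort at the workflow that could be costing the most, even when the true figure is uncertain.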

For businesses with sustainability reporting requirements — particularly those subject to Scope 3 emissions tracking — document your approach and the limitations of available data. Regulators and auditors generally accept good-faith estimation with documented methodology when precise measurement is not technically feasible, which is currently the case for AI API emissions. As providers improve their sustainability disclosure (a trend that is accelerating under regulatory pressure), more precise measurement will become possible.

Provider-Level Sustainability Decisions

AI providers vary in their renewable energy commitments and their transparency about sustainability. Microsoft has committed to being carbon negative by 2030 and powers its Azure data centres with a growing proportion of renewable energy. Google and Anthropic have made similarly ambitious commitments. When choosing between providers with comparable capability and pricing, sustainability credentials are a legitimate secondary criterion for businesses with environmental commitments. Request sustainability information directly from providers if their published disclosures are insufficient for your reporting requirements; many will provide additional detail to enterprise customers on request.

On-premise and edge AI deployment using small language models has a different emissions profile from cloud AI, because the energy cost falls on your own infrastructure rather than on shared cloud data centres. For businesses whose facilities run on renewable energy, local model deployment can have a lower carbon footprint than cloud API calls, though confirming this requires careful analysis of your specific energy mix and of the model's energy consumption relative to the cloud alternative.
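The comparison described above comes down to energy used multiplied by the carbon intensity of the electricity behind it. The sketch below shows that arithmetic under stated assumptions: the function names are illustrative, and the energy and grid-intensity figures are invented to show how a cleaner energy mix can outweigh less efficient local hardware. They are not measurements of any real deployment.

```python
def emissions_kg(energy_kwh: float, grid_g_co2e_per_kwh: float) -> float:
    """Convert energy (kWh) and grid carbon intensity (gCO2e/kWh) to kgCO2e."""
    return energy_kwh * grid_g_co2e_per_kwh / 1000


def compare_deployments(
    local_kwh: float,
    local_intensity: float,
    cloud_kwh: float,
    cloud_intensity: float,
) -> dict:
    """Estimate emissions for the same workload run locally and in the cloud."""
    local = emissions_kg(local_kwh, local_intensity)
    cloud = emissions_kg(cloud_kwh, cloud_intensity)
    if local < cloud:
        lower = "local"
    elif cloud < local:
        lower = "cloud"
    else:
        lower = "equal"
    return {"local_kg": local, "cloud_kg": cloud, "lower": lower}


# Assumed figures: local hardware uses more energy for the same monthly
# workload, but the site runs on a mostly renewable supply.
print(compare_deployments(local_kwh=120, local_intensity=50,
                          cloud_kwh=80, cloud_intensity=300))

# Same workload with the local site on a carbon-heavy grid: the result flips.
print(compare_deployments(local_kwh=120, local_intensity=400,
                          cloud_kwh=80, cloud_intensity=300))
```

The second call is the reason the analysis has to be specific to your energy mix: the same hardware and workload can land on either side of the comparison.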

Right-Sizing as an Environmental Practice

Model right-sizing, meaning using the smallest model that adequately performs a task, is simultaneously a cost optimisation and an environmental one. A classification task that runs on GPT-4o mini rather than GPT-4o uses a fraction of the compute and energy (roughly one-tenth is a reasonable working estimate, though OpenAI does not publish per-request figures), and for many such tasks the output quality is equivalent. Across a high-volume classification workflow running tens of thousands of requests per day, the environmental difference is material. Building right-sizing into your model selection process, by starting with small models and escalating to larger ones only when quality testing demonstrates the need, is both economically and environmentally sound practice.

Audit your three highest-volume AI workflows this quarter against the right-sizing principle. For each, test whether a smaller model meets your quality threshold. The cost saving and the environmental saving are both meaningful at production volume.
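One way to run such an audit is to score each candidate model on a labelled sample and keep the smallest one that clears your threshold. The sketch below assumes a classification task and uses stub functions in place of real model calls. The names `tiny_stub`, `small_stub`, and `large_stub`, the sample data, and the 0.9 threshold are all hypothetical; in practice each stub would wrap a call to your provider's API.

```python
from typing import Callable, Optional, Sequence

Example = tuple[str, str]  # (input text, expected label)


def accuracy(predict: Callable[[str], str], examples: Sequence[Example]) -> float:
    """Fraction of examples where the predicted label matches the expected one."""
    correct = sum(1 for text, expected in examples if predict(text) == expected)
    return correct / len(examples)


def smallest_adequate_model(
    candidates: Sequence[tuple[str, Callable[[str], str]]],
    examples: Sequence[Example],
    threshold: float,
) -> Optional[str]:
    """Candidates are ordered smallest to largest; return the first that passes."""
    for name, predict in candidates:
        if accuracy(predict, examples) >= threshold:
            return name
    return None  # no candidate met the quality threshold


# Stub "models" standing in for real API calls (hypothetical behaviour).
def tiny_stub(text: str) -> str:
    return "billing"


def small_stub(text: str) -> str:
    return "billing" if ("refund" in text or "invoice" in text) else "technical"


large_stub = small_stub

examples = [
    ("refund please", "billing"),
    ("app crashes on login", "technical"),
    ("invoice is wrong", "billing"),
    ("cannot reset password", "technical"),
]
candidates = [("tiny", tiny_stub), ("small", small_stub), ("large", large_stub)]

print(smallest_adequate_model(candidates, examples, threshold=0.9))
```

A real audit would use a much larger sample drawn from production traffic, and a quality measure suited to the task; accuracy on four examples is only there to keep the sketch short.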

AI and Your Organisation’s Sustainability Commitments

For organisations with formal sustainability commitments (net zero targets, science-based targets, Scope 3 emissions reporting), AI infrastructure is increasingly a relevant consideration. The energy consumption of AI model inference contributes to Scope 3 emissions through cloud services, and as AI usage scales, this contribution grows with it. Build AI environmental considerations into your sustainability reporting framework: document your AI usage, apply published emissions estimation methodologies, and include AI infrastructure in your Scope 3 analysis. Most reporting frameworks do not currently require this, but it is consistent with the direction of travel as AI infrastructure disclosure requirements develop.

Efficiency as Environmental Practice

The most impactful environmental action available for AI usage is the same as the most impactful cost action: efficiency. Right-sizing models, compressing prompts, implementing caching, and using batch processing each reduce the compute, and therefore the energy, consumed per unit of AI work, broadly in step with the reduction in cost. A team that practises AI FinOps discipline is also improving environmental efficiency, because the two are closely aligned. There is little tension between operating AI efficiently from a cost perspective and operating it responsibly from an environmental perspective: in practice they call for the same measures.
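Of these practices, caching is the simplest to illustrate. The sketch below is a minimal in-memory cache, assuming a `call_model(model, prompt)` function that you supply; that function, the `ResponseCache` class, and the stub used in the demonstration are hypothetical and do not correspond to any provider's SDK. Repeated prompts are answered from memory, so the model runs only once per distinct request.

```python
import hashlib
from typing import Callable


class ResponseCache:
    """In-memory cache keyed by a hash of the model name and the prompt."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Collapse runs of whitespace so trivially different prompts share a key.
        normalised = " ".join(prompt.split())
        return hashlib.sha256(f"{model}\n{normalised}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._call_model(model, prompt)  # the only place compute is spent
        self._store[key] = response
        return response


# Stub standing in for a real API call (hypothetical).
def fake_model(model: str, prompt: str) -> str:
    return f"[{model}] answer to: {prompt}"


cache = ResponseCache(fake_model)
cache.complete("small-model", "Summarise our refund policy.")
cache.complete("small-model", "Summarise   our refund policy.")  # served from cache
print(cache.hits, cache.misses)
```

Caching like this suits requests whose answer should not change between calls, such as classification or fixed summaries. It is a poor fit where varied output is the point, and a production version would also need expiry and a size limit.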

The environmental cost of AI usage is real but manageable through the same efficiency practices that make AI economically sustainable. Right-size your models, compress your prompts, implement caching, and choose providers with renewable energy commitments — and your AI practice will be both cost-efficient and environmentally responsible.

AI and Scope 3 Emissions Reporting

Scope 3 emissions reporting covers indirect emissions from an organisation’s value chain, including the use of purchased services. For organisations subject to mandatory Scope 3 reporting (large corporations in the EU under the Corporate Sustainability Reporting Directive, CSRD, and increasingly others), AI API usage may need to be included in purchased services emissions calculations. The data required: monthly API call volumes by provider, per-call energy estimates (available from some providers on request), and the grid carbon intensity of the relevant data centre regions. Most AI providers do not yet publish per-API-call emissions data proactively, but several provide carbon reporting tools or respond to formal emissions data requests. For organisations building Scope 3 AI emissions into their reporting, establishing data collection processes now, before reporting requirements mature, avoids the scramble of retroactive data gathering.
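The three inputs listed above combine into a straightforward estimate: calls multiplied by energy per call gives energy, and energy multiplied by grid carbon intensity gives emissions. The sketch below shows one way to record and total that per provider. The `ProviderUsage` structure, the provider names, and all figures are hypothetical placeholders; real values would come from your billing exports, provider disclosures, and published grid-intensity data.

```python
from dataclasses import dataclass


@dataclass
class ProviderUsage:
    provider: str
    monthly_calls: int
    wh_per_call: float            # per-call energy estimate (assumed)
    grid_g_co2e_per_kwh: float    # carbon intensity of the data centre region (assumed)

    def energy_kwh(self) -> float:
        return self.monthly_calls * self.wh_per_call / 1000

    def emissions_kg(self) -> float:
        return self.energy_kwh() * self.grid_g_co2e_per_kwh / 1000


def total_emissions_kg(usage: list[ProviderUsage]) -> float:
    """Sum estimated monthly emissions across all providers, in kgCO2e."""
    return sum(u.emissions_kg() for u in usage)


# Placeholder inputs for illustration only.
usage = [
    ProviderUsage("provider-a", 500_000, 2.0, 400.0),
    ProviderUsage("provider-b", 100_000, 3.0, 100.0),
]

for u in usage:
    print(f"{u.provider}: {u.energy_kwh():.0f} kWh, {u.emissions_kg():.0f} kgCO2e")
print(f"total: {total_emissions_kg(usage):.0f} kgCO2e")
```

Keeping the inputs alongside the result, as the dataclass does here, also gives you the documented methodology that an auditor is likely to ask for.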

Energy Efficiency Improvements in AI Infrastructure

The AI industry has made significant energy efficiency progress through architectural improvements, hardware optimisation, and inference efficiency research. Nvidia’s H100 and H200 GPUs deliver substantially more AI compute per watt than their predecessors. Quantisation techniques that reduce model precision from 16-bit to 8-bit or 4-bit can cut energy consumption roughly in half, with modest quality trade-offs on many tasks. Speculative decoding and other inference optimisations improve hardware utilisation and can lower the energy required per token. These improvements mean that the energy cost per unit of AI capability is declining even as total AI energy consumption grows. For organisations tracking AI energy efficiency over time, measuring useful output per unit of energy (tokens per kilowatt-hour) rather than total energy consumption shows the efficiency trend more accurately than absolute consumption figures.
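The difference between the two measures is easy to see with a small calculation. In the sketch below, the period labels and the token and energy figures are invented for illustration; the point is only that total energy and tokens per kilowatt-hour can move in opposite directions.

```python
def tokens_per_kwh(tokens: int, energy_kwh: float) -> float:
    """Useful output per unit of energy; higher means more efficient."""
    if energy_kwh <= 0:
        raise ValueError("energy_kwh must be positive")
    return tokens / energy_kwh


# (label, tokens generated, estimated energy in kWh): placeholder figures.
periods = [
    ("period-1", 40_000_000, 100.0),
    ("period-2", 90_000_000, 150.0),
]

for label, tokens, energy in periods:
    print(f"{label}: {energy:.0f} kWh total, "
          f"{tokens_per_kwh(tokens, energy):,.0f} tokens per kWh")
```

With these assumed figures, total energy rises by half between the two periods while tokens per kilowatt-hour also rises by half, so a report that shows only absolute consumption would miss the efficiency gain.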