A single line item labelled “AI subscriptions” in your P&L tells you almost nothing useful. It does not tell you which team is driving the spend, which workflows cost the most relative to the value they deliver, which use cases have the best ROI, or where optimisation effort would have the greatest impact. Tracking AI spend at the team, project, and use case level turns AI costs from a black box into a cost you can manage and optimise. Here is how to implement it.
Why Granular Tracking Matters
Aggregate AI spend obscures the information needed to manage it effectively. When a monthly AI API bill grows by 40%, aggregate tracking tells you that it grew — but not whether it was driven by one team’s new workflow, a volume spike on a specific use case, or gradual inflation across many small workflows. Granular tracking immediately identifies the driver and enables a targeted response rather than a broad cost-cutting exercise that may cut valuable workflows alongside wasteful ones.
Granular tracking also creates accountability. Teams that know their AI spend is visible and attributed to their budget are more deliberate about the workflows they build and the models they use. Invisible costs generate invisible inefficiency.
Tagging API Calls for Attribution
The foundation of granular AI spend tracking is tagging: annotating every API call with metadata that identifies its origin. Most AI gateway and observability tools (Helicone, Portkey, LangSmith) allow custom properties or metadata to be attached to each request. A standard set of tags covers team or department name, project or workflow name, environment (production vs development), and user identifier. Adding these tags requires a small code change, typically passing an additional header or metadata field, but it unlocks the aggregation and filtering capabilities of your observability platform.
Tagging Taxonomy for AI Spend Attribution
| Tag | Example Values | What It Enables |
|---|---|---|
| team | sales, marketing, ops, support | Per-team budget tracking |
| workflow | lead-enrichment, email-drafting | Workflow cost comparison |
| env | production, development, test | Separate dev from prod spend |
| user_id | user identifiers | Per-user usage patterns |
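One way to keep the taxonomy consistent is to validate tags in code before they are attached to a request. The following is a minimal sketch, not a prescribed implementation: the `SpendTags` class and the allowed-value sets are hypothetical names introduced here, with values taken from the table above.

```python
from dataclasses import dataclass

# Allowed values mirror the taxonomy table; extend these sets for your own organisation.
ALLOWED_TEAMS = {"sales", "marketing", "ops", "support"}
ALLOWED_ENVS = {"production", "development", "test"}


@dataclass(frozen=True)
class SpendTags:
    """Hypothetical container for the four attribution tags."""
    team: str
    workflow: str
    env: str
    user_id: str

    def __post_init__(self) -> None:
        # Reject values outside the agreed taxonomy so reports stay coherent.
        if self.team not in ALLOWED_TEAMS:
            raise ValueError(f"unknown team: {self.team!r}")
        if self.env not in ALLOWED_ENVS:
            raise ValueError(f"unknown env: {self.env!r}")
        # Assumed convention: lowercase, hyphen-separated workflow names.
        if not self.workflow or self.workflow != self.workflow.lower() or " " in self.workflow:
            raise ValueError(f"workflow must be lowercase with no spaces: {self.workflow!r}")

    def as_dict(self) -> dict[str, str]:
        return {
            "team": self.team,
            "workflow": self.workflow,
            "env": self.env,
            "user_id": self.user_id,
        }
```

Rejecting a value such as `Sales` at the call site is cheaper than merging `Sales` and `sales` rows in a report later.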
Using Helicone for Team-Level Tracking
Helicone’s custom properties feature is designed precisely for this use case. Pass a Helicone-Property-Team header with each request, and the Helicone dashboard immediately lets you filter and aggregate spend, token usage, and latency by team. Add a Helicone-Property-Workflow header and you can compare cost per call and cost per output across all your AI workflows. The implementation is a header addition — a few lines of code per integration point.
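As a rough sketch, the headers can be built in one helper so every integration point sends the same property names. The `helicone_headers` function is a hypothetical helper; the `Helicone-Auth` header and the `Helicone-Property-Env` property name are assumptions here and should be checked against Helicone's current documentation.

```python
def helicone_headers(api_key: str, team: str, workflow: str, env: str) -> dict[str, str]:
    """Build the per-request headers for Helicone custom properties.

    Assumption: Helicone authenticates proxy requests via a 'Helicone-Auth'
    bearer header; confirm against the current Helicone docs.
    """
    return {
        "Helicone-Auth": f"Bearer {api_key}",
        "Helicone-Property-Team": team,
        "Helicone-Property-Workflow": workflow,
        "Helicone-Property-Env": env,  # property name chosen to match the taxonomy
    }


# Placeholder key for illustration only; load the real key from your secrets store.
headers = helicone_headers("YOUR_HELICONE_API_KEY", "marketing", "email-drafting", "production")
```

These headers would then be passed with each request, for example as default headers on whichever HTTP or SDK client sends your calls through the Helicone proxy.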
Creating a Weekly AI Spend Report
Once tagging is in place, a weekly AI spend report is straightforward to generate. Most observability platforms export data via API; a simple script or a Zapier automation can pull the week’s tagged spend data and format it as a report for your management team. The report should show: total spend by team vs prior week, top five workflows by cost, any workflows where cost-per-call has increased significantly, and any development or test traffic appearing in production cost figures. This fifteen-minute weekly report surfaces the main cost signals without requiring anyone to dig through dashboards manually.
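The four report sections can be sketched as a small aggregation over exported records. This is an illustration under assumptions: the record shape (`team`, `workflow`, `env`, `cost_usd`, `calls`) is hypothetical and will differ by platform, and the 25% threshold for a "significant" cost-per-call increase is an arbitrary placeholder.

```python
from collections import defaultdict


def spend_by(records, key):
    """Sum cost_usd grouped by the given tag (e.g. 'team' or 'workflow')."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r["cost_usd"]
    return dict(totals)


def cost_per_call(records):
    """Average cost per call for each workflow."""
    cost, calls = defaultdict(float), defaultdict(int)
    for r in records:
        cost[r["workflow"]] += r["cost_usd"]
        calls[r["workflow"]] += r["calls"]
    return {w: cost[w] / calls[w] for w in cost if calls[w] > 0}


def weekly_report(this_week, last_week, increase_threshold=0.25):
    team_now, team_prev = spend_by(this_week, "team"), spend_by(last_week, "team")
    by_workflow = spend_by(this_week, "workflow")
    cpc_now, cpc_prev = cost_per_call(this_week), cost_per_call(last_week)
    return {
        # (this week, prior week) per team
        "team_spend": {t: (c, team_prev.get(t, 0.0)) for t, c in team_now.items()},
        "top_workflows": sorted(by_workflow.items(), key=lambda kv: kv[1], reverse=True)[:5],
        # workflows whose cost per call rose by more than the threshold
        "rising_cost_per_call": [
            w for w, c in cpc_now.items()
            if cpc_prev.get(w, 0) > 0 and (c - cpc_prev[w]) / cpc_prev[w] > increase_threshold
        ],
        # spend tagged as anything other than production
        "non_production_spend": sum(r["cost_usd"] for r in this_week if r["env"] != "production"),
    }
```

The returned dictionary can be rendered into an email or chat message by whatever automation delivers the report.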
Setting Up Per-Team Tagging
Adding tags to API calls requires one code change per integration point: pass an additional header or metadata field containing the team name, workflow name, and environment. In Helicone, this is done via HTTP headers (Helicone-Property-Team: marketing). In Portkey, it is a metadata field in the request config. In a direct API integration, you include it as a custom field in the request metadata that your logging layer captures. The convention matters less than the consistency — establish your taxonomy (team names, workflow names, environment values) before tagging, so reporting is coherent from day one rather than requiring cleanup later.
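For a direct API integration with no gateway, the logging layer can be as simple as one JSON line per call. The sketch below assumes you already have token counts from the API response; the function names, the record fields, and the per-1,000-token prices passed in are all hypothetical placeholders.

```python
import json
import time


def build_log_record(tags, model, input_tokens, output_tokens,
                     price_per_1k_input, price_per_1k_output):
    """Combine attribution tags with usage and an estimated cost.

    Prices are supplied by the caller; look them up from your provider's
    current price list rather than hard-coding them.
    """
    cost = (input_tokens / 1000) * price_per_1k_input \
         + (output_tokens / 1000) * price_per_1k_output
    return {
        "timestamp": time.time(),
        **tags,  # team, workflow, env, user_id
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": round(cost, 6),
    }


def append_jsonl(path, record):
    """Append one record per line so the file is easy to aggregate later."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

A JSON Lines file is only one option; the same record could go to a database table or your existing log pipeline.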
For teams using multiple AI tools beyond the primary API — ChatGPT subscriptions, Claude Pro seats, Midjourney, Perplexity Pro — tagging API calls captures the programmatic usage but not the subscription costs. Maintain a separate subscription inventory that records each tool, its cost, and which team or function it serves. The complete AI cost picture combines tagged API spend with the subscription inventory. Neither alone is sufficient for accurate per-team cost allocation.
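Combining the two sources might look like the following sketch. The data shapes are assumptions: `api_spend` is a team-to-cost mapping taken from your tagged data, and each subscription entry carries hypothetical `team`, `monthly_cost_usd`, and optional `seats` fields.

```python
from collections import defaultdict


def total_cost_by_team(api_spend, subscriptions):
    """Merge tagged API spend with the subscription inventory, per team."""
    totals = defaultdict(lambda: {"api": 0.0, "subscriptions": 0.0})
    for team, cost in api_spend.items():
        totals[team]["api"] += cost
    for sub in subscriptions:
        # Seat-based tools are multiplied out; flat-rate tools default to one seat.
        totals[sub["team"]]["subscriptions"] += sub["monthly_cost_usd"] * sub.get("seats", 1)
    return {
        team: {**parts, "total": parts["api"] + parts["subscriptions"]}
        for team, parts in totals.items()
    }
```

Teams that appear only in the subscription inventory still show up in the result, which is often where untracked spend hides.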
Turning Visibility Into Action
Spend visibility is only valuable if it drives decisions. The most productive use of per-team AI spend data is in monthly operations reviews: five minutes reviewing which teams are spending what, which workflows drove the most cost, and whether any anomalies need investigation. This review does not need to be adversarial; the goal is understanding, not policing. When a team’s spend spikes, the first question is whether it reflects valuable new usage or an inefficiency to address. Many spikes turn out to reflect valuable usage, and visibility confirms that the investment is worthwhile. The occasional spike that reflects a misconfigured workflow or an unnecessarily expensive model choice is worth the five minutes it takes to identify and fix.
Over time, per-team spend data enables more accurate AI cost budgeting. Teams that see their historical spend can forecast their needs for the next quarter with reasonable confidence, rather than the entire organisation operating on a single undifferentiated AI budget that bears no relationship to actual usage patterns.
Set up per-team tagging on your highest-volume API integration this week. The first week of data will reveal cost patterns you were not previously aware of.
From Cost Tracking to Cost Culture
Spending visibility alone does not change behaviour — it needs to be embedded in a culture where teams understand that AI costs are their responsibility rather than a centralised overhead. Building this culture does not require mandates or policing; it requires communication and context. When you share the monthly AI cost report with team leads, include benchmarks — what a “reasonable” cost per workflow run looks like, what good cost efficiency looks like for their workflow type. Give teams the context to evaluate their own numbers rather than just seeing raw figures. Teams that understand their AI costs in context of what is expected make better decisions than those that see numbers without reference points.
Integrating AI Costs Into Product Economics
For teams building AI-powered products or features, attributing AI costs to specific product features rather than to engineering teams enables product-level economics tracking. The cost of running the AI that powers a specific feature — tagged in your observability platform with the feature name — can be tracked alongside that feature’s revenue contribution, support cost, and development cost as part of a complete unit economics view. This product-level cost attribution is what makes AI investment decisions genuinely economic rather than technological: you are evaluating whether a specific AI capability is worth its marginal cost, not just whether AI in general is worth investing in.
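A feature-level view can be sketched as a small record that holds the four figures side by side. The `FeatureEconomics` class and its field names are hypothetical, and the contribution formula (revenue minus AI, support, and development cost) is one simple assumption about how to combine them.

```python
from dataclasses import dataclass


@dataclass
class FeatureEconomics:
    """Hypothetical monthly unit-economics record for one AI-powered feature."""
    feature: str
    monthly_revenue_usd: float
    monthly_ai_cost_usd: float      # from spend tagged with the feature name
    monthly_support_cost_usd: float
    monthly_dev_cost_usd: float

    @property
    def total_cost_usd(self) -> float:
        return (self.monthly_ai_cost_usd
                + self.monthly_support_cost_usd
                + self.monthly_dev_cost_usd)

    @property
    def contribution_usd(self) -> float:
        # Revenue left after all three cost categories.
        return self.monthly_revenue_usd - self.total_cost_usd

    @property
    def ai_cost_share(self) -> float:
        # Fraction of the feature's total cost that is AI spend.
        return self.monthly_ai_cost_usd / self.total_cost_usd if self.total_cost_usd else 0.0
```

Tracking the AI share of cost over time shows whether a feature's economics are being driven by model spend or by everything around it.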
Set up team-level tagging on your most active API workflows this week. The first week of tagged data will immediately show you where your AI spend is concentrated and which workflows have the highest cost relative to their business impact — the starting point for every subsequent optimisation decision.
Benchmarking AI Spend Against Business Value
The most useful AI cost metric is not spend per team or spend per workflow in isolation — it is spend relative to the business value the AI capability generates. An AI workflow that costs $500/month and saves a full-time employee’s worth of work is exceptional value. One that costs $50/month and saves thirty minutes per week has a reasonable but not exceptional return. One that costs $200/month and is difficult to connect to any measurable business outcome warrants close scrutiny. Build a simple value ledger alongside your cost tracking: for each significant AI workflow, document the estimated business value generated (time saved, quality improved, revenue enabled) and compare it to the running cost. That value ledger is the foundation of mature AI investment management.
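A value ledger can start as a value-to-cost ratio per workflow. The sketch below converts time saved into a monthly dollar figure; the function names are hypothetical, and the hourly rate is an input you must supply from your own staffing costs rather than a benchmark.

```python
WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month


def monthly_value_from_time(hours_saved_per_week: float, hourly_rate_usd: float) -> float:
    """Estimate the monthly value of time saved at a given loaded hourly rate."""
    return hours_saved_per_week * WEEKS_PER_MONTH * hourly_rate_usd


def value_to_cost_ratio(monthly_value_usd: float, monthly_cost_usd: float) -> float:
    """Estimated value generated per dollar of AI spend."""
    if monthly_cost_usd <= 0:
        raise ValueError("monthly_cost_usd must be positive")
    return monthly_value_usd / monthly_cost_usd


def ledger_ratios(ledger, hourly_rate_usd):
    """Compute the ratio for each ledger entry with time-based value."""
    return {
        entry["workflow"]: value_to_cost_ratio(
            monthly_value_from_time(entry["hours_saved_per_week"], hourly_rate_usd),
            entry["monthly_cost_usd"],
        )
        for entry in ledger
    }
```

Time saved is only one kind of value; quality improvements and revenue enabled need their own estimates, entered into the ledger as dollar figures in the same way.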
AI Spending Benchmarks for Different Business Sizes
The investment in doing this well — clear scope, honest measurement, iterative improvement — pays back across every subsequent AI deployment that builds on the same foundation.
The cost management practice that produces the best outcomes is the simplest one that actually gets done. A thirty-minute monthly review that consistently happens beats a comprehensive quarterly analysis that gets deprioritised. Start with what you can sustain, and build rigour incrementally as the stakes justify it.
Applied consistently, this approach compounds in value across every subsequent AI workflow your team builds on the same operational foundation.