AI Compute Demand Growth and Expanding Access to Powerful Resources

When people talk about AI, they often talk about models, features, or “what’s coming next”. In my day-to-day work in marketing automation, I see a quieter constraint that shapes everything you can ship, price, and scale: compute. Not ideas. Not prompts. Not even data, at least not at first. Compute.

That’s why a short message shared by OpenAI in January 2026 landed with such clarity: compute is the scarcest resource in AI, and demand keeps growing. In the related podcast conversation, OpenAI’s CFO Sarah Friar and investor Vinod Khosla spoke with host Andrew Mayne about rising compute demand and ways to bring AI’s benefits to more people.

You and I can’t solve global chip supply or build data centres overnight. Still, we can make practical choices that reduce waste, improve ROI, and help more teams use AI without burning budgets. In this article, I’ll walk you through what “compute scarcity” means in real business terms, why demand keeps climbing, and how you can design AI-powered marketing and sales automations (especially in tools like make.com and n8n) that stay cost-aware and scalable.


What “compute scarcity” actually means (in plain business English)

In AI, “compute” usually means the processing power required to:

  • train large models (very compute-intensive, typically done by AI labs)
  • run models for users (inference: every chat, classification, summary, or extraction costs compute)
  • serve workloads reliably at scale (latency, throughput, peak demand planning)

Scarcity doesn’t always show up as “we ran out”. It shows up as trade-offs:

  • Higher unit costs for each AI action (a call, a message, a generated email).
  • Rate limits and batching constraints that force you to redesign workflows.
  • Capacity planning becoming a board-level topic for vendors and a budget line for customers.
  • Product choices: smaller models, fewer features, stricter quotas, or more caching.

If you run marketing operations or sales enablement, compute scarcity hits you indirectly. You’ll see it in pricing tiers, token limits, “fair use” policies, and the uncomfortable realisation that an automation that looked cheap in a pilot becomes pricey at 10× volume.

Compute isn’t just a technical detail; it’s a strategy constraint

I’ve watched teams build beautiful AI automations that work perfectly… until the first real campaign goes out. Suddenly:

  • every inbound lead triggers three model calls instead of one
  • the workflow repeats summarisation steps because the data wasn’t stored
  • sales reps ask for “just one more” enrichment field and the cost curve bends upwards

Compute scarcity forces you to design deliberately. It rewards teams who treat AI calls like money (because, frankly, they are).


Why demand for AI compute keeps growing

The short answer is: AI is useful, and usefulness spreads. The longer answer has a few moving parts that matter for your planning.

1) More users, more use-cases, more “always-on” workflows

AI started as something you tried. Now it’s something you run continuously:

  • daily lead triage
  • real-time website chat
  • support ticket routing
  • campaign content variations
  • call summaries and CRM updates

Once people trust an AI workflow, they stop thinking of it as a tool and start treating it like a utility. That shifts demand from occasional spikes to baseline consumption.

2) Models get better… and people raise expectations

When quality improves, behaviour changes. You don’t ask for a short summary anymore; you ask for a structured brief, objections, positioning angles, and a follow-up sequence. Each extra step often means extra inference calls or longer outputs, and that means more compute.

3) Multimodal workloads add pressure

Text was only the beginning. Many teams now want AI to work across:

  • voice (call transcription and analysis)
  • images (ad creatives, product images, screenshots)
  • documents (PDFs, proposals, contracts)
  • video (timestamps, highlights, compliance checks)

These workloads can be heavier than plain text and can push you into more expensive model choices.

4) Enterprise adoption brings peak-load realities

Enterprise customers don’t use AI “a bit”. They run it at scale, with compliance, audit logs, and SLAs. That creates predictable demand, but also big peaks: quarter-end reporting, large outbound campaigns, seasonal ecommerce surges, or major product launches.


Compute scarcity meets marketing and sales: what changes for you

If you manage automation or revenue operations, compute scarcity affects three things immediately:

  • Unit economics: cost per lead processed, cost per ticket summarised, cost per meeting analysed.
  • System design: fewer calls, smarter routing, better storage of intermediate results.
  • Governance: who can run what, when, and with which model settings.

When I build client workflows in make.com or n8n, I treat compute like a limited budget line. You can absolutely scale AI automations, but you need a few habits that feel boring at first and then feel like a superpower later.


How to expand access to AI benefits without setting fire to your budget

Let’s get practical. If “bringing AI to more people” is your goal, your constraint is usually cost and reliability, not enthusiasm. Here are patterns I rely on.

Pattern 1: Call the model once, then reuse the result

One of the easiest wins is also one of the most overlooked: store outputs.

Example: a “Lead Research & Outreach” scenario.

  • You fetch a lead’s website text and LinkedIn snippet.
  • You ask the model for a summary and positioning notes.
  • You ask again for email personalisation.
  • You ask again for objection handling.

That’s three or four separate calls. Instead, you can ask for a single structured output in one go (JSON or clearly labelled sections), store it in your CRM or database, and reuse it in later steps.

In my builds, I usually create a field like ai_lead_brief and treat it as the source of truth. You pay once, then you reuse. It’s not glamorous, but it works.
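The "one structured call" idea can be sketched in a few lines. This is a minimal illustration, not a real SDK: `call_model` stands in for whatever function sends a prompt to your provider and returns raw text, and the field names are my own convention.

```python
import json

# Hypothetical prompt -- one request that replaces three or four separate ones.
LEAD_BRIEF_PROMPT = (
    "You are a B2B sales assistant.\n"
    "Given the lead context below, return ONLY valid JSON with keys:\n"
    "summary, positioning_notes, email_opener, objection_handling.\n\n"
    "Lead context:\n{context}\n"
)

def build_lead_brief(context: str, call_model) -> dict:
    """Make one structured model call and parse the JSON result.

    `call_model` is an assumption: any callable that takes a prompt
    string and returns the model's raw text response.
    """
    raw = call_model(LEAD_BRIEF_PROMPT.format(context=context))
    return json.loads(raw)

# Store the returned dict once (e.g. in a CRM field like ai_lead_brief)
# and reuse it in every downstream step instead of calling again.
```

In make.com or n8n the same pattern is one AI module followed by a "save to CRM" step; everything after that reads the stored field.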

Pattern 2: Route tasks to the cheapest model that meets the bar

Not every task needs your “best” model.

  • Classification (spam vs real lead) often works with a smaller model.
  • Extracting fields from a form submission can use lightweight parsing.
  • High-stakes copy (brand voice, regulated industries) may justify a stronger model.

I like to set up a simple “routing” step:

  • If the task is deterministic (extract, label, match), I start small.
  • If the task is creative or sensitive (messaging strategy, claims, legal-ish phrasing), I step up.

You can implement this logic cleanly in both make.com and n8n using conditional branches, and you’ll cut spend without lowering perceived quality.
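As a sketch, the routing step is just a lookup before the model call. The tier names and task labels below are placeholders for whatever your provider and taxonomy actually use.

```python
# Hypothetical model tiers -- substitute your provider's real model names.
CHEAP_MODEL = "small-model"
PREMIUM_MODEL = "strong-model"

DETERMINISTIC_TASKS = {"extract", "label", "match", "classify"}
SENSITIVE_TASKS = {"messaging_strategy", "claims", "legal_phrasing"}

def pick_model(task_type: str) -> str:
    """Start small for deterministic tasks; step up for sensitive ones."""
    if task_type in DETERMINISTIC_TASKS:
        return CHEAP_MODEL
    if task_type in SENSITIVE_TASKS:
        return PREMIUM_MODEL
    # Unknown tasks default to the cheaper tier; escalate if quality dips.
    return CHEAP_MODEL
```

In Make this becomes a router with filters; in n8n, an IF or Switch node feeding two different model nodes.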

Pattern 3: Use thresholds and “confidence gates”

Compute waste often comes from treating every input as equally important. In real life, it isn’t.

Here’s a gating approach I use:

  • Run a cheap preliminary classifier that outputs a confidence score.
  • If confidence is high, proceed automatically.
  • If confidence is low, either escalate to a stronger model or send to a human review queue.

This reduces the number of expensive calls and improves safety. You also avoid the awkward situation where your workflows produce confident nonsense at scale (we’ve all seen it, and yes, it’s painful).
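The gate itself is a threshold check. In this sketch, `classify_cheap`, `escalate`, and `human_queue` are stand-ins for your own integrations (a small-model call, a premium-model call, and a review-queue webhook), and the thresholds are illustrative defaults you'd tune.

```python
def gate(record, classify_cheap, escalate, human_queue,
         high=0.85, low=0.5):
    """Route a record based on a cheap classifier's confidence score.

    classify_cheap(record) -> (label, confidence in [0, 1]).
    All three callables are assumptions, not real APIs.
    """
    label, confidence = classify_cheap(record)
    if confidence >= high:
        return ("auto", label)                   # proceed automatically
    if confidence >= low:
        return ("escalated", escalate(record))   # stronger model
    return ("human", human_queue(record))        # review queue
```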

Pattern 4: Batch work instead of doing everything in real time

Real-time AI feels great. It also costs you more operationally because you design for peak demand and low latency.

For many marketing tasks, batching works just fine:

  • Summarise yesterday’s calls at 2 a.m.
  • Generate weekly account insights every Monday morning.
  • Enrich new leads every hour, not instantly.

Batching helps you smooth compute usage, reduce timeouts, and keep workflows stable.
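The hourly-enrichment variant can be sketched as a schedule check plus one pass over the queue. Both functions are illustrative; in Make or n8n the schedule check is just a cron-style trigger.

```python
from datetime import datetime, timedelta, timezone

def due_for_batch(last_run, interval_minutes=60, now=None):
    """Return True when the next enrichment batch should run."""
    now = now or datetime.now(timezone.utc)
    return now - last_run >= timedelta(minutes=interval_minutes)

def run_batch(pending, enrich):
    """Process queued leads in one pass instead of one call per webhook."""
    return [enrich(lead) for lead in pending]
```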

Pattern 5: Reduce prompt length and stop sending the same context repeatedly

Long prompts feel safe because they include “everything.” They also cost more and can introduce noise.

What I do instead:

  • Keep a short system instruction that defines style and formatting.
  • Pass only the minimum context needed for the step.
  • Use stored summaries rather than raw transcripts where possible.

If you want one quick rule: don’t pay to send the same paragraphs 50 times a day.
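A minimal version of that rule in code: collapse whitespace noise, cap the length, and pass only the context this step needs. The 2000-character cap is an arbitrary example, not a recommendation.

```python
def trim_context(text: str, max_chars: int = 2000) -> str:
    """Keep prompts short: collapse whitespace runs and cap the length."""
    cleaned = " ".join(text.split())
    return cleaned[:max_chars]

def build_prompt(system: str, step_context: str) -> str:
    """Short system instruction + only the minimum context for the step."""
    return f"{system}\n\n{trim_context(step_context)}"
```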


Designing compute-aware AI automations in make.com

make.com shines when you want to build clear, modular automations with strong connector support. Compute-aware design in Make usually comes down to two things: scenario architecture and data storage.

A practical scenario blueprint: “Inbound lead → qualified brief → sales handoff”

Below is a blueprint I’ve used (with variations) for B2B lead handling.

  • Trigger: New lead in your form tool or CRM.
  • Deduplication: Check if the email/domain already exists.
  • Lightweight scoring: Basic rules (company size, country, role keywords).
  • AI brief (single call): Ask for a structured summary and suggested next step.
  • Store: Save the brief into CRM fields.
  • Conditional routing:
    • If score high: notify sales with brief + suggested opener.
    • If score medium: add to nurture with personalised angle.
    • If score low: tag for later or discard.

Two compute-saving choices do the heavy lifting here:

  • You call the model once and store the output.
  • You only use AI for leads that pass basic filters.
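The dedupe and lightweight-scoring steps are deliberately model-free. Here's a rough sketch of that pre-filter; in Make the `seen_domains` set would live in a data store, and every rule, threshold, and field name below is an example, not a recommendation.

```python
seen_domains = set()  # stand-in for a Make data store or CRM lookup

def domain_of(email: str) -> str:
    return email.split("@")[-1].lower()

def should_call_ai(lead: dict, min_score: int = 2) -> bool:
    """Dedupe and rule-based scoring BEFORE any model call is made."""
    domain = domain_of(lead["email"])
    if domain in seen_domains:
        return False                      # deduplication: already processed
    seen_domains.add(domain)
    score = 0
    if lead.get("company_size", 0) >= 50:
        score += 1
    if lead.get("country") in {"US", "UK", "DE", "PL"}:
        score += 1
    if any(k in lead.get("role", "").lower() for k in ("head", "director", "vp")):
        score += 1
    return score >= min_score             # only qualified leads cost compute
```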

Make.com tips that save you money (and headaches)

  • Use clear data contracts: store structured outputs so later modules don’t need another AI call.
  • Set maximum content length when you fetch web pages or transcripts; trim input before sending it.
  • Add replay protection: prevent repeated triggers from re-running expensive steps.
  • Log model usage: write token and cost metadata into a sheet or database if your provider returns it.

Designing compute-aware AI automations in n8n

n8n gives you excellent control. When I need fine-grained routing, custom code steps, or self-hosted execution, I often pick n8n.

A practical workflow blueprint: “Support ticket triage with escalation”

This is a solid pattern for customer success and support teams.

  • Trigger: New ticket in your helpdesk.
  • Pre-processor: Clean the text, remove signatures, strip long quoted threads.
  • Classifier (cheap): category, urgency, sentiment, and whether it mentions billing.
  • Confidence gate:
    • High confidence: set tags and route automatically.
    • Low confidence: send to a stronger model or a human queue.
  • One-call summary: create a short internal summary for agents.
  • Storage: save outputs back to the ticket as internal notes.

In n8n, you can make this even leaner by caching results (for example, keyed by ticket ID + last updated timestamp) so edits or reopens don’t trigger unnecessary recomputation.
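That caching idea fits in a few lines. This sketch uses an in-memory dict; in n8n you might use workflow static data or an external store instead, and `classify` is again a stand-in for your model call.

```python
_cache = {}  # stand-in for n8n static data or an external key-value store

def triage_with_cache(ticket: dict, classify) -> dict:
    """Cache triage results keyed by (ticket id, last-updated timestamp).

    Edits change the timestamp and recompute; reopens with unchanged
    content hit the cache and cost nothing.
    """
    key = (ticket["id"], ticket["updated_at"])
    if key not in _cache:
        _cache[key] = classify(ticket["text"])
    return _cache[key]
```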

n8n tips that keep compute under control

  • Cache aggressively: store model outputs and reuse them across steps.
  • Use “continue on fail” carefully: repeated retries can multiply cost if you’re not careful.
  • Version your prompts: when a prompt changes, you can choose whether to recompute old records or not.
  • Prefer deterministic transforms (code, regex, mapping) before you reach for a model.

Expanding access: what vendors can do, and what you can do as a buyer

The podcast framing focuses on bringing AI benefits to more people. Some of that sits with AI providers: capacity planning, hardware supply, efficiency work, and pricing models. Still, as a buyer and builder, you have real leverage through architecture and governance.

What AI providers typically do to stretch compute further

  • Model efficiency: optimising inference so you get more outputs per unit of power.
  • Smarter scheduling: shifting workloads to balance peaks.
  • Tiered offerings: giving customers choices that match quality needs to budgets.
  • Caching and reuse: reducing repeated compute on identical or near-identical inputs.

You don’t control these directly, but you can design your usage to benefit from them.

What you can do: an “AI usage policy” that doesn’t feel like a straitjacket

I’ve helped teams create lightweight policies that cut cost and improve reliability without slowing people down. A good policy covers:

  • Approved use-cases with a clear owner (marketing ops, rev ops, support ops).
  • Model tiers: which tasks can use budget models vs premium models.
  • Data handling rules: what you can send to third-party APIs and what you must redact.
  • Monitoring: a monthly review of top workflows by volume and spend.

This stops “AI sprawl”, where every team builds separate flows that all do the same thing, badly, with repeat spending. I’ve seen that happen in fast-growing companies, and it’s a bit like leaving the heating on with the windows open.


SEO perspective: why compute scarcity matters for your content strategy

Compute scarcity sounds like an engineering topic, but it shapes the marketing landscape in a few ways that affect your search and content work.

1) AI content production pushes volume up, so differentiation matters more

As AI makes content easier to produce, the internet fills up quickly. That raises the bar for:

  • original insights
  • real examples and workflows
  • specific recommendations tied to constraints (like cost and throughput)

When I write or review SEO content now, I look for a “proof of work” feel: practical steps, clear assumptions, and a voice that signals someone has actually built the thing.

2) Efficient workflows let you publish consistently without burning resources

If your team uses AI for content briefs, outlines, and drafts, you’ll face the same compute trade-offs. You can manage them by:

  • standardising prompts for briefs
  • reusing research summaries across related articles
  • batching content generation and editing cycles

You keep output consistent, and you avoid a messy ad-hoc process where each article starts from scratch.


Cost-aware AI metrics you should track (so you don’t guess)

Compute scarcity becomes manageable when you measure what your workflows consume. These metrics work well for marketing and sales automations:

  • Cost per workflow run (average and p95; the long tail matters)
  • Cost per qualified lead (AI spend divided by leads that pass your qualification bar)
  • Calls per record (e.g., how many model calls per lead, per ticket, per meeting)
  • Cache hit rate (how often you reuse stored outputs instead of recomputing)
  • Escalation rate (how often you need a stronger model or human review)

If you track only one thing, track calls per record. It often reveals waste immediately.
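Computing those two metrics from a usage log is trivial once you log the right fields. The log shape below is an assumption (one entry per workflow run, with a record id, call count, and cache flag); adapt it to whatever your tooling writes.

```python
def usage_metrics(log):
    """Compute calls-per-record and cache hit rate from a usage log.

    Each entry is assumed to look like:
    {"record_id": ..., "model_calls": int, "cache_hit": bool}
    """
    records = {e["record_id"] for e in log}
    total_calls = sum(e["model_calls"] for e in log)
    hits = sum(1 for e in log if e["cache_hit"])
    return {
        "calls_per_record": total_calls / max(len(records), 1),
        "cache_hit_rate": hits / max(len(log), 1),
    }
```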


Common mistakes that quietly inflate compute spend

I’ll name a few patterns I’ve personally had to clean up—sometimes my own, sometimes a client’s. They’re normal mistakes, but they add up.

Repeating summarisation at every step

Someone summarises a call transcript. Then a downstream step summarises the summary. Then a dashboard step summarises again. Store a single canonical summary and reuse it.

Feeding the model raw, messy inputs

HTML pages, email threads, duplicated signatures, disclaimers—these bloat tokens and degrade results. Clean input first.

Overusing “creative” generation where templates would do

If you want a consistent outreach email, you often get better results by combining:

  • a tight template
  • two or three personalised variables
  • a short value prop library your team already trusts

You reduce variance, lower cost, and your brand voice stays intact. Win-win.

Letting retries run wild

Retries are important for reliability, but they can double or triple spend if you don’t cap them. Add guardrails: max retries, exponential backoff, and alerting when error rates spike.
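Those guardrails can be sketched as a capped retry loop with exponential backoff. `call` is any function that hits your model API; the retry cap and base delay are example values.

```python
import time

def call_with_guardrails(call, payload, max_retries=3,
                         base_delay=1.0, sleep=time.sleep):
    """Capped retries with exponential backoff so errors can't multiply spend.

    After max_retries failures the exception propagates, which is where
    your alerting should pick it up instead of looping forever.
    """
    for attempt in range(max_retries):
        try:
            return call(payload)
        except Exception:
            if attempt == max_retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```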


Practical playbook: building “compute-light” AI automations that still feel premium

If you want a simple playbook you can follow this week, use this sequence. I use it myself when scoping new automations.

Step 1: Define the business output in one sentence

  • “Reduce lead response time to under 10 minutes with personalised first-touch messaging.”
  • “Summarise every sales call and write CRM notes in the correct fields.”

This keeps you from adding “nice-to-haves” that multiply compute calls.

Step 2: Identify what must be AI and what must be deterministic

  • Deterministic: formatting, mapping fields, dedupe, routing by region.
  • AI: summarising, extracting nuance, drafting copy with context.

I always try deterministic first because it’s predictable and cheap.

Step 3: Collapse AI tasks into one structured request

Instead of three prompts, write one prompt that returns:

  • summary
  • pain points
  • recommended next action
  • one email opener

Then store it.

Step 4: Add a gate and an escalation path

Decide when the workflow should:

  • auto-complete
  • ask for human review
  • upgrade to a stronger model

This protects quality without turning every record into a premium request.

Step 5: Monitor usage weekly for the first month

In the first month, the workflow is still “new”, and people will push it in ways you didn’t anticipate. I like weekly reviews early on, then monthly once things settle.


How we approach this at Marketing-Ekspercki (and how you can apply it)

In our work at Marketing-Ekspercki, we build AI-enabled automations in make.com and n8n for teams that care about revenue: lead handling, sales support, reporting, and lifecycle messaging. When compute is scarce, I don’t treat it as an abstract industry problem. I treat it as a design constraint, like page speed in SEO or deliverability in email.

Here’s the practical mindset we use:

  • Value first: we tie every AI call to a measurable outcome (time saved, conversion lift, better qualification).
  • Fewer calls, better prompts: we prefer one well-specified request over a chain of vague ones.
  • Reuse outputs: we store and repurpose summaries, briefs, and classifications.
  • Graceful failure: if the model fails, the workflow falls back to a safe path rather than hammering the API.

You can copy this approach even if you build everything in-house. It doesn’t require fancy tooling—just discipline.


Mini examples you can lift for your own workflows

Example A: Personalised LinkedIn follow-ups without excess generation

  • Generate a message angle once per account (industry + role + trigger).
  • Store it in CRM.
  • Use a template to produce 3 variants with small edits rather than full free-form writing every time.
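Example A needs no model call at the variant stage at all. Here's a deterministic sketch: one stored angle plus a template produces three variants; the template text and closers are placeholders for your own copy.

```python
# Hypothetical template -- swap in your team's approved wording.
TEMPLATE = "Hi {first_name}, noticed {trigger} at {company} -- {angle}"

VARIANT_CLOSERS = [
    "worth a quick chat this week?",
    "open to comparing notes?",
    "happy to share what we've seen work.",
]

def make_variants(lead: dict, angle: str) -> list:
    """Three message variants from one stored angle, zero model calls."""
    base = TEMPLATE.format(angle=angle, **lead)
    return [f"{base} {closer}" for closer in VARIANT_CLOSERS]
```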

Example B: Weekly “pipeline narrative” for leadership

  • Pull structured pipeline data.
  • Create a deterministic summary table.
  • Ask AI for a short narrative that explains movements and risks based on the table.

You keep the facts grounded and limit token-heavy raw exports.

Example C: Content brief generation for SEO pages

  • Batch keywords by topic.
  • Ask for one brief per topic cluster, not per keyword.
  • Store briefs and reuse them for supporting articles.

Your editorial system ends up calmer, cheaper, and easier to maintain. Honestly, it feels like tidying your desk and suddenly finding you can think again.


What to take away from the “compute is scarce” message

That OpenAI line—compute is the scarcest resource in AI, and demand keeps growing—reads like an industry headline. For you, it should read like a design brief.

  • If you build AI automations, you’ll win by using fewer calls, cleaner inputs, and better reuse.
  • If you buy AI tools, you’ll win by measuring unit costs and aligning model choices with task value.
  • If you want to widen access inside your company, you’ll win by setting simple policies and giving teams reusable building blocks.

I’ve found that teams don’t need “more AI” to get results. They need better architecture, a bit of governance, and the confidence to keep things simple where simplicity works.

If you want, tell me what you’re automating right now (lead gen, outbound, support, reporting), what tools you use (make.com, n8n, CRM), and roughly how many records you process per month. I’ll suggest a compute-aware workflow outline you can implement without drama.
