GUIDES · 04 JUL 2026 · 5 min

I burned £700 in one day automating with AI — the one guardrail that would have stopped it

A single missing check turned a routine automation test into a £700 API bill. Here's the mechanic that caused it and the fix that takes ten minutes.

Dil Singh

Bad Boy Labs / Studio

Last year I ran a test on an automation I was building. I walked away from my desk. I came back to an API bill that had grown by £700 in a few hours. No server crash, no error message — just a quiet loop doing exactly what I told it to, over and over again.

What actually happened

The build was a document-processing pipeline. It watched a folder, picked up new files, sent them to an AI model for extraction, and wrote the results to a spreadsheet. Straightforward enough.

The bug was subtle. The script marked a job as “done” by writing a row to the spreadsheet. But on restart — which happened automatically when I pushed a small config change — the script re-read the folder, didn’t find its completion record fast enough due to a timing gap, and reprocessed every file from scratch. Then it restarted again. Then again.

Each document sent to the model cost fractions of a penny. Multiply that by hundreds of files, dozens of restart cycles, and a few hours of nobody watching — and fractions become hundreds of pounds.

The mechanic: why loops go wrong silently

AI API calls fail loudly when something goes wrong with the request. They don’t fail at all when you make too many correct requests. There’s no built-in circuit breaker that says “you’ve processed this document three times already, are you sure?”

That’s your job. And the tool for it has a name: idempotency.

An idempotent operation produces the same result whether you run it once or a hundred times. In plain terms: your system should be able to look at any piece of work and answer the question “have I already done this?” before it does anything expensive.

The fix: a processed-IDs ledger

The repair took about ten minutes once I understood the problem. Here’s the pattern, described plainly so you can apply it yourself:

1. Assign every piece of work a stable ID. For a file, that might be its name plus a hash of its contents. For a database row, it’s the primary key. The ID must not change between runs.
2. Write the ID to a ledger before you do the expensive work, not after. If you write it after, a crash mid-process leaves no record and you’ll reprocess. Write it first, with a status of “in progress”, then update to “complete”.
3. Check the ledger at the very start of each loop iteration. If the ID is already there with status “complete”, skip it. Full stop. Don’t re-evaluate, don’t re-fetch, skip.
4. Make the ledger durable. A variable in memory disappears on restart. Use a database table, a flat file, a spreadsheet row: anything that survives a process restart.

That’s it. Four steps. If I’d had this in place, the restart cycle would have checked the ledger, found every file already marked complete, and processed zero documents. Bill: pennies.
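
The four steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the code from the original pipeline: the SQLite ledger, the table layout, and the names `stable_id`, `open_ledger`, `process_folder` and `extract` are all assumptions made for the example. `extract` stands in for whatever function makes the paid API call.

```python
import hashlib
import sqlite3
from pathlib import Path


def stable_id(path: Path) -> str:
    """Step 1: file name plus a hash of the contents, identical on every run."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return f"{path.name}:{digest}"


def open_ledger(db_path: str) -> sqlite3.Connection:
    """Step 4: the ledger lives on disk, so it survives a process restart.

    Keep the ledger file outside the watched folder, or it will be
    picked up as a document itself.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ledger (id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def process_folder(folder: Path, conn: sqlite3.Connection, extract) -> int:
    """Process each file at most once. Returns the number of expensive calls made."""
    calls = 0
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        work_id = stable_id(path)

        # Step 3: check the ledger before doing anything expensive.
        row = conn.execute(
            "SELECT status FROM ledger WHERE id = ?", (work_id,)
        ).fetchone()
        if row is not None and row[0] == "complete":
            continue

        # Step 2: record the work as "in progress" before the paid call.
        conn.execute(
            "INSERT OR REPLACE INTO ledger (id, status) VALUES (?, 'in progress')",
            (work_id,),
        )
        conn.commit()

        extract(path)  # hypothetical stand-in for the AI extraction call
        calls += 1

        conn.execute("UPDATE ledger SET status = 'complete' WHERE id = ?", (work_id,))
        conn.commit()
    return calls
```

One design choice to be aware of: this sketch retries anything still marked “in progress”, on the assumption that an interrupted item should be attempted again. If a single document can crash the process, you may prefer to count attempts and give up after a few.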

Run the numbers on your own setup

You don’t need to be processing thousands of documents for this to matter. Work through it:

1. How many items could your automation process in an hour if nothing stopped it?
2. What does one API call cost? (Check your provider’s pricing page; it’s usually listed per 1,000 or per million tokens, or per call.)
3. Multiply: items per hour × cost per call × hours you might be away from your desk.
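
The multiplication above is simple enough to do on paper, but here it is as a short Python sketch. The function name and the figures are purely illustrative assumptions, not numbers from the incident; substitute your own.

```python
def worst_case_cost(items_per_hour: float, cost_per_call: float, hours_unattended: float) -> float:
    """Items per hour x cost per call x hours nobody is watching."""
    return items_per_hour * cost_per_call * hours_unattended


# Hypothetical inputs: 1,200 items an hour, £0.004 per call, 8 hours away.
estimate = worst_case_cost(items_per_hour=1200, cost_per_call=0.004, hours_unattended=8)
print(f"£{estimate:.2f}")  # prints £38.40
```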

If that number makes you uncomfortable, you need idempotency checks and a spend alert. Most API providers let you set a hard monthly cap or an email alert at a threshold. Set both. The alert tells you something is wrong; the cap stops it costing you more while you sleep.
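
Provider-side caps can be backed up with a limit inside the automation itself. The sketch below is one possible shape for that, under stated assumptions: `SpendGuard`, `BudgetExceeded` and the JSON state file are hypothetical names invented for this example, and the cost passed to `charge` is your own estimate, not a figure reported by the provider. The running total is kept on disk for the same reason the ledger is: a total held in memory would reset on every restart.

```python
import json
from pathlib import Path


class BudgetExceeded(RuntimeError):
    """Raised when the next call would take spend past the local cap."""


class SpendGuard:
    def __init__(self, state_file: Path, max_spend: float) -> None:
        self.state_file = state_file
        self.max_spend = max_spend

    def _load(self) -> float:
        # Read the running total from disk so it survives restarts.
        if self.state_file.exists():
            return json.loads(self.state_file.read_text())["spent"]
        return 0.0

    def charge(self, estimated_cost: float) -> None:
        """Call before each paid request; raises instead of exceeding the cap."""
        spent = self._load()
        if spent + estimated_cost > self.max_spend:
            raise BudgetExceeded(
                f"local cap of {self.max_spend} reached (already spent {spent})"
            )
        self.state_file.write_text(json.dumps({"spent": spent + estimated_cost}))
```

Calling `guard.charge(estimated_cost)` immediately before each API request means a runaway loop stops with an exception once the cap is hit. Treat it as a complement to the provider’s own alert and cap, not a replacement.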

The broader lesson

AI automation fails expensively in a way that traditional software doesn’t. A broken database query throws an error. A runaway AI loop sends you a bill.

The systems we build at Bad Boy Labs, like the document-extraction pipeline we run for a UK supply-chain operator (which saves, conservatively, 30 minutes per document), all carry idempotency checks, spend alerts, and a manual kill switch as standard. Not because we’re cautious by nature. Because we’ve seen what happens without them.

If you’re building something yourself, add the ledger before you go live. If you’re commissioning a build, ask the person building it: “What happens if this restarts mid-run?” The answer should be immediate and specific. If they hesitate, that’s useful information.

One bad afternoon taught me that guardrails aren’t overhead — they’re part of the build. Budget them in from the start.

If you want to see how we structure builds to avoid exactly this kind of failure, the details are on our pricing and scope page.
