Your AI Tools Are Drifting. Here's Why That's Your Job to Fix.
If an AI tool only works when you remember to check it, the checking is part of the system. Drift is normal, but neglect is optional.
Quick Pour
Most AI tools don't fail with a bang. They fail with a slow, polite decline over weeks or months, continuing to give you answers the whole time and looking entirely functional from the outside, while quietly becoming a little less accurate and a little less useful. Then one day you realise you've been building decisions on something that stopped properly matching reality some time ago. That's the nature of drift, and it's the part of AI tool ownership that almost nobody budgets for at the start.
We've been living inside it this month. Nothing broke in the dramatic sense: no system went down, no customer-facing failure, nothing that would have shown up on a status dashboard. But we've spent significant time tightening and retraining tools that were genuinely good when we built them and then started to quietly slip. The refinement work this week was slower and less glamorous than building something new, and it taught us more than most new builds do.
Drift isn't a technical concept; it's a management concept
IBM defines model drift as the degradation of machine learning model performance due to changes in the data or in the relationships between input and output variables. (IBM) That definition comes from classical machine learning, but the pattern shows up in small-business AI work even when you're mostly working with LLMs and prompt-based systems rather than trained models. Your product range changes, so older defaults become wrong. Your pricing changes, so templates built around previous numbers go stale. Your customers start asking different questions, so your support bot starts missing the point in ways that are hard to spot until you read through a few conversations carefully. Your team changes and people start using a tool in ways you never designed for, which is not their fault at all. It's just what happens when a tool meets reality over time.
In other words: your business changes constantly, and the tool doesn't update itself. That gap is drift. And the reason it's a management concept rather than a technical one is that closing the gap is a human job. It requires someone with enough context to notice the tool is off and enough ownership to do something about it.
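To make the "templates go stale" failure concrete, here's a minimal sketch of a change-trigger check: compare the prices a prompt or template was built around against today's price list, and flag anything that has moved. The product names, prices, and dictionary format are entirely hypothetical; the point is only that this kind of check is a few lines of code, not a project.

```python
# Minimal staleness check: compare the prices a prompt template was built
# around against the current price list, and flag any that have drifted.
# All product names and figures below are hypothetical.

def find_stale_prices(template_prices: dict, current_prices: dict) -> list:
    """Return (product, template_price, current_price) for every mismatch."""
    stale = []
    for product, old_price in template_prices.items():
        new_price = current_prices.get(product)
        if new_price is not None and new_price != old_price:
            stale.append((product, old_price, new_price))
    return stale

# Prices baked into a support-bot prompt six months ago...
template_prices = {"vermouth_rosso": 24.00, "vermouth_dry": 22.00}
# ...versus the price list as it stands today.
current_prices = {"vermouth_rosso": 26.00, "vermouth_dry": 22.00}

for product, old, new in find_stale_prices(template_prices, current_prices):
    print(f"{product}: template says £{old:.2f}, current price is £{new:.2f}")
```

Run on a schedule or whenever the price list changes, a check like this turns "somebody eventually notices the bot is quoting old prices" into a prompt-update task that lands the same day.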
The uncomfortable truth: if you ship it, you own it
The danger is the slow drift into assuming something is working when it actually needs review. At this stage of AI adoption, the human is still primarily the sense-checker: the person with instinct, with deep product knowledge, with enough of a feel for the customer to know that an answer sounds plausible but isn't right. Those are the kinds of judgements a tool can't make about itself. And the line that stuck with us is blunt: it's on brand owners, MDs, and CEOs to train the AI tools to care about the details as much as they do.
That's not a call for micromanagement or for treating every AI output as suspect. It's a call for ownership. Somebody in the business needs to hold the relationship with each tool the same way you'd hold a relationship with a member of staff who's good at their job but needs direction to keep improving. Left entirely to their own devices, they'll keep doing what they've always done even when the context has changed. The feedback loop is the management act. Without it, you don't have a system, you have a tool that occasionally misbehaves and nobody notices in time.
Why the ROI stats look so depressing
When people say "AI isn't delivering ROI," a meaningful chunk of that is actually "we didn't operationalise maintenance." You can spend weeks building something genuinely useful, ship it, and still get zero durable value if nobody defined who owns it, how often it gets reviewed, what good output looks like, and where failures get logged. Without those four things you don't have a working tool, you have a working tool for a while. McKinsey's State of AI 2024 report notes that only 46 out of 876 respondents report a meaningful share of EBIT attributable to generative AI. (McKinsey) It's not because the tools are useless. It's because the system around the tools is what actually generates durable value, and the system is much harder to build than the tool itself.
That's why so much AI spend stays experimental. Experimental is what you are when you haven't yet built the review structure. A tool without governance is a pilot that never graduates, and pilots don't show up in EBIT.
A simple drift checklist for small teams
This is the structure we keep coming back to, and it's deliberately unglamorous.
| Drift control | What it looks like in a 6-person company | Why it matters |
|---|---|---|
| Named owner | One person responsible for quality, not "everyone" | Without ownership, drift is invisible |
| Review cadence | 30 minutes weekly, plus a deeper monthly review | Prevents slow decay becoming normal |
| Failure log | A shared doc: "tool did X, should have done Y" | Gives you training material and pattern recognition |
| Change triggers | New product, new price list, new channel, new regulation | Forces updates when the world changes |
| Human sign-off | Critical outputs require a human check | Stops silent errors turning into customer pain |
None of those items are technically demanding. All of them require the kind of discipline that busy teams quietly drop when the tool seems to be working fine. That's exactly when you need them most.
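As a sketch of how lightweight the failure log can be: a shared doc is genuinely enough, but even a structured list of entries gets you the monthly pattern recognition almost for free. The field names, tags, and example entries below are invented for illustration, not a prescribed schema.

```python
# A failure log doesn't need tooling; a shared doc works fine. But a list
# of structured entries makes the monthly review question ("what keeps
# going wrong?") answerable in one line. Entries here are illustrative.
from collections import Counter
from datetime import date

failure_log = [
    {"date": date(2025, 3, 3), "tool": "support-bot",
     "did": "quoted old trade price", "should": "quote current price list",
     "tag": "stale-pricing"},
    {"date": date(2025, 3, 10), "tool": "support-bot",
     "did": "recommended discontinued product", "should": "suggest replacement",
     "tag": "stale-catalogue"},
    {"date": date(2025, 3, 17), "tool": "support-bot",
     "did": "quoted old retail price", "should": "quote current price list",
     "tag": "stale-pricing"},
]

# The monthly review: count failures by pattern, most frequent first.
patterns = Counter(entry["tag"] for entry in failure_log)
for tag, count in patterns.most_common():
    print(f"{tag}: {count} failure(s) this month")
```

The tags are where the value accumulates: two or three entries with the same tag is a retraining task, not a coincidence.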
The point isn't perfection, it's feedback loops
We're not trying to build AI systems that never get things wrong. We're trying to build systems that get corrected quickly when they do, and that improve over time because there's a real feedback mechanism behind them rather than just a hope that they'll keep working. That's the recursive improvement piece: you build a structure in the human process first, you define what good looks like and where failures get captured, and then you get the AI tool to behave inside that structure. Without the human structure in place first, the AI optimises for the wrong thing or just gradually optimises for nothing in particular as its context drifts.
The instinct and the deep customer and product knowledge that brand owners carry in their heads is not a legacy asset that AI will eventually replace. At this stage, it's the essential input that makes AI tools useful at all. It's the thing that knows when an answer sounds right but isn't, or when the tool is technically doing what you asked but missing what you actually needed. Training AI tools to care about those details, and maintaining that training as the business evolves, is still a very human job. And it probably will be for a while yet.
Cross-link: this is why we built the strategy engine
This topic connects directly to our strategy engine work this week. A tool that isn't reviewed is a liability rather than an asset, which means maintenance time needs to be included in the ROI calculation from the beginning, not treated as something future you will handle. Future you is already busy. The strategy engine forces that maintenance cost into the build decision upfront, which is part of why we've started being more selective about what we actually commit to building. You can read more about how that evaluation process works in the strategy engine post.
Frequently asked questions
What is model drift?
Model drift is when a model's performance degrades because the world, data, or input-output relationships change.
Do LLM tools drift too?
Yes. Even if you're "just prompting," your business context changes and the tool can quietly become less accurate or less useful.
How do you stop drift in a small business?
Treat review as part of the workflow: ownership, cadence, logging failures, and updating prompts or training data based on real cases.
Who should own AI tool quality?
The business owner or functional lead. They have the instinct to know when a tool is quietly going wrong.
Robert Berry is co-founder of Asterley Bros, a London-based premium aperitivo company, and Absolution Labs, an AI automation consultancy for drinks businesses. He makes vermouth by day and builds AI systems in the margins.