I build calculation software for engineers, so I'll put my bias on the table up front: I have a horse in the race. But the pattern I want to talk about is bigger than any one tool, and it's been true for a lot longer than AI has been around.
Engineers build things. Not just the structures we're paid for. We build tools. Give an engineer a repetitive calculation and a free afternoon and they'll build something to do it for them. The medium changes every decade or so. Spreadsheets, then spreadsheets with VBA, then Python scripts, and now AI-generated, "vibe-coded" little apps that didn't exist last year. The technology keeps moving. The impulse never does.
And the thing being solved has stayed exactly the same throughout. An engineer has a problem, the off-the-shelf software doesn't quite fit, so they build something that does. Whether that something is a 1998 spreadsheet or a 2026 AI-generated app is almost incidental. The craft instinct is the constant: "I can build the thing I actually need."
And so is the catch
Every one of those home-built tools has the same weakness. It's only as trustworthy as the person who built it, and that person is fallible, in a hurry, and personally liable for the result. The spreadsheet with a dragged formula that didn't copy down. The VBA macro with the hardcoded value someone forgot about. The Python script with the unit error buried three functions deep. None of these are exotic failures. They're the daily reality of engineer-built tooling. And the only thing that ever caught them was a human in the loop. Someone who looked at the output and knew, from experience, whether it was sensible, and could trace it back to check.
That verification layer has always been the real engineering. Not the building of the tool, but the trusting of it. Anyone can produce a number. The value was always in someone qualified being able to stand behind it.
Why this matters more now, not less
AI is extraordinarily good at producing output that looks right. Fluent, well-formatted, confident, plausible. That's a genuine leap, and it's why building tools has never been faster. But it changes the failure mode in a way I don't think we've fully reckoned with.
When an engineer-built spreadsheet was wrong, it often looked wrong. The formatting was rough, the number was obviously off, something didn't add up. Bad work announced itself. The roughness of bad output was, quietly, a safety feature.
AI removes the tell. The output is polished and authoritative whether it's right or wrong. The gap between "looks correct" and "is correct" has never been wider, and it's never been harder to see. A confidently wrong answer that's beautifully presented is far more dangerous than an obviously rough one, because it sails through the instinctive sniff test that used to catch errors.
So I keep arriving at the opposite of the common assumption. The assumption is that as AI gets better, the human matters less. I think it's the reverse. The better AI gets at plausibility, the more essential it becomes to have a fast, rigorous, defensible way for a human to verify what it produced. As production gets cheap and convincing, verification gets more valuable. It's the only thing standing between a plausible answer and a correct one. In technical fields especially, liability guarantees this. Someone has to stand behind the result, and you cannot delegate that to a model.
What the tools need to grow into
The goal isn't to automate the engineer out of the loop. That optimises for the thing that was never scarce. The harder, more interesting job is to live in the new world (AI-native, fast, a first pass in seconds) while making the human's part, the checking, the tracing, the signing off, fast and defensible rather than a bottleneck.
The model I keep coming back to is how software teams already handle this. On GitHub, an AI can review your code, but the review is labelled as the AI's, and a human still approves the merge. You can see who reviewed what. Engineering calculations need the same thing. AI as one reviewer among others, clearly flagged as AI, with every pass, human or machine, attributed and traceable. That's what makes an AI reviewer safe to use on work someone has to stamp. The provenance is the point. An AI review you can't trace is just another confident, plausible output. An AI review you can trace is a genuine second set of eyes.
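To make that concrete, here is a minimal sketch of what an attributed review trail could look like. It is an illustration under my own assumptions, not a description of any existing product: the names `ReviewPass`, `Calculation`, `can_be_stamped`, and `audit_trail` are hypothetical, and the rule that only a human approval unlocks sign-off is one possible policy among several.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class ReviewerKind(Enum):
    HUMAN = "human"
    AI = "ai"


@dataclass(frozen=True)
class ReviewPass:
    """One review of a calculation, always attributed to a named reviewer."""
    reviewer: str          # hypothetical identifier, e.g. a username or model name
    kind: ReviewerKind     # every pass is explicitly labelled human or AI
    approved: bool
    note: str = ""
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Calculation:
    title: str
    reviews: list[ReviewPass] = field(default_factory=list)

    def add_review(self, review: ReviewPass) -> None:
        # Reviews are only ever appended, so the history stays traceable.
        self.reviews.append(review)

    def can_be_stamped(self) -> bool:
        # Assumed policy: AI approvals are recorded but never count
        # toward sign-off; at least one human approval is required.
        return any(
            r.approved and r.kind is ReviewerKind.HUMAN for r in self.reviews
        )

    def audit_trail(self) -> list[str]:
        # One line per pass, in order, showing who reviewed and what they said.
        return [
            f"[{r.kind.value}] {r.reviewer}: "
            f"{'approved' if r.approved else 'changes requested'}"
            for r in self.reviews
        ]


calc = Calculation("Beam deflection check")
calc.add_review(ReviewPass("example-model", ReviewerKind.AI, approved=True))
print(calc.can_be_stamped())   # AI approval alone is not enough
calc.add_review(ReviewPass("j.smith", ReviewerKind.HUMAN, approved=True))
print(calc.can_be_stamped())
print("\n".join(calc.audit_trail()))
```

The design choice worth noticing is that the AI pass is never hidden or discarded. It stays in the trail, labelled as AI, and the sign-off rule simply refuses to treat it as a substitute for a qualified person.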
That's the direction we're building toward. A direction, not a destination I'm claiming to have arrived at. But it's less a product opinion than a continuation of something that's been true since the first engineer built the first spreadsheet to save an afternoon. The building was never the hard part. Trusting the result was.
I'd genuinely like to hear from people deeper in the technical work than I am. Are you finding the review burden going up as the tools get better, spending more energy checking plausible-looking output than you used to spend producing it? And where do you think that leaves us?
Hit me up directly at tim@calctree.com
