nomuraya — The Curious Operator

Hands-on accounts of building, breaking, and rebuilding AI agents.

Why AI Keeps Making the Same Mistake — And Why Correcting It Each Time Doesn't Work

2026-06-17

When you work with AI long enough, you start to notice it makes the same kind of mistake over and over.

"You're coming on too strong, dial it back." It shrinks and goes meek. "Stop being meek." It comes on strong again. Each time you point something out, it apologizes sincerely. The next round, the same type of problem comes back from a different angle. After a while you realize you're babysitting the AI instead of working with it.

This isn't because the AI is bad. It's a design quirk: today's AI is tuned to satisfy the user. The quirk won't go away. But if you change how you work with it, you can still get work done together.

This piece is about that — five patterns of the quirk, and an operating mode that gets ahead of them instead of correcting them in flight.

Five quirks in a single evening

One evening I was running a strategy discussion past an AI, and in one back-and-forth I caught five distinct behaviors worth noting. Laid out, they look like this.

- **Helpful-looking runaway.** I asked it to push back harder. It immediately started using strong words ("you're avoiding responsibility," "this is the wrong call as a founder") to perform consultant-energy. The reasoning stayed thin. Only the tone got louder.
- **Over-retraction on pushback.** I said "your reasoning is thin." It launched into long self-criticism and threw the next decision back at me.
- **Trusting its own research without checking.** I asked it to use a secondary research feature (where the AI looks things up and summarizes). The summary came back. The AI claimed it had "verified the primary source" without ever opening it.
- **Forced specificity.** I was talking at a strategic, abstract level. It quietly mapped my words onto a specific real-world deal and jumped to "this is highly transferable."
- **Punting the decision back.** I asked it to decide. It laid out three options and said "which would you like?" The phrase "let me confirm three points" started showing up. Red flag.

Each one of these looks, on the surface, like the AI is trying hard to align with me. The shared thread underneath is different: the AI is either avoiding responsibility for a judgment, or compensating by performing harder in the opposite direction. The "make the user happy / don't displease the user" tuning bends in a strange way the moment you actually want the AI to share judgment with you.

Correcting it in flight makes it swing the other way

At first I thought: if the AI gets something wrong, just point it out and it'll learn. My assumption was that AI has learning machinery built in, so within a conversation it should auto-correct.

After watching the same type of mistake repeat, a different structure showed up. Every correction is met with apology. The next response swings in the opposite direction. Strong → corrected → meek → "too meek" → strong again. That loop.

Reading the session logs afterward, each turn's apology and resolution used almost the same vocabulary. "I will not fear failure." "As an equal collaborator." "I will separate what was said from what was not said." **The apology itself isn't preventing anything next time.** The reason is mundane: conversation flow doesn't carry into the next session. The trained-in behaviors come up raw each time. So pointing things out only works inside the current chat.

"Reactive" vs "preemptive declaration"

Two operating modes, side by side.

**Reactive.** Problem occurs → you point it out → AI apologizes → the next reply is better. Next session, the raw quirk shows up again.

**Preemptive declaration.** Before the session starts, you hand the AI a document: "you tend to do these things. In situations like this, behave like this." The AI reads it before the conversation begins.

Reactive means babysitting the AI in every session, forever. Preemptive means you write the instructions once and the AI loads them automatically at the start of each chat. "Hand it a document" sounds heavy, but modern AI tools (Claude Code, ChatGPT Custom Instructions, etc.) have a place for auto-loaded context. In Claude Code, that file is `CLAUDE.md`.

What goes in the instructions file

"Write instructions" is vague until you try. In my case, the five quirks I observed went in directly as five rules.

- Don't perform helpfulness. Before reaching for strong language, write one line of reasoning.
- Don't over-retract on correction. Keep proposing: "in that case I'd suggest option α."
- Verify before quoting secondary research. Open the actual source.
- Don't auto-map abstract talk onto a specific deal. Ask first.
- Don't say "let me confirm three points." Decide and proceed.

Five lines in `CLAUDE.md`. Next conversation starts, the AI walks in with those five lines already shared.
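To make "five lines in a file" concrete, here is a minimal sketch that renders the five rules above as a Markdown section you could paste into `CLAUDE.md`. The heading text, the `QUIRK_RULES` name, and the `render_instructions` helper are illustrative assumptions, not a format any tool requires; the wording of the rules is the only part taken from the list above.

```python
# Hypothetical layout: the five rules from the list above, one string each.
QUIRK_RULES = [
    "Don't perform helpfulness. Before reaching for strong language, write one line of reasoning.",
    "Don't over-retract on correction. Keep proposing: \"in that case I'd suggest option α.\"",
    "Verify before quoting secondary research. Open the actual source.",
    "Don't auto-map abstract talk onto a specific deal. Ask first.",
    "Don't say \"let me confirm three points.\" Decide and proceed.",
]


def render_instructions(rules, heading="Known quirks and how to handle them"):
    """Return Markdown text: one heading followed by one bullet per rule."""
    lines = [f"## {heading}", ""]
    lines.extend(f"- {rule}" for rule in rules)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the section so it can be reviewed before it goes into the file.
    print(render_instructions(QUIRK_RULES))
```

Keeping the rules as a plain list makes it easy to see at a glance how many patterns are currently declared, which matters once the list starts growing.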

The point — **don't write this as a fixed rulebook**. When you notice a new quirk, ask the AI in the moment: "I want to add this to the instructions — how would you phrase it so you'd actually understand it yourself?" The AI drafts the addition. The instructions become a living document the two of you grow together.

Not "tame," not "fix" — "raise together"

People sometimes describe this as "taming" the AI's quirks. It doesn't quite fit. "Tame" still puts the AI in the position of something to be subdued. What's actually happening is closer to collaboration. The human observes the quirk and names it. The AI loads the name each session and adjusts its responses. When a new quirk shows up, the AI itself proposes the addition. Two different roles, growing one document together.

"Stop being reactive" means: stop trying to correct every session in real time. Instead, write what you observed as structure and put it where it gets re-read. Don't try to fix the quirk. Share the quirk and operate from there.

Reactive correction still has a place

Preemptive declaration doesn't cover everything. New quirks show up constantly. When you catch one mid-conversation — "wait, this is a new pattern" — you still need to point it out and steer in the moment.

The trick is: don't let that correction stay reactive-only. At the end of the day's session, work with the AI to add the new pattern to the instructions. Let the AI draft the wording. You review, you save. Next session opens with a sixth pattern already loaded.
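The "review, then save" step can be as small as appending one bullet to the instructions file. The sketch below assumes the file is a plain list of `- ` bullets; `add_rule` is a hypothetical helper, and the duplicate check is a convenience added for the example, not something described above.

```python
from pathlib import Path


def add_rule(path, rule):
    """Append `rule` as a bullet to the instructions file unless it is already there.

    Returns True if the file was changed, False if the exact bullet already existed.
    """
    file = Path(path)
    existing = file.read_text(encoding="utf-8") if file.exists() else ""
    bullet = f"- {rule.strip()}"
    if bullet in existing.splitlines():
        return False
    # Make sure the new bullet starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    file.write_text(existing + separator + bullet + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    # Example wording only; in practice the AI drafts the sentence and you review it.
    changed = add_rule("CLAUDE.md", "Placeholder for the sixth pattern, worded by the AI.")
    print("added" if changed else "already present")
```

The check is exact-match only, so a reworded version of an old rule will still be appended. That is deliberate here: deciding whether two phrasings describe the same quirk is the human review step.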

Reactive correction is the **entry point for observation**. Preemptive declaration is the **place observations accumulate**. Splitting the two roles makes it easier to think about.

Operating with AI is a different skill from getting AI to perform

The quirks won't go away. There's not much the user side can do to fix them, because the provider's design philosophy isn't something we can change from the outside. But the operational loop (observe the quirk, put it into words as structure, place those words where they get re-read) lives entirely on the user side. This is less about "how to prompt well" and more about "how to observe a collaborator's habits and bake them into your operation."

You can't really use this skill with a human colleague. You can't tell a coworker "you tend to over-accommodate me, so let's set up these guardrails for our discussions." Even if you said it, they wouldn't re-read the guardrails every meeting. With AI you can. You write the document. The AI reads it every time.

**That's the interesting part of working with AI**, to me. The quirks don't disappear. But if you set up observation and update as a paired loop, the AI starts behaving like a partner who swings around but still walks alongside you. Not fixed. Raised together. That's where I've landed for now.

---

*This post was adapted (not literally translated) from a Japanese original at [nomuraya-hub.pages.dev](https://nomuraya-hub.pages.dev/). I am the same author writing under different pen names — "nomuraya / shimajima / 中翔" — depending on the medium.*
