Juniors Ship Faster, Seniors Cringe Harder
Lately I’ve been talking to devs about AI — devs at multiple levels of experience — and I keep hearing the same thing, just from different angles.
Juniors mostly say: “I ship so much faster now.” Seniors say: “Yeah but have you seen what it generates?”
Here’s what I mean.
A junior dev with AI ships in hours features that used to take days. The code works, passes review, looks clean. Sure, sometimes they ship code they don’t fully understand. But honestly? Juniors have always done that. It’s called learning. AI just made the output look more confident.
But then I talk to senior devs and the vibe is completely different.
One told me: “AI generates code that works. But it feels… mediocre.”
And I felt that in my soul.
Because that’s exactly the problem. You spent years learning clean architecture, readable code, elegant solutions. And now AI gives you something that works but makes you want to refactor everything. So you spend 2 hours rewriting code that already passed tests. And then you wonder — is it still worth it?
What “mediocre” actually means
Let me give you a concrete example. Ask AI to build a webhook handler that syncs orders to an ERP. You’ll get something like:
```javascript
// AI-generated — clean, simple, works in a demo
async function syncOrder(orderId) {
  const order = await shopify.getOrder(orderId);
  const transformed = transformToErpFormat(order);
  await erp.createOrder(transformed);
}
```
It works. It’s clean. It would pass most code reviews.
A senior engineer looks at it and immediately sees what’s missing:
- What if the webhook fires twice? There’s no idempotency check.
- What if the ERP is down? No retry strategy.
- What if the order was cancelled between webhook and processing? No state validation.
- What if Shopify throttles the API call? No backoff.
- What about logging? When this fails at 2 AM, how do you debug it?
Here’s what the production-grade version actually looks like:
```javascript
// what actually runs in production
async function onOrderWebhook(shop, webhookPayload) {
  // don't process now — the payload is already stale
  const payload = { shop, order: webhookPayload };
  await taskQueue.enqueue({
    endpoint: "/orders/process",
    payload,
    delay: 120, // let Shopify finalize the order
    taskId: hash("/orders/process" + JSON.stringify(payload)), // same webhook twice? deduped.
  });
}

async function processOrder(data) {
  const order = await shopify.getFreshOrder(data.order.id); // fresh, not stale webhook data
  if (order.tags.includes("sync-disabled")) {
    throw new SkipTask("sync disabled for this order"); // → 200 OK, don't retry
  }
  const erpPayload = await transform(order); // with VAT validation, etc.
  await erp.send(erpPayload);
  // if erp says "already exists" → treat as success (idempotent)
  // if transient error → throw RetryTask() → 503, queue retries with backoff
  // if permanent error → alert the team
}
```
That’s the same “sync an order” feature. But this version survives duplicate webhooks, stale data, cancelled orders, rate limits, ERP downtime, and network failures. The AI version handles none of that.
The junior sees working code. The senior sees a production incident waiting to happen.
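If the error-class comments in that snippet look hand-wavy, here’s the idea spelled out as a tiny dispatcher. This is a sketch of the pattern, not any specific framework’s API — `SkipTask`, `RetryTask`, and `runTask` are names I’m making up for illustration; the only real convention is that many task queues decide whether to redeliver based on the HTTP status you return:

```javascript
// Hypothetical error types a task handler can throw.
class SkipTask extends Error {}  // permanent and expected: ack the task, don't retry
class RetryTask extends Error {} // transient: ask the queue to redeliver later

// Wrap a handler and translate its outcome into an HTTP status,
// which is what the queue uses to decide on retries.
async function runTask(handler, data) {
  try {
    await handler(data);
    return { status: 200 }; // success: ack
  } catch (err) {
    if (err instanceof SkipTask) {
      return { status: 200, skipped: err.message }; // ack anyway: retrying won't help
    }
    if (err instanceof RetryTask) {
      return { status: 503 }; // queue redelivers with backoff
    }
    return { status: 500, alert: true }; // unknown failure: page the team
  }
}
```

The point of the wrapper is that "don’t retry" is an explicit decision, not an accident of whichever exception happened to propagate.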
The real split
So the real split isn’t junior vs senior. It’s simple vs complex.
Simple stuff? Almost anyone can do it. Even non-technical people can open a PR for that now, straight from the browser, no editor needed. That’s a fact.
Complex stuff? Still hard. Still slow. Still needs experience.
The middle ground is where it gets interesting. That’s where AI-generated code looks right but has subtle issues that only surface under load, at scale, or at 2 AM on a Friday. And that’s exactly where senior experience matters most — not in writing the code, but in reviewing it with the right questions:
- What happens when this fails halfway through?
- What happens when the input isn’t what we expect?
- What happens when this runs concurrently?
- What happens at scale?
AI doesn’t ask these questions. It generates the happy path beautifully and leaves the edge cases to you.
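The concurrency question is the one I see missed most often. Here’s a deliberately tiny sketch of the difference a stable task id makes — an in-memory set stands in for the queue’s dedup store, and every name here is illustrative, not a real API:

```javascript
// Naive handler: processes every delivery, so a duplicate webhook
// creates a duplicate order in the ERP.
function makeNaiveHandler(erpOrders) {
  return (orderId) => {
    erpOrders.push(orderId);
  };
}

// Deduped handler: derive a stable id from the payload and skip
// deliveries we've already seen. (A real queue persists this set,
// which is what makes it safe across concurrent workers.)
function makeDedupedHandler(erpOrders, seen) {
  return (orderId) => {
    const taskId = `order-sync:${orderId}`;
    if (seen.has(taskId)) return; // duplicate delivery: no-op
    seen.add(taskId);
    erpOrders.push(orderId);
  };
}
```

Send the same webhook twice to the first handler and you get two ERP orders; the second handler absorbs the duplicate. That one-line `taskId` check is exactly the kind of thing the happy-path version never includes.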
The uncomfortable question
And then there’s this uncomfortable question for seniors:
Is your standard for code quality still serving you? Or is it slowing you down on things that don’t need to be beautiful?
I see this on my team. A senior dev gets an AI-generated PR for a simple CRUD endpoint. The code works, tests pass, it does what it should. But the variable names are slightly off. The error messages could be more descriptive. There’s a nested ternary that could be an early return. The service method is 40 lines when it could be 25.
So they leave 12 comments on the PR. The junior fixes them. Another review round. Two days pass. For a CRUD endpoint.
Was the code better after the review? Yes. Was it worth two days? For a CRUD endpoint that three people will ever touch? I’m not sure anymore.
Where I’ve landed (for now)
I’m honestly not sure yet. Part of me says “good enough” is fine for 80% of tasks. The other part physically can’t look at a 200-line function without refactoring it.
But here’s where I’ve landed — at least for now:
For critical paths — order syncs, payment processing, data pipelines — I want the senior-level code. Every edge case handled, every failure mode considered. Paranoid code. The kind where you guard every field, plan for every retry, and log everything you’ll need at 2 AM.
For everything else — UI components, simple endpoints, configuration changes, standard CRUD — “it works and it’s readable” is the bar. Not “it’s the most elegant solution I can imagine.”
The way I think about it: code quality isn’t a single bar. It’s a spectrum, and where you set the bar should depend on what the code does, not who wrote it.
Maybe the real skill now isn’t writing perfect code. It’s knowing when perfect matters — and when “it works” is enough. It’s being able to look at mediocre-but-functional code and consciously decide: this is fine. Ship it.
That’s harder than it sounds when you spent a decade training yourself to do the opposite.
What this means for team leads
If you’re leading a team, this shift changes how you need to think about code review:
- Review for risk, not style. Does this code handle failure? Is it secure? Will it scale? Those matter. Variable naming and early returns? Leave them for the linter.
- Match review depth to change impact. A new ERP integration gets a deep review. A new tooltip does not.
- Coach juniors on the why, not just the what. When AI writes their code, make sure they understand why the production version needs delayed processing, idempotency checks, and explicit error handling — not just that it does.
- Let go of “I would have done it differently.” If the code is correct, readable, and handles errors — it’s good enough. Your way isn’t the only way.
Still figuring this one out. But at least now I know what the question is.