How I Bit Off More Than I Could Chew Building This SaaS — Even With the "Best" LLMs
Part 1 of 5: The BenchBoard Build - The Honest Truth About Building with AI
I’m Rad. Founder of BenchBoard. Engineer. Dad. Youth Coach. I’ve been writing code for decades — starting on a Commodore 64 with programs I copied out of library magazines onto cassette tapes. This series is about what actually happens when a veteran developer tries to build a real product with AI tools. Not the highlights reel. The whole messy truth.
I’ve been building BenchBoard since November, but the truth is the planning started way back in July. What began as a clean, exciting idea — a team management and scorekeeping app for youth baseball and softball — slowly evolved into a full-blown beast. The kind of beast you don’t fully understand until you’re already riding it downhill with no brakes.
And here’s the part I didn’t expect: using LLMs made the journey both faster and way more chaotic.
If you’ve spent any time on tech Twitter or LinkedIn, you’ve seen the posts. “I built a SaaS in a weekend with AI!” “No-code to $10K MRR!” I’m not here to tell you those people are lying. Maybe they built something. But I’d bet my last Docker container it wasn’t handling live scorekeeping for a 12U softball game where a guest player shows up wearing the same jersey number as your starting shortstop.
This series is for the people who want the real version of the story.
The First Big Lesson: Never Fully Trust the AI
When I started, I leaned heavily on ChatGPT. At the time, it felt like a superpower — instant architecture suggestions, instant code, instant explanations. But LLMs evolve fast. What they tell you in July isn’t what they tell you in November. And what they tell you in November isn’t what they tell you in January.
I learned the hard way that if you let the AI dictate how your system should operate, you’ll end up rebuilding major chunks of your app over and over. That happened to me more than once. Not because the AI was “wrong,” but because it doesn’t understand the full context of your system, your constraints, your users, or your long-term vision.
It’s confident. It sounds right. But it doesn’t know.
Here’s a simple example. Early on, I asked the AI how to structure my data layer — how teams, players, games, and lineups should relate to each other. It gave me a clean answer. Three separate tables for managing players, batting order, and defensive positions. Looked great on paper. Months later, I realized those three tables needed to be one — a single LineupSnapshot that holds everything atomically. The AI’s original suggestion wasn’t wrong for a textbook exercise. But it was wrong for a real app where a coach is juggling lineup changes on their phone while kids are warming up on the field.
That refactor took days. Not because the code was hard to write — the AI helped me rewrite it fast. But because the decision to consolidate those tables required understanding how coaches actually use the app mid-game. No LLM has that context. I did, because I’ve been that coach.
The Scope Problem (AKA: The Monster I Created)
Another mistake: I didn’t properly scope what I was building early on. Every time I solved one problem, I discovered three more. Every time the AI suggested a “better” approach, I’d rethink the architecture. Before long, the scope ballooned.
Let me give you a taste of what I mean. Take something that sounds simple: “Let the coach enter the opposing team’s lineup before a game.”
Sounds like a form with some text fields, right?
Here’s what it actually requires:
How do you split a full name into first and last when someone types “Mary Jo Smith”?
How do you deduplicate players when jersey numbers aren’t reliable because guest players rotate in and out?
Should the coach pick from a dropdown of known opponent players, or type fresh names every game?
If a coach corrects a player’s name mid-game during scorekeeping, does that correction propagate back to the opponent’s player record for future games?
What happens when a player is removed from the lineup — do you delete the player record entirely, or just remove them from the game?
What about prefilling positions from the last time you played this team?
What about jersey number conflicts — two rows with #7?
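Even the very first question on that list hides a decision. Here's a minimal sketch of the naive heuristic — last token is the last name, everything else is the first name — which is my illustration, not BenchBoard's actual parser:

```python
def split_full_name(full_name: str) -> tuple[str, str]:
    """Naive heuristic: treat the final token as the last name and
    everything before it as the first name, so "Mary Jo Smith" becomes
    ("Mary Jo", "Smith"). It breaks on names like "Ana Van Der Berg" —
    which is exactly why the coach needs a way to correct it later."""
    parts = full_name.strip().split()
    if not parts:
        return ("", "")
    if len(parts) == 1:
        return (parts[0], "")  # single token: call it a first name
    return (" ".join(parts[:-1]), parts[-1])
```

Ten lines of code, and it's already wrong for some kids on some rosters. That's the shape of every item on that list.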
That’s one feature. And we haven’t even begun to put the damn code in yet! I had a vision of a cool drag-and-drop feature for the batting order that would be extremely helpful for coaches, but you also need to think about what happens when the lineup changes before a live game starts. What happens when David or Grace shows up last minute to play? Every single one of those gaps was something I identified through domain knowledge — from standing on actual fields, watching actual coaches fumble with clipboards and group texts.
The AI never surfaced a single one of those edge cases on its own.
Not one.
Eventually I had to force myself to define a hard cutoff — a line where I’d stop adding, stop rethinking, and start shipping. That discipline didn’t come naturally. It came from pain.
What the AI Actually Fixed (And It’s Not Nothing)
I don’t want to sound like I’m bashing AI tools. I’m not. They fundamentally changed how I work, and I wouldn’t go back.
Here’s what shifted. For the first 35-plus years of my career, when I hit a wall — a syntax issue, a connectivity problem, a framework quirk — I’d open a browser and start searching. Stack Overflow. Blog posts. GitHub issues. Sometimes I’d spend 45 minutes just figuring out the right search query to describe what I was dealing with. Then I’d read through six different answers, figure out which one applied to my specific stack, adapt it, test it, and move on.
That entire workflow is mostly gone now. The LLM handles the tedious stuff — the boilerplate, the syntax lookups, the “how do I connect X to Y” problems — in seconds. It’s like having a junior developer who has read every documentation page ever written, sitting right next to you, never getting tired. But this developer doesn’t know your product; it only knows what you tell it.
And here’s the thing nobody tells you about that time savings: it doesn’t make your project smaller. It makes you faster at discovering how big your project actually is.
All that time I used to spend on Stack Overflow? Now I spend it on architecture decisions, domain logic, and scope management.
The boring problems got automated.
The hard problems got exposed.
And the hard problems are harder — because no AI can make those calls for you.
Programming Knowledge Still Matters — A Lot
We live in a world where people genuinely believe you can build a production app without deep programming knowledge. And honestly, you can build something. You can get a landing page, a database, some API routes, maybe even a working prototype. Take a few minutes on Starter Story and you’ll see developers and normies alike putting stuff up fast when it comes to solving quick problems, and honestly, I have no issues with those guys. More power to you.
But if you’re building something complex — something that will eventually serve hundreds or thousands of real users in real time — you still need to understand how systems behave. How databases work. How entities relate. How state flows across web and mobile. How APIs communicate. How your customers will actually use the thing.
LLMs can generate code, but they can’t architect your system for you. They can’t foresee the scaling issues. They can’t understand the nuance of your domain. They can’t tell you when something “feels wrong.”
That’s your job.
And if you don’t understand what the LLM is doing — or why — you’re flying blind. That’s a trap plenty of new programmers and so-called vibe coders fall into hard.
I’ve seen the AI confidently generate code with race conditions baked in, where two different parts of the system were writing to the same database record simultaneously, and whichever request finished last would silently overwrite the other. If you don’t know what a race condition is, you’d never catch it. You’d just wonder why your app randomly loses data. Copying and pasting your error logs into the LLM’s chat box over and over won’t bail you out.
That’s not a hypothetical. That happened to me. And I’ll tell you the full story in Part 2.
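Before Part 2, here's a stripped-down sketch of that failure mode — hypothetical names, not BenchBoard's real code. Two writers read the same record, each changes its own field, and whoever saves last silently clobbers the other. The fix shown is optimistic concurrency: a version check that rejects stale writes.

```python
import threading

class Store:
    """Toy in-memory record store. One record, two save strategies."""

    def __init__(self):
        self._version = 0
        self._data = {"score": 0, "lineup": "A"}
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._version, dict(self._data)

    def save_lww(self, data):
        """Last-write-wins: blindly overwrite. This is the bug — a
        concurrent writer's changes vanish without any error."""
        with self._lock:
            self._data = data
            self._version += 1

    def save_checked(self, expected_version, data) -> bool:
        """Optimistic concurrency: refuse the write if someone else saved
        since we read. Caller must re-read, re-apply, and retry."""
        with self._lock:
            if self._version != expected_version:
                return False
            self._data = data
            self._version += 1
            return True

# Lost update with last-write-wins: the score change silently disappears.
store = Store()
_, d1 = store.read()
_, d2 = store.read()
d1["score"] = 5          # writer 1 updates the score
d2["lineup"] = "B"       # writer 2 updates the lineup
store.save_lww(d1)
store.save_lww(d2)       # clobbers writer 1's score update
```

With `save_checked`, the second writer gets a `False` back instead of silently winning — annoying to handle, but the data survives. The real version of this story, with the actual pipeline, is what Part 2 covers.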
Where This Leaves Me Now
BenchBoard has grown into something far bigger than I expected. It’s been frustrating, hilarious, exhausting, and incredibly rewarding. I’ve rebuilt parts of it multiple times. I’ve learned more in the last several months than I expected to learn in a year.
But the biggest takeaway?
AI is an accelerator, not a replacement. It can help you move faster, but only if you know where you’re going.
And if you don’t — it’ll happily lead you into a ditch with a smile.
Next up — Part 2: “The Architecture the AI Couldn’t See.” I’ll walk you through the night the AI built a race condition into my live scorekeeping system — full code and everything. It confidently proposed a band-aid fix, and I had to stop everything and redesign the entire data pipeline myself.
It’s the story that made me realize: the AI is a great passenger, but you’d better not let it drive.
If this resonated with you — whether you’re a fellow builder, a developer who’s been in the game for decades, or someone just curious about what building real software actually looks like behind the curtain in today’s AI-driven world — subscribe. This is a five-part series, and it only gets more real from here.