I use AI on almost every project now. It saves hours. It also ships embarrassing mistakes if I let it run without guardrails.
On DealGPT the models helped draft agent configs and early UI states. On Zesty they were useful for menu copy and internal docs. Neither product would be better if I had pasted model output straight into production. The useful part was speed on the boring bits, then my team arguing about what actually mattered for restaurant staff at 8pm on a Friday.
This is how I think about AI-assisted SaaS development after ten years of building web and mobile products: AI is a fast intern with an overconfidence problem. You still need judgment.
What "AI slop" looks like when you are building
Slop is not just bad blog posts. You see it in products too.
A landing page that mentions "streamline workflows" and "unlock insights" but never says who pays or why. A dashboard with six charts and no obvious next step. Code that demos fine until two accounts try to access the same record. Docs that answer questions nobody asked.
You have probably shipped some of this. I have. Usually right before a founder realizes the MVP needs another two weeks and the AI-generated scope doc was lying.
Symptoms I watch for
- Landing copy with no buyer: reads like a template. No role, no job, no pain.
- Feature lists from competitors: scope grows before anyone validates the core loop.
- Charts nobody asked for: pretty analytics, unclear what to do next.
- Demo-grade permissions: works in a screen recording. Breaks with real orgs.
- SEO articles with no author: correct grammar, zero reason to trust the advice.
Give AI a job title, not the whole company
The failure mode I see most: one person asks the model to be PM, architect, designer, engineer, QA, and marketer in a single thread. You get a coherent document and an incoherent product.
What works better is narrow tasks. Cluster interview notes. Propose three schema options with tradeoffs. Draft test cases for a billing webhook. Rewrite an empty state so it tells the user what to upload. Summarize fifty support tickets by topic.
You decide what is true, what ships, and what gets cut.
[Figure: AI generates options; humans decide what ships]
Start with evidence, not prompt engineering theater
The best outputs I get are never from clever system prompts. They come from messy real inputs.
Last week that might mean twelve interview snippets, a Loom of someone failing onboarding, or a spreadsheet of manual steps a client still does in Excel. Strip anything sensitive, paste the substance, ask a focused question.
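The "strip anything sensitive" step can be partly scripted before anything reaches a model. This is a minimal sketch, assuming notes are plain text; the two patterns and the `redact` helper are illustrative names of mine, and they only catch obvious email and phone shapes. Names, addresses, and account numbers still need a human read.

```python
import re

# Illustrative patterns only: they match common email and phone shapes,
# not every format you will meet in real interview notes.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace emails and phone-like numbers with placeholders."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)


note = "Call Dana at +1 (555) 010-2233 or dana@example.com about approvals."
print(redact(note))  # Call Dana at [phone] or [email] about approvals.
```

Treat this as a first pass, not a privacy guarantee. I would still skim the redacted text before pasting it anywhere.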
The input changes everything
Weak input:
- Give me SaaS ideas
- No customer context
- Generic feature brainstorm
- Feels productive, teaches nothing
Strong input:
- Here are notes from 12 agency owner calls
- Which pains repeat? Quote their words
- Group into jobs, not features
- Three MVP options I can validate next week
Weak:
Give me ideas for a SaaS app.
What I send instead:
Here are 12 customer interview notes from agency owners who struggle with client approvals.
Identify repeated pain points, quote the customer language, group the jobs-to-be-done, and suggest three focused MVP workflows. Do not suggest features unless they directly solve a repeated pain.
Use AI to shrink scope, not inflate it
Founders love asking "what features should my app have?" Models love answering with a thirty-item backlog.
Flip the question. Ask what you can remove and still deliver the first useful outcome. For a client approval tool, the loop might be:
- Agency uploads a deliverable.
- Client gets a private review link.
- Client approves or leaves structured feedback.
- Agency sees status without chasing email.
That is buildable. It is also marketable. Compare that to a generic "project management platform." See the SaaS MVP checklist for how I cut scope on client projects without cutting learning.
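To show how small that loop is, here is a rough sketch of it as a data model. Everything in it is an assumption for illustration: the `Deliverable` class, the three status names, and the 16-byte token length are mine, not code from a shipped product.

```python
import secrets
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    AWAITING_REVIEW = "awaiting_review"
    APPROVED = "approved"
    CHANGES_REQUESTED = "changes_requested"


@dataclass
class Deliverable:
    title: str
    # Unguessable token that backs the private review link.
    review_token: str = field(default_factory=lambda: secrets.token_urlsafe(16))
    status: Status = Status.AWAITING_REVIEW
    feedback: list[str] = field(default_factory=list)

    def approve(self) -> None:
        self._require_open()
        self.status = Status.APPROVED

    def request_changes(self, note: str) -> None:
        self._require_open()
        if not note.strip():
            raise ValueError("feedback needs a note")
        self.feedback.append(note.strip())
        self.status = Status.CHANGES_REQUESTED

    def _require_open(self) -> None:
        # A review can only be answered once in this sketch.
        if self.status is not Status.AWAITING_REVIEW:
            raise ValueError(f"already {self.status.value}")


d = Deliverable("Homepage mockup v2")
d.request_changes("Logo is too small on mobile")
print(d.status.value, d.feedback)
```

One class, three states, two actions. If the first version needs much more than this, the scope has probably started growing again.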
[Figure: Narrow from feature list to one complete workflow loop]
A discovery workflow that survived contact with reality
This is close to what I run on new products:
1. Dump raw notes from calls, chats, reviews.
2. Remove private data.
3. Ask the model to cluster pains and pull exact phrases people use.
4. Turn clusters into jobs-to-be-done, not feature names.
5. Rank by urgency, frequency, and whether someone will pay this quarter.
6. Sketch two or three narrow workflow options.
7. Stop and talk to humans again.
Step seven is non-negotiable. AI organizes evidence. It does not know your market.
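The ranking step is the one place in this workflow where a few lines of code help me stay honest. A minimal sketch, assuming you score each job by hand after reading the clusters; the `Job` fields, the weighting in `score`, and the sample rows are all made up for illustration, not data from real interviews.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    urgency: int         # 1-5, judged from the language people used
    frequency: int       # number of interviews that mentioned it
    will_pay_soon: bool  # did anyone signal budget this quarter?


def score(job: Job) -> int:
    # Hypothetical weighting: urgency times frequency, doubled on a pay signal.
    base = job.urgency * job.frequency
    return base * 2 if job.will_pay_soon else base


def rank(jobs: list[Job]) -> list[Job]:
    return sorted(jobs, key=score, reverse=True)


jobs = [
    Job("Track file versions", urgency=3, frequency=6, will_pay_soon=False),
    Job("Chase clients for approval", urgency=4, frequency=9, will_pay_soon=True),
    Job("Export status reports", urgency=2, frequency=3, will_pay_soon=True),
]
for job in rank(jobs):
    print(score(job), job.name)
```

The numbers are only a way to force a conversation about why one pain outranks another. They do not replace going back to the people who described it.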
[Figure: AI-assisted product discovery pipeline]
Architecture: boring on purpose
Ask a model for "SaaS architecture" and you often get Kubernetes, event buses, and three databases. For most MVPs I have shipped, that is future pain disguised as progress.
My default stack thinking:
What I reach for first
- App structure: modular monolith until traffic or team size forces a split.
- Database: Postgres. Add Redis or search when queries hurt, not before.
- Auth: proven provider or framework. Custom auth is a full-time job.
- Background jobs: queue for email, webhooks, files. Workers when volume demands it.
- Files: S3 or R2. CDN when download volume actually shows up.
Every extra service is maintenance on a small team. I have regretted premature splits more than I have regretted a well-structured monolith.
Code from models still gets a code review
I treat generated code like a fast junior: useful, inconsistent, occasionally dangerous.
Before merge I always re-read auth, authorization, billing, file uploads, webhooks, background jobs, and anything that touches another tenant's data. On multi-tenant SaaS that is where demos die.
What I verify before ship
- Sessions expire. Password reset works. MFA matches what marketing promised.
- Every query is scoped server-side, not just hidden in the UI.
- Stripe webhooks are idempotent and subscription state matches reality.
- Uploads have size limits and access rules, not public URLs by accident.
- Failed jobs surface somewhere you will actually look.
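"Scoped server-side" is easier to review when you know what the query should look like. Here is a minimal sketch using SQLite from the Python standard library; the `projects` table, its columns, and the `get_project` helper are hypothetical, and in a real app `org_id` would come from the authenticated session, never from the request.

```python
import sqlite3


def get_project(db: sqlite3.Connection, org_id: int, project_id: int):
    """Fetch a project only if it belongs to the caller's organization."""
    return db.execute(
        "SELECT id, name FROM projects WHERE id = ? AND org_id = ?",
        (project_id, org_id),
    ).fetchone()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, org_id INTEGER, name TEXT)")
db.executemany(
    "INSERT INTO projects (id, org_id, name) VALUES (?, ?, ?)",
    [(1, 10, "Org A launch"), (2, 20, "Org B rebrand")],
)

print(get_project(db, org_id=10, project_id=1))  # (1, 'Org A launch')
print(get_project(db, org_id=10, project_id=2))  # None: exists, but not theirs
```

When I review generated handlers, a lookup by `id` alone with no tenant column in the `WHERE` clause is the first thing I search for.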
Where AI fails on security (and what I still do by hand)
Models are confident about auth. That is exactly why I do not let them own it.
AI-generated route handlers often look correct until you test IDOR: User A swaps an ID and lands in User B's project. Or the UI hides an admin button but the API still answers yes. Or a webhook handler updates subscription state without checking the signature. The code compiles. The demo works. Production is where it breaks.
I use AI to draft test ideas, threat-model notes, and checklist gaps. I do not use it as the final reviewer for anything that touches authentication, authorization, billing, file access, or tenant boundaries. Those get a manual pass with two test users and two organizations, the same way I describe in the SaaS MVP go-live guide.
Principles I do not outsource:
- Access control on the server, every time a record is read or mutated by ID.
- No trust in client-supplied identity fields (org, role, plan, price).
- Schema validation on bodies, params, and webhooks.
- Safe errors that do not leak stack traces, SQL, or internal paths.
- Signed, idempotent webhooks for payments and lifecycle events.
If a model generated the middleware, I still read the middleware.
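For the webhook principle, this is roughly the shape I expect to find when I read a handler. It is a sketch of a generic HMAC-SHA256 scheme, not any provider's actual format: real payment providers define their own signature headers and usually ship an official verification helper, which I would use instead. The function names and the in-memory `processed_event_ids` set are stand-ins of mine.

```python
import hashlib
import hmac
import json

# Stand-in for a table with a unique constraint on the event ID.
processed_event_ids: set[str] = set()


def verify_signature(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature_hex)


def handle_webhook(secret: bytes, payload: bytes, signature_hex: str) -> str:
    if not verify_signature(secret, payload, signature_hex):
        return "rejected"
    event = json.loads(payload)
    if event["id"] in processed_event_ids:
        return "duplicate"  # safe to acknowledge, nothing is applied twice
    processed_event_ids.add(event["id"])
    # Apply the subscription change here, after both checks pass.
    return "processed"


secret = b"test_secret"
body = json.dumps({"id": "evt_1", "type": "subscription.updated"}).encode()
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()

print(handle_webhook(secret, body, sig))    # processed
print(handle_webhook(secret, body, sig))    # duplicate
print(handle_webhook(secret, body, "bad"))  # rejected
```

The order matters: verify the signature against the raw bytes first, parse second, and only then touch subscription state.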
Rate limits and legal pages: AI will skip both
Ask a model for a "production-ready SaaS" and you will get auth, a database, and a Stripe button. You probably will not get login throttling, export caps, a sensible privacy policy, or terms that match your refund rules.
Rate limiting needs product judgment. Which routes are expensive? What should happen when someone scripts your signup form at 3am? Should limits track IP, account, or organization? AI can suggest numbers. You decide what failure looks like when the limiter is down.
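To make the IP, account, or organization question concrete, here is a small fixed-window limiter. It is a sketch under loud assumptions: it lives in process memory, never evicts old windows, and the `FixedWindowLimiter` class, `limiter_key` helper, and the limit of three signups per minute are invented for the example. A production setup would normally sit on a shared store and needs an explicit decision about what happens when that store is unreachable.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    def __init__(self, limit: int, window_seconds: int, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.counts = defaultdict(int)  # (key, window index) -> hits

    def allow(self, key: str) -> bool:
        bucket = (key, int(self.clock() // self.window))
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


def limiter_key(route: str, account_id, ip: str) -> str:
    # Prefer the account when known so a shared office IP is not punished.
    who = f"acct:{account_id}" if account_id is not None else f"ip:{ip}"
    return f"{route}|{who}"


now = [0.0]  # fake clock so the example is deterministic
signup = FixedWindowLimiter(limit=3, window_seconds=60, clock=lambda: now[0])
key = limiter_key("POST /signup", account_id=None, ip="203.0.113.7")

print([signup.allow(key) for _ in range(5)])  # [True, True, True, False, False]
now[0] = 61.0  # move into the next window
print(signup.allow(key))  # True
```

The key function is where the product judgment lives. The counting is the easy part.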
Legal pages need accuracy more than polish. A model can draft privacy policy boilerplate in seconds. It cannot know your subprocessors, retention rules, or whether you train on customer uploads. Templates are a starting point. Compare them to what the app actually collects and what your payment provider requires. Link policies from signup, footer, and checkout before you charge.
Go-live items I still verify manually
Abuse and access
- Login, signup, and password reset are throttled by more than IP alone, such as per email or account
- Upload, export, and heavy routes have tighter limits than normal reads
- Two test users cannot access each other's data by changing IDs
- Canceled or unpaid accounts are blocked from paid APIs on the server, not just in the client
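The last item on that list is the one generated code gets wrong most quietly, so here is what the check can look like. A minimal sketch, assuming subscription state is stored server-side per organization; the `subscriptions` dict, the state names, and `can_use_paid_api` are hypothetical stand-ins for a real table and middleware.

```python
ACTIVE_STATES = {"active", "trialing"}

# Stand-in for the subscriptions table, keyed by organization ID.
subscriptions = {10: "active", 20: "canceled"}


def can_use_paid_api(session_org_id: int, request_body: dict) -> bool:
    # request_body is ignored on purpose: a client can claim any plan it likes.
    state = subscriptions.get(session_org_id)
    return state in ACTIVE_STATES


print(can_use_paid_api(20, {"plan": "pro"}))  # False: canceled on the server
print(can_use_paid_api(10, {}))               # True
print(can_use_paid_api(99, {"plan": "pro"}))  # False: unknown organization
```

The manual test is the same idea in reverse: cancel a test account, then call the paid endpoint directly and confirm the server refuses.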
Legal and trust
- Privacy policy reflects real data collection and third-party tools
- Terms and refund language match billing behavior
- Cookie notice matches installed analytics or marketing scripts
- Support contact is visible and monitored
UI copy: small words, big difference
Operational SaaS lives or dies on clarity. Hierarchy, status, empty states, error text. Models can draft these. You still need to read them out loud like a tired user at midnight.
Copy I have rewritten after AI drafts
- Weak · Empty state: “Welcome to your dashboard”
- Strong · Empty state: “No approvals yet. Upload a deliverable to start.”
- Weak · Error: “Something went wrong”
- Strong · Error: “Payment failed. Update your card to restore access.”
- Weak · CTA: “Get started”
- Strong · CTA: “Create your first review link”
SEO content without the hollow middle
AI makes it easy to publish ten articles in a day. Google and humans both notice when nothing new was said.
I use models for outlines, keyword clustering, and turning support themes into briefs. The paragraphs come from what we built, what broke, and what customers argued about in calls. If I cannot point to a real decision or mistake, the piece stays in drafts.
How I edit before publish
- Name the reader and the question they typed into Google.
- Outline first. Argue with the outline before writing body text.
- Add inputs only I have: screenshots, metrics, tradeoffs we considered.
- Delete any line that fits a thousand other SaaS posts.
- Add something usable: checklist, diagram, prompt that worked.
- Read aloud. Fix the stiff bits.
Quality pass before anything customer-facing
[Figure: Quality gates before shipping AI-assisted work]
Does this solve one specific workflow? Are permissions enforced on the server? Do rate limits exist on routes someone will abuse? Do privacy and terms match what we actually ship? Would I stake my name on this article?
Where AI earns its keep (and where it does not)
Worth the time vs waste of time
Worth the time:
- Messy notes → structured requirements
- Schema and test drafts for human review
- Support ticket themes by topic
- SEO briefs from real customer questions
- Empty states and onboarding copy drafts
Waste of time:
- Publishing unedited AI blog posts
- Whole products from vague prompts
- Copying competitor feature matrices
- Synthetic personas instead of calls
- Shipping auth, billing, or legal copy without human review
Judgment got more valuable, not less
Software is cheaper to start. That does not make taste free. The products I am proud of, like Zesty and Soma, are narrow on purpose. They do one job cleanly for people who do that job every day.
If you are building something similar and want a second pair of eyes on scope or architecture, get in touch. Happy to talk through what is worth building first.