A pipeline that bottles everything we've learned across your games, one-shots a POC, then sends other agents to playtest & verify it — and only ships when it clears the bar we've already set.
Each stage is a level. Each level has an agent that owns it, a job, and a gate it has to pass before the next unlocks — same as a real game.
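A rough sketch of the "level + owner + job + gate" idea, just to make the shape concrete. Every name here (`Stage`, `run_pipeline`, the agent labels, the toy stages) is hypothetical and not taken from the real factory; the only point is that a failed gate stops the next level from unlocking.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str                     # the "level"
    agent: str                    # who owns it (hypothetical label)
    job: Callable[[dict], dict]   # does the work, returns the updated state
    gate: Callable[[dict], bool]  # must return True before the next level unlocks


def run_pipeline(stages: list[Stage], state: dict) -> tuple[dict, list[str]]:
    """Run stages in order and stop at the first gate that fails."""
    cleared: list[str] = []
    for stage in stages:
        state = stage.job(state)
        if not stage.gate(state):
            break  # next level stays locked
        cleared.append(stage.name)
    return state, cleared


# Toy stages (assumptions for illustration only).
stages = [
    Stage("spec", "spec-agent",
          lambda s: {**s, "spec": "survivor"}, lambda s: "spec" in s),
    Stage("build", "builder-agent",
          lambda s: {**s, "poc": "<html>"}, lambda s: bool(s.get("poc"))),
    Stage("playtest", "playtest-agent",
          lambda s: {**s, "passed": False}, lambda s: s["passed"]),
    Stage("ship", "ship-agent",
          lambda s: {**s, "shipped": True}, lambda s: True),
]

final_state, cleared = run_pipeline(stages, {})
print(cleared)  # ['spec', 'build']
```

Here the playtest gate fails on purpose, so the ship stage never runs.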
window.__t rAF chains, served-Playwright screenshots. Runs scripted playthroughs (economy, progression, state machine, structure) and snaps every screen.
Not invented: pulled straight from the brain. These are the boss requirements every POC has to clear.
Agents can prove a game's logic (the harness) and its structure (screenshots) — but headless WebGL is unreliable and a screenshot can't tell you if it's fun. So the factory's honest job isn't "make fun automatically." It's to get every POC to a structurally-perfect, on-standard state cheaply — so your taste is spent only on the bit that's actually yours: the feel.
Business-Don order: prove the riskiest assumption for the least money, before building the whole machine.
Riskiest assumption = "an agent can one-shot a POC that passes the structural gate." Hand one agent the spec + one simple genre, one-shot it, run it through the existing harness + screenshot-judge. If it can't clear the bar once, the factory's moot.
cheapest test of the riskiest bet
Distill the brain into a concrete checklist a judge can score. Small and sharp beats big and woolly.
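One way a "checklist a judge can score" might look, as a minimal sketch. The four check ids reuse the playthrough areas named earlier (economy, progression, state machine, structure); the requirement wording, the `blocking` flag, and the `score` function are assumptions made up for illustration, not the real spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    id: str
    requirement: str       # placeholder wording, not the real standard
    blocking: bool = True  # assumption: a failed blocking check fails the gate


CHECKLIST = [
    Check("economy", "Currency never goes negative in the scripted run"),
    Check("progression", "Each unlock is reachable in the scripted run"),
    Check("state_machine", "Every screen has a way in and a way out"),
    Check("structure", "All required screens appear in the screenshots"),
]


def score(checklist: list[Check], verdicts: dict[str, bool]) -> dict:
    """Turn per-check verdicts into one pass/fail plus notes on what failed."""
    # A check with no verdict counts as failed, so gaps cannot sneak through.
    failed = [c.id for c in checklist if not verdicts.get(c.id, False)]
    blocking_failed = [c.id for c in checklist if c.blocking and c.id in failed]
    return {
        "passed": not blocking_failed,
        "failed": failed,
        "score": (len(checklist) - len(failed)) / len(checklist),
    }


result = score(CHECKLIST, {
    "economy": True,
    "progression": True,
    "state_machine": False,
    "structure": True,
})
print(result)
```

Keeping the list this short is the point: each item is something a judge can answer yes or no.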
Bundle the pieces you already have — Node DOM-stub + served-Playwright screenshot + the LLM-judge prompt — into one reusable module.
Fail → notes → regenerate → re-judge, until clean. Prove the loop on the genre you've already nailed (survivor / tunnel-run lineage).
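The fail → notes → regenerate → re-judge loop, sketched with stand-in functions. `generate` and `judge` are hypothetical callables (the real ones would be the builder agent and the harness plus screenshot-judge), and the `max_rounds` cap is an added assumption so a stuck POC cannot loop forever.

```python
from typing import Callable, Optional


def regenerate_until_clean(
    generate: Callable[[str, list[str]], dict],
    judge: Callable[[dict], list[str]],
    spec: str,
    max_rounds: int = 5,
) -> dict:
    """Regenerate a POC, feeding judge notes back in, until the judge is clean."""
    notes: list[str] = []
    history: list[list[str]] = []
    poc: Optional[dict] = None
    for round_no in range(1, max_rounds + 1):
        poc = generate(spec, notes)   # regenerate with the latest notes
        failures = judge(poc)         # re-judge
        history.append(failures)
        if not failures:              # clean: stop here
            return {"poc": poc, "rounds": round_no, "clean": True, "history": history}
        notes = failures              # fail -> notes for the next round
    return {"poc": poc, "rounds": max_rounds, "clean": False, "history": history}


# Toy stand-ins: the fake generator "fixes" whatever the notes mention.
KNOWN_BUGS = {"economy goes negative", "missing game-over screen"}


def fake_generate(spec: str, notes: list[str]) -> dict:
    return {"spec": spec, "bugs": KNOWN_BUGS - set(notes)}


def fake_judge(poc: dict) -> list[str]:
    return sorted(poc["bugs"])


result = regenerate_until_clean(fake_generate, fake_judge, "survivor")
print(result["clean"], result["rounds"])  # True 2
```

When the cap is hit without a clean pass, `clean` comes back `False` and the history of notes is still there to read.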
Fan out across genres on free coins; each ship writes its learnings home so the estate compounds.
Never automate the fun check. The factory delivers POCs to your bar; you decide what's actually good.