gstack: Garry Tan Just Open-Sourced His Personal Claude Code Setup

By Prahlad Menon

Updated March 2026: gstack has grown to 37,000+ stars and added /office-hours — the new first step in the pipeline. Coverage has reached Vox. The core skills and workflow below are unchanged.


Garry Tan ships code. That’s the first thing to know. The YC CEO has talked publicly about using Claude Code daily — and now he’s released the exact setup he uses.

gstack is a set of opinionated Claude Code skills that turn a solo developer into a virtual tech company. Not metaphorically — each skill literally switches Claude into a different role with a different job and a different lens on your work.

The problem it solves

Claude Code, used naively, takes your request literally. You say “add photo upload,” it adds photo upload. You say “review my PR,” you get inconsistent depth. You say “ship this,” you get a conversation about what to ship.

The agent is also half-blind — it can write code but can’t see your running app, can’t click through flows, can’t catch layout breaks. And you’re still doing QA by hand.

gstack fixes all of this with a set of slash commands — and a sprint pipeline where each step feeds directly into the next.

The full sprint pipeline

The key insight that most people miss: gstack is a pipeline, not a menu of tools. Each step reads what the previous step wrote. Nothing falls through because nothing is repeated from scratch.

/office-hours      → Challenge the idea. Write the design doc.
/plan-ceo-review   → Is this the right product to build?
/plan-eng-review   → Architecture, diagrams, edge cases, test matrix.
[implement]        → Write the actual code.
/review            → Paranoid staff engineer finds what CI misses.
/qa                → Browser clicks through your app, finds bugs.
/ship              → Sync, test, resolve reviews, push PR.

The design doc written by /office-hours is automatically read by /plan-ceo-review. The engineering plan from /plan-eng-review is read by /review. Every step has context from every step before it.
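To make the hand-off idea concrete, here is a minimal sketch of a pipeline where every step reads what earlier steps saved. It is not gstack's implementation: the `run_step` helper, the one-file-per-step layout, and the file names are all assumptions made up for illustration.

```python
from pathlib import Path
import tempfile

# Step order taken from the pipeline listing above. "[implement]" is left out
# because it is manual work, not a skill. Everything else here is hypothetical.
STEPS = ["office-hours", "plan-ceo-review", "plan-eng-review", "review", "qa", "ship"]


def run_step(name: str, workdir: Path, produce) -> str:
    """Run one step: hand it every earlier artifact, then save its own output."""
    earlier = STEPS[: STEPS.index(name)]
    context = {
        step: (workdir / f"{step}.md").read_text()
        for step in earlier
        if (workdir / f"{step}.md").exists()
    }
    output = produce(context)
    (workdir / f"{name}.md").write_text(output)
    return output


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        run_step("office-hours", workdir, lambda ctx: "Design doc: seller photo upload")
        # The CEO review is never told the idea again; it reads the saved doc.
        return run_step(
            "plan-ceo-review",
            workdir,
            lambda ctx: "Reviewed -> " + ctx["office-hours"],
        )
```

The point of the sketch is the `context` dictionary: a later step starts from files on disk rather than from whatever you remember to retype.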

/office-hours — the step most people skip

This is the newest and most underrated skill. It runs before any planning or code — a hard gate that forces you to understand the problem before proposing solutions.

Two modes:

Startup mode — six forcing questions from a YC partner:

  • Demand reality: who is desperate for this today?
  • Status quo: what do they do without you?
  • Desperate specificity: the most specific possible customer
  • Narrowest wedge: what’s the smallest thing that proves demand?
  • Observation: what have you seen that others haven’t?
  • Future-fit: why is now the right time?

Builder mode — brainstorming for side projects, hackathons, open source. Collaborative, not interrogative.

Both modes end the same way: a design doc saved to disk. Hard gate: /office-hours produces a document, never code. The subsequent skills in the pipeline read that document automatically.
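One way to picture the hard gate in startup mode: the output is always a document, and no document exists until all six questions have an answer. The sketch below is an assumption about the shape of that rule, not code from gstack; `write_design_doc` and the doc layout are hypothetical.

```python
# Question labels copied from the startup-mode list above.
FORCING_QUESTIONS = [
    "Demand reality",
    "Status quo",
    "Desperate specificity",
    "Narrowest wedge",
    "Observation",
    "Future-fit",
]


def write_design_doc(idea: str, answers: dict[str, str]) -> str:
    """Return a plain-text design doc, or raise if any question is unanswered."""
    missing = [q for q in FORCING_QUESTIONS if not answers.get(q, "").strip()]
    if missing:
        # The gate: no partial doc, and never any code.
        raise ValueError("Unanswered: " + ", ".join(missing))
    lines = [f"Design doc: {idea}", ""]
    for question in FORCING_QUESTIONS:
        lines.append(f"{question}: {answers[question].strip()}")
    return "\n".join(lines)
```

Calling it with only "Demand reality" filled in raises a `ValueError` naming the other five questions, which is roughly the experience the skill is going for.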

A widely shared thread on X described it well: /office-hours didn’t stop after six questions. It challenged the framing, identified the wrong problem, generated three implementation approaches with effort estimates, and wrote the design doc. That’s the skill working as designed.

The full skill table

Skill                   Role                     What it actually does
/office-hours           YC partner               Challenges idea, 6 forcing questions, writes design doc
/plan-ceo-review        Founder / CEO            Challenges whether you’re building the right thing
/plan-eng-review        Eng manager              Architecture, data flow, diagrams, edge cases, test matrix
/review                 Paranoid staff engineer  Finds bugs that pass CI but blow up in production
/ship                   Release engineer         Sync main, run tests, resolve reviews, push, open PR
/browse                 QA engineer              Logs in, clicks your app, takes screenshots, catches breakage
/qa                     QA lead                  Analyzes diff, finds affected pages, runs systematic tests
/setup-browser-cookies  Session manager          Imports real browser cookies for authenticated testing
/retro                  Eng manager              Team retro with per-contributor praise and growth notes
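The /qa row compresses a useful idea: start from the diff, work out which pages it can affect, and test those. Here is a rough sketch of that first mapping step. The `ROUTE_MAP` table, the paths, and the routes are invented for the example; how gstack actually derives affected pages is not described here.

```python
# Hypothetical mapping from source-path prefixes to the pages they render.
ROUTE_MAP = {
    "app/listings/": ["/listings", "/listings/new"],
    "app/uploads/": ["/listings/new"],
    "app/account/": ["/settings"],
}


def affected_pages(changed_files: list[str]) -> list[str]:
    """Return the de-duplicated, sorted pages touched by a set of changed files."""
    pages = set()
    for path in changed_files:
        for prefix, routes in ROUTE_MAP.items():
            if path.startswith(prefix):
                pages.update(routes)
    return sorted(pages)
```

A change to `app/uploads/photo.py` would narrow the QA pass to `/listings/new`, while a README edit maps to no pages at all.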

The workflow in practice

The README includes a real example worth reading in full:

You: /office-hours
     I want to add seller photo upload to the listing app.

Claude: Who is desperate for photo upload today? What do they do without it?
        [6 questions, 3 implementation approaches, effort estimates]
        → Saves design doc to .gstack/design-docs/photo-upload.md

You: /plan-ceo-review

Claude: "Photo upload" is not the feature. The real job is helping sellers
        create listings that actually sell. Here's the 10-star version:
        auto-identify the product from the photo, pull specs and pricing
        comps from the web, draft the title and description, suggest the
        best hero image...

You: /plan-eng-review

Claude: [Architecture diagram, state machine for upload → classify → enrich
        → draft pipeline, async job boundaries, failure modes, test matrix]

You: [implement the plan]

You: /review

Claude: Race condition: two tabs can overwrite cover-photo selection.
        Trust boundary: pulling web data into draft generation creates
        a prompt injection vector. Missing: orphan cleanup for failed uploads.

You: [fix issues]

You: /ship
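The race condition flagged by /review in that transcript is a classic lost update: two tabs load the same listing, and whichever saves last silently wins. One common fix is a version check on write. The sketch below is a generic illustration, not the fix from the README; `Listing`, `set_cover_photo`, and `StaleWriteError` are hypothetical names.

```python
from dataclasses import dataclass


class StaleWriteError(Exception):
    """Raised when a tab tries to save over a newer version."""


@dataclass
class Listing:
    cover_photo: str
    version: int = 0


def set_cover_photo(listing: Listing, photo: str, seen_version: int) -> None:
    """Apply the change only if the caller saw the latest version."""
    if seen_version != listing.version:
        raise StaleWriteError(
            f"listing is at version {listing.version}, tab saw {seen_version}"
        )
    listing.cover_photo = photo
    listing.version += 1
```

If both tabs loaded version 0, the first save succeeds and bumps the version to 1, and the second save is rejected instead of overwriting it. In a real app the check and the write have to happen atomically, for example in a single conditional database update.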

The browser skills are underrated

/browse and /qa are the ones most people will overlook but shouldn’t. They give Claude actual vision into your running app — a headless browser that navigates, clicks, screenshots, and reports. /setup-browser-cookies imports your real session from Chrome, Arc, Brave, or Edge so it can test authenticated flows without manual login.
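For a feel of what importing a session involves, here is a simplified sketch of the last step: given cookies already exported from a browser, pick the ones that apply to the page under test and build a `Cookie` header. The cookie dictionary shape and the `cookie_header` helper are assumptions for illustration, and the matching rules are deliberately looser than what real browsers implement. This is not how /setup-browser-cookies is written.

```python
from urllib.parse import urlparse


def cookie_header(cookies: list[dict], url: str) -> str:
    """Build a Cookie header from the exported cookies that apply to url."""
    parts = urlparse(url)
    host = parts.hostname or ""
    path = parts.path or "/"
    matching = []
    for cookie in cookies:
        # A leading dot on the domain means "this host and its subdomains".
        domain = cookie["domain"].lstrip(".")
        domain_ok = host == domain or host.endswith("." + domain)
        # Simplified prefix match on the path.
        path_ok = path.startswith(cookie.get("path", "/"))
        # Secure cookies are only sent over https.
        secure_ok = parts.scheme == "https" or not cookie.get("secure", False)
        if domain_ok and path_ok and secure_ok:
            matching.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(matching)
```

The secure flag check matters in practice: a session cookie marked secure will not be sent to an `http://` URL, which is a common reason an authenticated test run looks logged out.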

A full QA pass takes about 60 seconds. That’s not a formal benchmark, just a headless browser clicking through your app faster than you can.

Honest take

The /office-hours forcing function is the most valuable piece in the full pipeline. The failure mode for AI-assisted coding isn’t bad code — it’s building the wrong thing efficiently. Having a structured “understand the problem before touching the keyboard” step baked in as a slash command is the right idea. Most developers skip it. A command that makes skipping it feel unnatural is worth more than any code review tool.

The rest — review, ship, QA — are solid quality-of-life upgrades. The pipeline structure (each step reading what came before) is what makes them work as a system rather than isolated tools.

Most people will bookmark this. Almost nobody will install it. The ones who do will build differently.