How I built a 13-case-study portfolio with Claude — without writing the case studies
The skill files, the subagent loop, the Magic UI primitives, the cost math, and the embarrassing pitfalls. A long-form note on what actually shipped, written by the designer who shipped it.
There is a specific feeling that designers get on the third round of handoff feedback. The PM has changed the copy. The engineer has interpreted the spacing wrong. You re-open Figma, you nudge the elements four pixels, you re-export, you write three Slack messages. Three days have passed. Nothing has shipped. The design has not improved. The product has not improved. You have produced exactly nothing that a customer will ever see.
I had this feeling for ten years. In 2024, I decided to stop having it.
This post is about what happened next — building a 13-case-study portfolio site (the one you're reading this on) end-to-end with Claude Code as the engineering partner. I wrote almost no code by hand. I also wrote almost no case studies by hand. The thing that did take effort was something that didn't exist when I started: skill files.
Skill files turn out to be the actual leverage. Not the AI. Not the framework. The encoded judgment. This post is about how to build them.
What this post is, and what it isn't
This isn't a tutorial. There are enough "build a Next.js site with AI" tutorials. This is a structural report on a specific project — what shipped, what didn't, what cost more than I expected, and what an engineering-aware designer can actually pull off in 2026 with the current Claude tools.
If you're a designer who can't yet code: the answer is yes, you can ship this, and the pitfalls section near the end is the part you'll want to read twice.
If you're an engineer who thinks designers can't ship without you: the case study count is 13. The hand-written code is roughly zero. The deploy is automated. Reflect on what your competitive moat actually is.
If you're hiring: scroll to the closing.
The decision
I had three options in late 2024.
Option one — keep designing in Figma, keep handing off, keep waiting. The familiar trap. Predictable cost, predictable output, predictable feeling that none of this is mine.
Option two — hire a freelance engineer to build the portfolio site. Fast, but the case studies would be locked into whatever the engineer interpreted from my designs. And the moment I wanted to add a new case study, I would be back in the handoff loop.
Option three — learn to ship code myself, with an AI partner that can fill in the parts I don't know. The site is mine. The case study format is mine. Every new case study is a file I write, not a project I commission.
I chose option three. The cost was real — I had to learn TypeScript properly, I had to learn React composition patterns, I had to learn how Next.js routing actually works. But the leverage was permanent.
The tech stack — and the boring reasoning behind every pick
People ask about the stack first, so let's get it out of the way:
- Next.js 16 with webpack (not Turbopack — I'll explain)
- React 19
- Tailwind CSS 4
- MDX for case study and blog content
- Magic UI + shadcn/ui for primitives
- Vercel for deploy
- Claude Code for engineering, Cursor for occasional manual edits
The interesting picks are not the obvious ones.
Why webpack and not Turbopack
Turbopack is faster. Everyone knows. I still chose webpack. Two reasons that took me a day each to debug:
- Turbopack 16.2 has a CSS bug with arbitrary-value Tailwind classes like bg-[linear-gradient(40deg,var(--x),var(--y))]. During the PostCSS transform it emits Unicode replacement characters into the compiled CSS. Webpack does not. I lost an afternoon to a 500 error before I figured this out.
- iCloud Drive plus atomic renames. This project lives under ~/Documents/, which iCloud syncs. Turbopack writes its build manifest using atomic-rename operations that iCloud's file provider intercepts mid-write, so the dev server won't start. Webpack uses a different write strategy and is unaffected by iCloud.
I recorded both in AGENTS.md so Claude Code never tries to switch the dev script back to Turbopack on autopilot. This is the kind of thing skill files are for.
Why MDX
Every case study has a frontmatter file in content/work/<slug>.mdx and a section component file in components/mdx/<slug>-sections.tsx. The MDX body is just a list of section tags:
<LeucineHero />
<LeucineBackground />
<LeucineProblem />
<LeucineDecision />
<LeucineBuild />
<LeucineResults />
<LeucineDeepDive />
<LeucineClosing />
The actual content — words, layout, hover states, animations — lives in the React components. This split matters because:
- Frontmatter is what the work index, the home page Featured Work, and the navigation arrows need. It's structured data: title, role, duration, metrics, cover image, year. I never have to crack open a 1,200-line component to read those.
- Section components are where the typography, the visual chapters, and the Magic UI animations live. Each case study gets to look slightly different inside this contract.
Adding a new case study is roughly: copy the closest existing pair, fill in the new content, register the section exports in the route's component map. That last step is mechanical enough that the case-study skill describes it in a single paragraph.
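If you want the shape of that contract in code, here is a minimal sketch. The type and map names (CaseStudyMeta, sectionMap, sectionsFor) are illustrative, not the repo's real identifiers:

```typescript
// Sketch of the frontmatter/component-map split described above.
// Names are illustrative, not the repo's actual identifiers.

interface CaseStudyMeta {
  title: string;
  role: string;
  duration: string;
  year: number;
  metrics: string[];
  cover: string; // path to the cover image
}

// The route's component map: slug -> ordered list of section component names.
// Registering a new case study is one new entry here.
const sectionMap: Record<string, string[]> = {
  leucine: [
    "LeucineHero",
    "LeucineBackground",
    "LeucineProblem",
    "LeucineDecision",
    "LeucineBuild",
    "LeucineResults",
    "LeucineDeepDive",
    "LeucineClosing",
  ],
};

// Look up the sections for a slug, failing loudly on an unregistered one.
function sectionsFor(slug: string): string[] {
  const sections = sectionMap[slug];
  if (!sections) throw new Error(`No section map entry for slug: ${slug}`);
  return sections;
}
```

Failing loudly on an unregistered slug is the point: a typo in the map surfaces at build time, not as a silently empty case study page.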
Why Magic UI, sparingly
Magic UI is gorgeous. It's also a trap. If you import every primitive into every case study, the site reads like a CSS demo, not a portfolio. The rule I landed on, written into the case-study skill, is:
Magic UI primitives only earn their weight when the content is the primitive. A NumberTicker belongs on a Results metric grid where the number is the proof. A Terminal animation belongs on a chapter where the engineer prompting Claude is the chapter's argument. An AnimatedBeam belongs on a handoff-loop diagram where the loop is the failure mode. Anywhere else, they are decoration.
By case study eight I was deleting more Magic UI than I was adding.
The actual leverage: skill files
This is the part that took me a year to understand.
When I started building case studies with Claude, every one had inconsistent widths, inconsistent eyebrow colors, inconsistent voice. I would fix the latest one. The next one would have new versions of the same problems. I'd fix those. The one after would regress on the ones I'd just fixed. Claude was not the issue. I was the issue. I was carrying the rules in my head and surfacing them inconsistently across sessions.
The fix was a skill file: a Markdown document, sitting at .claude/skills/case-study/SKILL.md, that defines every rule that matters. Width canon (780px hero anchor, 680px section headline, body unconstrained). Eyebrow color (emerald #34d399, only on section/chapter eyebrows, never on sub-cell labels). Voice rules (no fabricated punchlines, no "we" when "I" is the truth, fact-based always). Section order. Visual chapter pattern. When to use HighlightCard and when not to.
The skill file is loaded into context whenever Claude works on a case study. Not because I remember to load it — because the skill activation rules pull it in automatically when the file paths involve content/work/** or components/mdx/**.
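For the curious, the activation idea reduces to a path match. This is a simplified sketch of the concept, not Claude Code's actual matching logic, and the trigger syntax here (prefix plus /**) is a toy version of real glob matching:

```typescript
// Minimal sketch of path-based skill activation: a skill declares glob-like
// triggers, and any touched file matching one pulls the skill into context.

const caseStudyTriggers = ["content/work/**", "components/mdx/**"];

function matchesTrigger(filePath: string, trigger: string): boolean {
  if (trigger.endsWith("/**")) {
    // Strip the "**" but keep the trailing slash, then prefix-match.
    return filePath.startsWith(trigger.slice(0, -2));
  }
  return filePath === trigger;
}

function skillActivates(touchedFiles: string[]): boolean {
  return touchedFiles.some((f) =>
    caseStudyTriggers.some((t) => matchesTrigger(f, t))
  );
}
```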
There is a second skill file, portfolio-interview-audit, that does something different. It treats one case study as if it were being read by five different interviewers — Recruiter, Hiring Manager, Design Lead, XFN (PM + Eng), CPO/CEO — and surfaces the questions a real interview loop would ask, plus a fix list. I run it before every interview to make the page hostile-proof.
These are not prompts. Prompts are ephemeral. Skills are durable. They live in the repo, version with the code, evolve as I find new pitfalls, and apply automatically to every future case study without me having to remember anything.
If you take one thing from this post: the leverage is not the model. The leverage is the encoded judgment.
Subagent-driven development — the pattern I actually use
When I rewrite a case study, I don't do it as a single conversation. The conversation gets too long, context gets diluted, and I end up with a 4,000-line MDX file that nobody — including Claude — can hold in their head.
I do it through subagents. The pattern is in another skill file, superpowers:subagent-driven-development, and the cycle goes like this:
- Plan. I write a structured implementation plan as a Markdown file with 16–20 numbered tasks. Each task is bite-sized: "create this file with this exact code, run this exact command, commit with this exact message." Each task is something a fresh subagent could do in 5–10 minutes without prior context.
- Per-task loop. I dispatch a fresh implementer subagent with the full task text. It writes the code, runs the typecheck, commits. Then a spec-compliance reviewer subagent reads the implementation against the plan. Then a code-quality reviewer subagent reads it for craft. Issues get fixed. Re-reviewed. Then I move on.
- Final review. After all tasks land, a holistic reviewer reads the whole thing through the case-study skill canon and flags anything that drifted.
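To make "bite-sized" concrete, one task in one of those plan files looks roughly like this. The task text below is invented for illustration, not copied from a real plan:

```markdown
## Task 7: Add the Results section

- Edit components/mdx/leucine-sections.tsx: export LeucineResults
  with the metric grid (exact JSX below).
- Run the typecheck and confirm zero errors.
- Commit with message: feat(leucine): add Results section
```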
The Leucine Design System case study went from "approve the spec" to "shipped to main" in 19 atomic commits, executed by subagents. The IU Invoice case study did the same in 16. I did not write a single line of TypeScript by hand for either of them.
The cost of this pattern is more model calls. The benefit is that fresh context per task means each subagent stays focused, the work doesn't drift, and when something goes wrong — and it does — the failure is isolated to one task and easy to fix without redoing the whole case study.
What earns its place — Magic UI primitives I actually use
After 13 case studies, here's what survived the editing pass.
NumberTicker
On Results sections only. The number is the proof — 51 customers, 370 plants, 92% reduction, 5 days, 7 hours. The animation lasts about 1.5 seconds and lands once on scroll. Anywhere else, a static number is faster to read and just as informative. I removed NumberTicker from three places early on.
Terminal + AnimatedSpan + TypingAnimation
I use this exactly twice across the whole portfolio. Once in the IU Invoice deep dive, where the chapter is literally about Claude generating a working invoice tool from a prompt — and the Terminal is the literal artifact of that. Once in the Leucine Design System case study, where the methodology punchline is "the design system became the designer" — and the Terminal animation shows an engineer prompting, Claude reading the skill file, and a settings page rendering. In both cases, the primitive is the argument the chapter is making. Anywhere else it would be flair.
AnimatedBeam
Once. The IU Invoice handoff-loop diagram, where the loop is the failure mode being described. Tried it in two other places and removed both — they read as motion for motion's sake.
TextGenerateEffect
On Closing sections. Specifically the headline that wraps up the case study's thesis. The reveal is short, the words are the kicker, and the whole effect feels like the page taking a breath before it ends. Used in 9 of 13 case studies. Removed from the four where the closing was a list rather than a sentence.
What I removed
Background gradients, particle systems, glowing borders, animated accent lines on hover. Every one of these felt sophisticated when I added it and felt like noise three case studies later. The site is faster and more legible without them.
The width canon — the rule that took me five case studies to internalize
CLEEN was the first case study I wrote. MES followed. I copied CLEEN's structure into MES because consistency. By case study three, I was eyeballing widths and tweaking them per case study. By case study six, the IU Invoice rewrite had 31 different max-w-* values across one file. The user — me — would scroll the site and notice that some headlines wrapped at 580px, some at 640px, some at 680px, some at 780px, some had no max-width at all. It looked like five different designers had touched it.
The fix was a single rule, written into the case-study skill, that I now treat as inviolable:
| Element | Width |
|---|---|
| Hero anchor sentence | max-w-[780px] |
| Section headline (h2 in eyebrow + headline + body pattern) | max-w-[680px] |
| Chapter headline (Deep Dive Layer 3) | max-w-[680px] (or wide prop to allow single-line headlines) |
| HighlightCard body | max-w-[680px] |
| Closing TextGenerateEffect words | max-w-[680px] |
| Body paragraphs | unconstrained — let them flow to the article container (max-w-5xl 2xl:max-w-[1200px]) |
| ScreenImage caption | max-w-[680px] (handled inside the helper) |
That's the entire rule. Two widths. The body is unconstrained. Everything else is one of the two. After I wrote this down, the IU Invoice rewrite went from 31 max-width declarations to 21, and all 21 are 780 or 680.
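Written as code, the whole canon fits in one small function. The element names and the helper are illustrative; the real rule lives in the skill file as prose and a table:

```typescript
// The width canon as a lookup: two widths, plus an unconstrained body.

type CanonElement =
  | "hero-anchor"
  | "section-headline"
  | "chapter-headline"
  | "highlight-card-body"
  | "closing-words"
  | "screen-image-caption"
  | "body-paragraph";

function maxWidthClass(el: CanonElement): string | null {
  switch (el) {
    case "hero-anchor":
      return "max-w-[780px]";
    case "body-paragraph":
      return null; // unconstrained: flows to the article container
    default:
      return "max-w-[680px]"; // everything else is the second width
  }
}
```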
Automations that earn their keep
A few utility patterns that took five minutes to set up and pay back forever.
Image conversion with sips
Every case study has a cover image plus 5–15 inline screenshots. I capture them at full resolution (often 2400px wide), then I batch-convert to JPEG at quality 85, max-dim 2400. macOS ships with sips, which does this in one line:
sips -s format jpeg -s formatOptions 85 -Z 2400 source.png --out dest.jpg
When I migrated the Leucine Design System covers, that was a 7-line shell script in the implementation plan. The implementer subagent ran it, verified the output sizes, and committed. I never touched a file.
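Ported to the TypeScript the rest of the repo speaks, that batch script is just a command builder plus a loop. The file names are illustrative, and sips itself is macOS-only:

```typescript
// Build one sips invocation per source PNG, mirroring the one-liner above.
// A real script would pass these to child_process; here we just build strings.

function sipsCommand(src: string): string {
  const dest = src.replace(/\.png$/, ".jpg");
  return `sips -s format jpeg -s formatOptions 85 -Z 2400 ${src} --out ${dest}`;
}

function batchCommands(sources: string[]): string[] {
  return sources.map(sipsCommand);
}
```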
Headless Chrome screenshots
When I rewrote the Leucine Design System case study, the source was the live docs site at leucine-design-system.vercel.app. I needed 21 screenshots of different pages. Headless Chrome does this in one command per page:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
--headless \
--window-size=1440,900 \
--virtual-time-budget=6000 \
--screenshot=cover.png \
https://leucine-design-system.vercel.app/
The --virtual-time-budget=6000 is the magic flag — it gives the page 6 seconds of simulated time to fire animations and lazy-load images before the screenshot fires. Without it, you get half-loaded covers.
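The 21-screenshot batch is the same command in a loop. As a sketch, here is the argument builder; the Chrome path matches the invocation above, and the function name is mine:

```typescript
// Build the headless-Chrome argument list for one page screenshot.
// A batch run maps this over a URL list and spawns Chrome per page.

const CHROME =
  "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome";

function screenshotArgs(url: string, outFile: string): string[] {
  return [
    "--headless",
    "--window-size=1440,900",
    "--virtual-time-budget=6000", // 6s of simulated time for animations and lazy images
    `--screenshot=${outFile}`,
    url,
  ];
}
```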
Hooks
The repo has session hooks that gate the dev server start, run the typecheck after every Claude commit, and prevent the wrong package manager from ever being invoked. None of these are visible in the UI. All of them prevent specific failures I have personally lived through.
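For reference, a hook of that shape lives in Claude Code's settings JSON. The event name below follows the documented hooks schema, but the matcher and the script path are illustrative, not my actual hooks:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "./scripts/typecheck-if-commit.sh" }
        ]
      }
    ]
  }
}
```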
The honest pitfalls — what cost me time
This is the part of the post that nobody else writes. Here are the four mistakes that ate the most hours.
1. Fabricated quotes
Early Claude sessions wanted to write punchy "highlight cards" in my voice. Lines like "5 days isn't the headline. The headline: I knew exactly what to build because I'd lived inside the failure modes for 4 years." They sound like me. I did not say them. After I shipped the Leucine Design System rewrite, my own honest re-read found nine of these scattered through the deep-dive chapters. Every one had to come out, because case studies on this portfolio are fact-based, and a recruiter who quotes one back at me deserves to hear "I actually said that" — not a hedge.
The fix was a rule in the case-study skill that's now the loudest sentence in the file: no fabricated punchlines, no put-words-in-the-author's-mouth highlight cards, ever. If a HighlightCard is going to ship, the words inside it must be words I actually said in the source material I gave Claude.
2. Width inconsistency
Already covered. The reason it took five case studies to fix was that each violation was small, and I kept fixing them locally instead of writing the rule down. The minute the rule lived in the skill file, it stopped happening.
3. Cover aspect ratios
The Leucine Design System cover was a full-page screenshot at 1920×2400 — portrait. Inside the HeroCover container, which uses w-full h-auto, the green padded container stretched vertically to match the image. The hero looked narrow next to CLEEN and MES, which both have 1.6 aspect-ratio landscape covers. The fix was to re-capture at 1440×900 (landscape) using headless Chrome. Five minutes once I knew the diagnosis. An hour of "why does this look weird" before that.
4. Component placement in Layer 2 vs Layer 3
The case-study skill defines a three-layer narrative model — hero (Layer 1), scrollable body (Layer 2), and Deep Dive (Layer 3). The portfolio-interview-audit skill flags content that is buried in Layer 3 when the persona will only read Layer 2. The most common audit finding across my case studies is "great evidence, wrong layer." Recruiter never gets to chapter 6. Hiring manager rarely makes it past the Build section.
The fix is a one-line teaser in Layer 2 with a "→ see chapter N" anchor to the deep version. The instinct is to surface every honest detail at every layer. Don't. Layer the depth.
The cost — and why model selection matters
People ask what this costs in API spend. The answer depends entirely on which model you pick for which task.
The Anthropic prompt cache has a five-minute TTL. Every time more than 300 seconds pass without the cached prefix being reused, the next call re-reads the full conversation context at full price. That's slower and more expensive. So when I'm orchestrating a 19-task subagent loop, I pick model size by task complexity:
- Mechanical tasks (asset migration, frontmatter rewrites, lint cleanup, simple edits) → Claude Haiku 4.5. Fast, cheap, perfectly capable for tasks where the spec is exact and the operation is one-shot.
- Integration tasks (fill in a section, append a chapter, refactor an import) → Claude Sonnet 4.6. Standard quality, standard cost, handles multi-file coordination without drifting.
- Architecture and creative tasks (write the blog post, design the spec, do the holistic review, audit the portfolio) → Claude Opus 4.7. Most capable, most expensive — but you only call it when judgment is the bottleneck.
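The selection rule is simple enough to write down as a lookup. The three task categories are the ones above; the model identifiers are shortened placeholders, not exact API model IDs:

```typescript
// Model selection by task complexity, as described above.
// Identifier strings are placeholders, not real API model IDs.

type TaskKind = "mechanical" | "integration" | "architecture";

const modelByTask: Record<TaskKind, string> = {
  mechanical: "claude-haiku",   // exact spec, one-shot operations
  integration: "claude-sonnet", // multi-file coordination
  architecture: "claude-opus",  // judgment is the bottleneck
};

function modelFor(task: TaskKind): string {
  return modelByTask[task];
}
```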
The Leucine Design System rewrite (19 tasks) cost roughly $4 in API spend, because the heavy creative work was in the spec (Opus, one call) and most implementer subagents were Haiku. The whole portfolio — 13 case studies, the home page rewrite, the about page rewrite, this blog post — has cost less than a single freelance engineer afternoon, total.
The cost of the skill files is one-time. The cost of every new case study after that is roughly an hour of my time and a few dollars of compute.
What this means if you're a designer
You can ship. Your taste is the constraint. Your judgment is the constraint. The thing that used to require a freelance engineer or a four-week handoff cycle is now a Claude session and a skill file.
This is not the same as "anyone can do this." You still have to:
- Know what good looks like, because Claude will produce a competent average if you don't push it
- Know what's wrong with what shipped, because the audit is on you, not the model
- Write the rules down, because the model can't read your mind
- Use the right tool for the right task, because Opus on a frontmatter rewrite is wasted spend
But the gap between "I have a vision for a portfolio" and "the portfolio is on the internet" used to be measured in months. It's now measured in days. For me, this entire portfolio — the 13 case studies, the design system, the home page, this blog post — has been roughly six months of evening and weekend work, with the heavy lifting compressed into the last two months once I figured out the skill-file pattern.
What this means if you're hiring
The portfolio you're reading is the artifact of someone who can:
- Hold a six-year design narrative across four enterprise products and ship a working invoice tool in seven hours
- Recognize when AI is the leverage and when it's the noise — the discipline of the deletion pass is the part most people skip
- Write durable artifacts (skill files, plans, audits) that survive past the conversation that produced them
- Operate the full stack, end-to-end, from spec to deploy, alone — without reducing the design quality to do it
If that sounds like the shape of role you're hiring for, the contact page is two clicks away. If the rest of this post made you nervous about whether your design org needs to evolve, it probably does.
What's next
I am writing more of these. Topics in the queue, in roughly the order I think I'll publish:
- The case-study skill, in full — rules, examples, anti-examples, why each rule exists. A reference document, not a story.
- Five interview-audit findings that recur across every case study — the patterns Recruiter and Hiring Manager always flag, and the structural fixes.
- Building the Leucine Design System in five days — a more granular technical post about the system itself, not just how I built the case study about it.
- The hooks that prevent specific failures — the three or four small automations that have saved me from regressions I'd otherwise live through.
If any of those interest you specifically, I'll prioritize them. The contact page works.