Architecture from workload: a seven-dimension skill that derives, instead of suggesting

Ask a coding agent "design infra for this Next.js app" and the answer you get is shaped by whichever stack the model last saw a lot of. Two prompts, two architectures. No way to argue your way to the right one — the reasoning isn't in the room.

I built this skill to close that loop. It's a Claude Code skill (portable to several other agents — more on that later) that turns infrastructure design into constraint solving. The agent reads the workload, runs seven dimension resolvers — each backed by explicit math — against it, matches the resulting constraints against a technology catalog, and composes a sized, costed architecture. Every number cites the formula that produced it.

Same workload in, same architecture out. The reasoning is the math, not the model's recent training.

What it does

Given application code or written requirements, the skill:

  1. Extracts a WorkloadDescriptor — the input contract: what the system must do, not how to build it.
  2. Runs each of seven independent dimension resolvers — TIME, SPACE, WORK, STATE, FAILURE, COST, TRUST — to produce concrete constraints.
  3. Matches constraints against a technology catalog (Lambda, Fargate, EC2, Cloud Run, RDS, Aurora, DynamoDB, Redis, S3, ALB, NLB, CloudFront, Cloudflare, SQS, EventBridge, Kafka, Kinesis, …).
  4. Checks proven patterns (request-response, cell, fan-out-on-write) for preconditions.
  5. Applies regulatory overlays as TRUST constraints — APPI, GDPR, PCI-DSS, FSA Japan, SOC 2, HIPAA.
  6. Composes a sized, costed architecture and outputs verification criteria — load-test parameters, chaos scenarios, security checks, cost monitors. Optionally compiles to Terraform / CDK / Pulumi.

The whole entry point is one file: SKILL.md. Everything else is reference material the agent reads on demand.
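
To make the six steps concrete, here's a minimal Python sketch of the derive loop. Every name in it — the WorkloadDescriptor fields, the resolver functions, the catalog keys, the SLA figures — is my shorthand for the prose above, not the skill's actual interface:

```python
# Hypothetical shape of the derive loop; all names are illustrative, not the skill's API.
from dataclasses import dataclass, field

@dataclass
class WorkloadDescriptor:
    throughput_peak_rps: float
    latency_p99_s: float
    availability_target: float
    regulatory: list[str] = field(default_factory=list)

def resolve_time(w: WorkloadDescriptor) -> dict:
    # TIME: Little's Law (L = λ × W) turns rate + latency into required concurrency.
    return {"concurrency": w.throughput_peak_rps * w.latency_p99_s}

def resolve_failure(w: WorkloadDescriptor) -> dict:
    # FAILURE: the availability target becomes a hard floor for every candidate.
    return {"min_sla": w.availability_target}

RESOLVERS = [resolve_time, resolve_failure]  # ...plus the other five dimensions

def derive(w: WorkloadDescriptor, catalog: list[dict]) -> list[dict]:
    constraints: dict = {}
    for resolve in RESOLVERS:                   # steps 1-2: descriptor → constraints
        constraints.update(resolve(w))
    return [t for t in catalog                  # step 3: keep technologies whose
            if t["max_concurrency"] >= constraints["concurrency"]   # region contains
            and t["sla"] >= constraints["min_sla"]]                 # the workload's point

catalog = [{"name": "Lambda", "max_concurrency": 1000, "sla": 0.9995},
           {"name": "EC2 single-AZ", "max_concurrency": 500, "sla": 0.995}]

workload = WorkloadDescriptor(120, 0.3, 0.999, ["APPI"])
print([t["name"] for t in derive(workload, catalog)])  # ['Lambda']
```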

The seven dimensions

Independent axes, not layers. Every workload occupies a point; every technology covers a region.

| Dimension | Axes | Key formula | Basis |
| --- | --- | --- | --- |
| TIME | latency, throughput, freshness, TTL | L = λ × W | Little's Law · Erlang-C · USL |
| SPACE | region, residency, edge vs core | RTT = d / (c · 0.67) | speed of light in fiber |
| WORK | compute shape, parallelism, profile | S(N) = 1 / (s + (1−s)/N) | Amdahl · USL · thread pools |
| STATE | consistency, durability, sharing | I = Q × B × (1 − h) | CAP · IOPS · replication mode |
| FAILURE | availability, blast radius, RPO/RTO | A = 1 − (1 − a)ᴺ | independent failures |
| COST | money, complexity, opportunity | min Σ costᵢ s.t. constraints | linear program over pricing |
| TRUST | access, encryption, compliance | classification + reg → controls | rule engine, strictest-wins |

The matcher finds technologies whose region contains the workload's point across all seven dimensions simultaneously.
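
A toy version of that containment check, assuming an interval representation per dimension (the skill's actual constraint types are richer than a (lo, hi) pair):

```python
# Toy containment check: a technology covers an interval per dimension; a workload
# is a point. It matches only if every dimension's interval contains the point.
def contains(region: dict[str, tuple[float, float]], point: dict[str, float]) -> bool:
    return all(lo <= point[dim] <= hi for dim, (lo, hi) in region.items())

# Illustrative: TIME as achievable p99 (seconds), FAILURE as deliverable availability.
lambda_region  = {"TIME": (0.005, 30.0), "FAILURE": (0.0, 0.9995)}
workload_point = {"TIME": 0.300,         "FAILURE": 0.999}
print(contains(lambda_region, workload_point))  # True: region contains the point
```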

The ten formulas the resolvers run on are the ones that don't change with cloud vendor, region, or year:

 1. Little's Law:        L = λ × W
 2. USL:                 C(N) = N / (1 + α(N−1) + β·N·(N−1))
 3. Amdahl's Law:        S(N) = 1 / (s + (1−s)/N)
 4. Availability:        A = 1 − (1 − a)^N      (independent failures)
 5. Speed of Light:      d = c × t / 2
 6. Erlang-C:            P(wait) = C(c,ρ) / (1 − ρ + ρ·C(c,ρ))
 7. Cache Hit (Che):     h ≈ 1 − (C/N)^(1−α)
 8. IOPS:                I = Q × B × (1 − h)
 9. Bandwidth:           BW = λ × payload × overhead
10. Cost LP:             min Σ cost_i  subject to all constraints

The catalog and the prices change. The math doesn't.
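
To show how little machinery these need, here's a sketch of four of the ten as plain functions. Names and signatures are mine; the skill ships these as reference formulas, not as a library:

```python
# Sketches of four of the ten invariants; function names are mine, not the skill's API.

def littles_law(arrival_rate_rps: float, latency_s: float) -> float:
    """L = λ × W: average number of requests in flight."""
    return arrival_rate_rps * latency_s

def usl_throughput(n: int, alpha: float, beta: float) -> float:
    """USL: C(N) = N / (1 + α(N−1) + β·N·(N−1)). α = contention, β = coherency."""
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

def amdahl_speedup(n: int, serial_fraction: float) -> float:
    """Amdahl: S(N) = 1 / (s + (1−s)/N)."""
    return 1 / (serial_fraction + (1 - serial_fraction) / n)

def availability(a: float, n: int) -> float:
    """A = 1 − (1 − a)^N for N independent replicas with availability a."""
    return 1 - (1 - a) ** n

# USL makes "just add nodes" falsifiable: with α=0.03, β=0.0001 the
# throughput curve peaks near N≈98 and declines beyond it.
print(max(range(1, 200), key=lambda n: usl_throughput(n, 0.03, 0.0001)))  # 98
```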

A worked example

5K-DAU Next.js app, Tokyo, APPI-compliant

One workload, run through the resolver. Same input, same output every time — the math doesn't depend on which stack the architect last shipped.

Step 1: WorkloadDescriptor

throughput_peak="120 rps"
latency_p99="300 ms"
consistency="session"
working_set_size="~2 GB"
availability_target=0.999
data_residency=["ap-northeast-1"]
data_classification="pii"
regulatory=["APPI"]

Step 2: Dimension resolvers

TIME: Little's Law → 36 concurrent requests at peak (120 rps × 0.3 s). P99 < 5× service time → utilization target ≤ 85%.

FAILURE: A 99.9% target against a single 99.99% AZ leaves no margin for software defects. Multi-AZ recommended; the cost is small at this scale.

COST: Sustained traffic sits below the Lambda/Fargate crossover (~70 rps at 1 vCPU) even with 120 rps peaks. Lambda ~$25/mo; Fargate (2× 1 vCPU/2 GB, 2-AZ) ~$90/mo.

TRUST: APPI → encryption at rest (CMK acceptable) and access logs required. Deploying in ap-northeast-1 removes the cross-border transfer obligation.
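
Each step-2 number re-derives by hand. A quick check using only the formula list (the ~70 rps crossover is a catalog figure, not computed here):

```python
# Step-2 arithmetic, straight from the formula list above.
peak_rps, p99_s = 120, 0.300
concurrency = peak_rps * p99_s              # Little's Law → 36 requests in flight

a_az, target = 0.9999, 0.999
multi_az = 1 - (1 - a_az) ** 2              # two independent AZs → 0.99999999

print(concurrency)                          # 36.0
print(multi_az > target)                    # True: margin restored for software defects
```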

Step 3: Composed architecture

Compute: Vercel (Next.js native) or Fargate 2× (1 vCPU / 2 GB)

Database: RDS PostgreSQL db.t4g.medium, Multi-AZ, 7-day PITR

Cache: In-process LRU. Add Redis only when read:write actually exceeds 10:1 in production.

Region: ap-northeast-1 (Tokyo)

Estimate: ≈ $140/mo (Fargate variant); 70% is the database

Top lever: Drop to single-AZ → ≈ $80/mo. Trades infra availability for cost; document the tradeoff.
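
The top lever is the FAILURE formula run in reverse. A sketch of the tradeoff being documented, with an illustrative per-AZ availability figure:

```python
# A = 1 − (1 − a)^N, expressed as expected infra downtime per year.
MIN_PER_YEAR = 365.25 * 24 * 60

def downtime_min_per_year(a: float, n_az: int) -> float:
    # Unavailability (1 − a)^N × minutes in a year.
    return (1 - a) ** n_az * MIN_PER_YEAR

a = 0.9999                                    # illustrative per-AZ availability
print(round(downtime_min_per_year(a, 1), 1))  # ~52.6 min/yr single-AZ
print(round(downtime_min_per_year(a, 2), 4))  # ~0.0053 min/yr Multi-AZ
```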

Verification: load test 200 rps with P99 < 300 ms · trigger AZ failover, recovery < 60 s · confirm pgaudit lands in CloudWatch · confirm bucket + DB region = ap-northeast-1.

That's the whole loop on a small workload. The interesting part isn't the answer; it's that two architects with different stack histories — one fluent in Lambda, one allergic to it — converge on the same recommendation when the matcher does the work. If you disagree with a number, you change the input and re-derive. The skill cites which formula produced which value, so the disagreement converges quickly instead of turning into a Slack thread.

For the full output the skill produces (including the cost arithmetic and the verification block), see the README's example section.

What's swappable, and what isn't

The shipped catalog leans AWS (16 of 19 entries — only Cloud Run, Cloudflare Workers, and Cloudflare CDN sit outside), and pricing is illustrative against ap-northeast-1 (Tokyo). The regulatory tree is most developed for Japan (APPI, FSA Japan), with single-file coverage for the EU (GDPR), the US (HIPAA), and global frameworks (PCI-DSS, SOC 2). That's the bias of the operating context I built it for — not a limit of the framework.

The dimensions, formulas, and patterns are vendor- and jurisdiction-neutral. To run this on GCP, Azure, Cloudflare, Oracle, Alibaba, or on-prem, the move is to add catalog entries under references/catalog/<function>/, using the 7D structure of any existing entry as a template. The matcher consults whatever's in the catalog. To add a regulation — UK GDPR, NIS2, DORA, AI Act, CCPA, PDPA Singapore, LGPD Brazil — drop a file under references/regulatory/. The TRUST resolver merges them in via strictest-wins.
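
For a sense of the shape a new entry takes, here's a hypothetical one sketched as a Python dict. Every field name and value is an illustrative guess at the 7D structure, not the skill's actual schema — copy a shipped entry under references/catalog/ for the real one:

```python
# Hypothetical catalog entry: a region in the 7D space. Every field name and
# value here is an illustrative guess, not the skill's actual schema.
cloud_run_entry = {
    "name": "Cloud Run",
    "function": "compute",
    "TIME":    {"cold_start_ms": (100, 2000), "max_rps_per_instance": 1000},
    "SPACE":   {"regions": ["asia-northeast1", "europe-west3"]},
    "WORK":    {"vcpu": (0.25, 8), "memory_gb": (0.5, 32)},
    "STATE":   {"persistent": False},          # stateless; state lives elsewhere
    "FAILURE": {"sla": 0.9995, "blast_radius": "revision"},
    "COST":    {"model": "per-request + vCPU-seconds", "scale_to_zero": True},
    "TRUST":   {"encryption_at_rest": "default", "private_networking": True},
}
```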

The structure is the durable thing. The catalog and the regulatory tree are inputs.

Agent-neutral by design

The content is plain Markdown; only the install path differs per agent. The setup script auto-detects which agents are installed and links the skill into each:

| Agent | Install location |
| --- | --- |
| Claude Code | ~/.claude/skills/architecture-skill |
| OpenAI Codex CLI | ~/.codex/skills/architecture-skill (+ AGENTS.md import) |
| OpenCode | ~/.config/opencode/skills/architecture-skill |
| Cursor | ~/.cursor/skills/architecture-skill |
| Factory | ~/.factory/skills/architecture-skill |
| Aider / Cline / Continue / Windsurf | project-level rule files |

AGENTS.md at the repo root is the short, agent-neutral entry for any tool that follows the agents.md convention. The activation phrases live in SKILL.md's frontmatter; agents with skill auto-discovery (Claude Code, Codex CLI) match against those.

Why this shape

Architecture review boards already do this. You walk in with throughput, latency, residency, classification, and a budget. You leave with a recommendation that names the formula behind every number. The skill codifies that loop so a coding agent can run it locally, without a board, in a few seconds — and so two agents in two different orgs converge on the same answer for the same workload.

Two specific things I was trying to make true:

  1. Reasoning is portable. The catalog and the regulatory overlays are the parts an organization swaps in. The seven dimensions and ten formulas don't move. A team running this on GCP-in-Frankfurt-under-DORA gets the same kind of recommendation a team running it on AWS-in-Tokyo-under-APPI does — same math, different inputs.
  2. Disagreements become input edits. The biggest waste in infrastructure conversations is arguing about the answer when the answer hasn't been derived from anything. Citing the formula behind a number turns "I think Lambda is wrong here" into "your traffic is spikier than the descriptor — change throughput_peak and re-run" (sketched below).
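
That second point is mechanical in practice. A toy illustration of what editing throughput_peak and re-deriving does to the numbers:

```python
# "Your traffic is spikier than the descriptor": edit one field, re-derive.
def time_resolver(throughput_peak_rps: float, latency_p99_s: float) -> float:
    return throughput_peak_rps * latency_p99_s   # Little's Law

print(time_resolver(120, 0.3))   # 36.0  in flight → comfortably serverless
print(time_resolver(400, 0.3))   # 120.0 in flight → re-check the COST crossover
```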

That's the recruitment-shaped subtext, too: the skill is the artefact, but the durable claim is how I think infrastructure decisions should get made. Workload-shape first. Math before vendors. Regulatory overlays as inputs, not afterthoughts.

Caveats

The shipped catalog is a starting kit. AWS / Tokyo / APPI are the bias of where it was built. Pricing in the cost examples drifts every quarter — treat any dollar figure as approximate and recompute against current published pricing for production decisions. The validation set is three case studies (Netflix, Uber, Facebook); the durable test would be running the skill against a few dozen real production architectures and seeing how often it converges to what teams actually shipped. If you do that experiment, I'd love a PR with what you found.

Reproduce

git clone --single-branch --depth 1 \
  https://github.com/dominic-righthere/architecture-skill.git ~/architecture-skill
cd ~/architecture-skill && ./setup

Run ./setup --help for flags (target one agent, copy instead of symlink). Then ask any infra question in your agent of choice — the skill auto-activates on the trigger phrases in the frontmatter.

For per-project install (Cursor, Aider, Cline, Windsurf), the README has the equivalent path under each agent's project rules directory.


Related:

- A month of agentic delivery — the production context where decisions like "Lambda or Fargate, single-AZ or multi-AZ, RDS or DynamoDB" stop being interesting trivia and start being the reason something does or doesn't ship in a Japanese-enterprise IT cycle.
- The reflection paradox — a sibling build journal on agent design where the same "cite the formula behind the number" instinct showed up in a different domain.