Harness Engineering
The practice of designing the constraints, documentation structures, linters, feedback loops, and architectural rules that keep AI-generated code coherent and maintainable over time. The harness is everything the agent operates inside — not the code it writes, but the environment that governs how it writes.
Full treatment: AI Didn't Remove Engineering Judgment
AGENTS.md
A Markdown file placed at the root of a repository that gives AI coding agents persistent instructions about the codebase, coding conventions, architecture decisions, and how to work within the project. It is the primary entry point for encoding engineering judgment into the agent's operating environment.
Full treatment: AGENTS.md: The Complete Field Guide
Taste Invariants
Rules that encode what good code looks like — naming conventions, logging format, file size limits, module boundary rules, error handling patterns — written into automated linters that run on every line of agent-generated code. They are called 'invariants' because they must hold across the entire codebase, regardless of which engineer or agent wrote the code.
Full treatment: Harness Engineering Checklist
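As an illustration of a taste invariant expressed as an automated check, here is a minimal sketch. The two rules (a 300-line file limit and snake_case function names) and every name in the code are assumptions chosen for the example, not rules prescribed by the glossary.

```python
import re
from dataclasses import dataclass

# Hypothetical invariants for this sketch: a file size limit and a naming rule.
MAX_LINES = 300
SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")
DEF_PATTERN = re.compile(r"^\s*def\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(")


@dataclass(frozen=True)
class Violation:
    path: str
    line: int
    rule: str
    message: str


def check_source(path: str, source: str, max_lines: int = MAX_LINES) -> list[Violation]:
    """Return every violation of the two example invariants in one file."""
    violations: list[Violation] = []
    lines = source.splitlines()

    # Invariant 1: files must stay under the size limit.
    if len(lines) > max_lines:
        violations.append(
            Violation(path, len(lines), "file-size",
                      f"{len(lines)} lines exceeds limit of {max_lines}")
        )

    # Invariant 2: function names must be snake_case.
    for number, text in enumerate(lines, start=1):
        match = DEF_PATTERN.match(text)
        if match and not SNAKE_CASE.match(match.group(1)):
            violations.append(
                Violation(path, number, "function-naming",
                          f"function '{match.group(1)}' is not snake_case")
            )
    return violations
```

A check like this would typically run in CI on every change, so the rule holds no matter whether a person or an agent wrote the code.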
Knowledge Architecture
The discipline of deciding what an AI coding agent is allowed to know, how that knowledge is organised, how it is accessed, and how it stays accurate over time. Distinct from context engineering (what is loaded per request) — knowledge architecture is the persistent repository of truth the agent draws from.
Full treatment: Knowledge Architecture for AI Agents
AI Coding Entropy
The tendency of AI-generated codebases to move from order toward disorder without explicit counter-pressure. AI agents learn from existing code patterns — including bad ones — and replicate them at scale. Without a harness, entropy compounds faster in AI-assisted codebases than in human-written ones because there is no natural judgment filter slowing the spread of inconsistent patterns.
Full treatment: AI Coding Entropy: What It Is and How to Stop It
Garbage Collection (Harness)
A continuous, automated process in which a background agent scans the codebase on a regular schedule, finds violations of documented conventions, and opens targeted refactoring pull requests. Each PR addresses one instance of one violation and should be reviewable in under a minute. Harness garbage collection replaces periodic 'cleanup sprint' sessions.
Full treatment: AI Coding Entropy guide
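The "one instance of one violation per PR" idea from the Garbage Collection entry can be sketched as a planning step. This is a hedged example: `Finding`, `PullRequestPlan`, and the `gc/...` branch naming scheme are hypothetical, and opening the actual pull request is left to whatever tooling the team uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    rule: str


@dataclass(frozen=True)
class PullRequestPlan:
    title: str
    branch: str
    files: tuple[str, ...]


def plan_pull_requests(findings, open_branches=frozenset()):
    """Plan one small pull request per finding, skipping ones already open."""
    plans = []
    for finding in sorted(findings, key=lambda f: (f.rule, f.path, f.line)):
        # Hypothetical branch naming scheme: one branch per rule and location.
        branch = f"gc/{finding.rule}/{finding.path.replace('/', '-')}-L{finding.line}"
        if branch in open_branches:
            continue  # a PR for this exact violation is already waiting for review
        plans.append(
            PullRequestPlan(
                title=f"Fix {finding.rule} in {finding.path}:{finding.line}",
                branch=branch,
                files=(finding.path,),
            )
        )
    return plans
```

Keeping each plan to a single file and a single rule is what makes the resulting PR reviewable quickly.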
Acceptance Criteria as System Input
The practice of specifying task requirements precisely enough that an AI agent can reproduce a bug, validate a fix, open a pull request, respond to review feedback, and merge, all without human intervention at any step. Writing criteria that an agent can act on is a harder skill than writing criteria for a human because the criteria must eliminate ambiguity rather than rely on the reader to resolve it.
Full treatment: AI Didn't Remove Engineering Judgment
The Harness Flywheel
The compounding loop in harness engineering: a human encodes a judgment rule once → the agent applies it everywhere instantly → automated garbage collection catches any drift → the rule scales across the entire codebase with zero marginal cost per application. Each new rule added to the harness compounds with all previous rules.
Full treatment: AI Didn't Remove Engineering Judgment
Progressive Disclosure
A context management technique for AI agents that loads information in layers: L1 (always loaded: core identity, tool names, safety rules, under 500 tokens), L2 (loaded when relevant: full instructions for the specific task or tool), L3 (retrieved on demand: reference material, documentation, examples). Because only what the current task needs is loaded, progressive disclosure can reduce average tokens per agent run by 30–60%.
Full treatment: Guide to Cutting AI Agent Token Costs
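The three layers in the Progressive Disclosure entry can be sketched as a small context builder. The class and field names below are assumptions made for illustration; a real system would retrieve L3 material from a search index or document store rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class ContextLayers:
    core: str                                   # L1: always loaded
    task_instructions: dict[str, str]           # L2: keyed by task type
    references: dict[str, str] = field(default_factory=dict)  # L3: on demand

    def build(self, task_type, requested_refs=()):
        """Assemble the context for one run, loading only what is needed."""
        parts = [self.core]
        # L2 is added only when the task type has dedicated instructions.
        if task_type in self.task_instructions:
            parts.append(self.task_instructions[task_type])
        # L3 is added only for references the agent explicitly asked for.
        for name in requested_refs:
            if name in self.references:
                parts.append(self.references[name])
        return "\n\n".join(parts)
```

A run that needs no task-specific instructions pays only for the L1 text, which is where the token savings come from.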
Skills (AI Agents)
Reusable operating procedures for AI agents that package together: trigger conditions, required context, step-by-step process, decision checkpoints, and quality criteria. Skills replace mega-prompts by loading targeted instructions only when a specific task type is triggered, rather than loading all possible instructions on every agent call.
Full treatment: Skills Make Judgement Reusable
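One way to picture a skill is as a record holding the five parts listed in the entry, plus a selector that loads a skill only when its trigger fires. This is a sketch under assumptions: the `Skill` structure and both helper functions are hypothetical and do not correspond to any specific agent framework's format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Skill:
    name: str
    trigger: Callable[[str], bool]      # trigger condition
    required_context: tuple[str, ...]   # what must be loaded first
    steps: tuple[str, ...]              # step-by-step process
    checkpoints: tuple[str, ...]        # decision checkpoints
    quality_criteria: tuple[str, ...]   # how to judge the result


def select_skills(task: str, skills: list[Skill]) -> list[Skill]:
    """Return only the skills whose trigger matches this task."""
    return [skill for skill in skills if skill.trigger(task)]


def render_instructions(skill: Skill) -> str:
    """Turn one selected skill into instruction text for the agent."""
    lines = [f"Skill: {skill.name}", "Steps:"]
    lines += [f"{i}. {step}" for i, step in enumerate(skill.steps, start=1)]
    lines.append("Quality criteria:")
    lines += [f"- {criterion}" for criterion in skill.quality_criteria]
    return "\n".join(lines)
```

Only the rendered text of matching skills enters the prompt, in contrast to a mega-prompt that carries every procedure on every call.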
Prompt Engineering
The discipline of shaping the instructions sent to a language model for a specific request — phrasing, format specification, few-shot examples, chain-of-thought guidance, and role assignment. Prompt engineering operates at the per-request level and is the innermost layer of AI system design.
Full treatment: Harness vs Prompt vs Context Engineering
Context Engineering
The discipline of deciding what information fills the model's context window for a specific request — retrieval, compaction, progressive disclosure, tool result formatting, and conversation history management. Context engineering operates at the per-turn level and determines what the model can see when it generates a response.
Full treatment: Harness vs Prompt vs Context Engineering
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single request, including the system prompt, conversation history, retrieved documents, tool definitions, and tool results. Context window management is central to both token cost control and agent output quality.
Full treatment: Guide to Cutting AI Agent Token Costs
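A rough budget check for the Context Window entry might look like the sketch below. The four-characters-per-token ratio is a crude assumption used only to keep the example self-contained; real counts depend on the model's tokenizer, and the function names are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; assumes about 4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def remaining_budget(window_tokens: int, parts: dict[str, str],
                     reserved_for_output: int) -> int:
    """Tokens left after the prompt parts and the reserved output space."""
    used = sum(estimate_tokens(text) for text in parts.values())
    return window_tokens - reserved_for_output - used
```

A negative result signals that something (history, retrieved documents, tool results) has to be trimmed or compacted before the request is sent.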
Tokenomics (AI Agents)
The economics of token consumption in production AI agent systems: the relationship between tokens consumed, value delivered per token, and cost per unit of work. Token value per watt per user is an emerging metric for evaluating whether an agent's output justifies its resource consumption.
Full treatment: Agent Loops, Tokenomics, and the Harness
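The "cost per unit of work" relationship in the Tokenomics entry reduces to simple arithmetic. The sketch below assumes prices quoted per million tokens, a common billing convention; the function names and any prices passed in are illustrative, not real rates.

```python
def run_cost(input_tokens: int, output_tokens: int,
             input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Cost of one agent run, with prices given per million tokens."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


def cost_per_unit_of_work(run_costs: list[float], completed_units: int) -> float:
    """Total spend divided by units of useful work actually completed."""
    if completed_units <= 0:
        raise ValueError("completed_units must be positive")
    return sum(run_costs) / completed_units
```

Dividing by completed units rather than by runs matters: failed or retried runs still cost tokens, so they raise the cost of each unit that does ship.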
Prompt Caching
A feature offered by major LLM providers (Anthropic, OpenAI, Google) that bills repeated identical prompt prefixes at a fraction of the standard token rate, typically 10–25% of normal input cost, though the exact discount varies by provider and model. Effective prompt caching requires placing static content (system prompt, tool definitions) at the start of the prompt, before any dynamic content, because the cache matches on an exact prefix.
Full treatment: Guide to Cutting AI Agent Token Costs
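The ordering rule in the Prompt Caching entry can be shown without calling any provider. In this sketch, `build_prompt` and `shared_prefix_length` are hypothetical helpers; sorting the tool definitions is an assumed way to keep the static prefix byte-identical between requests.

```python
import os


def build_prompt(system_prompt: str, tool_definitions: list[str],
                 dynamic_parts: list[str]) -> str:
    """Place static content first so repeated requests share a prefix."""
    # Sorting keeps the tool section identical even if callers pass
    # the definitions in a different order.
    static_prefix = "\n".join([system_prompt, *sorted(tool_definitions)])
    return static_prefix + "\n" + "\n".join(dynamic_parts)


def shared_prefix_length(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    return len(os.path.commonprefix([a, b]))
```

If the dynamic content came first, two requests would diverge at the first character and nothing would be eligible for the cached rate.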
Model Routing
The practice of directing different task types to different model tiers based on their complexity and cost requirements. Simple routing or classification tasks go to small/fast models; standard execution goes to mid-tier models; complex reasoning requiring frontier capability goes to the most capable model. Model routing can reduce average cost per run by 50–80%.
Full treatment: Guide to Cutting AI Agent Token Costs
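A minimal routing table for the Model Routing entry could look like the following. The task type labels, the three tiers, and the relative costs used with `average_cost` are all assumptions for illustration; real routers often classify the task with a small model first.

```python
from enum import Enum


class Tier(Enum):
    SMALL = "small"
    MID = "mid"
    FRONTIER = "frontier"


# Hypothetical mapping from task type to model tier.
ROUTES = {
    "classification": Tier.SMALL,
    "routing": Tier.SMALL,
    "execution": Tier.MID,
    "complex_reasoning": Tier.FRONTIER,
}


def route(task_type: str, default: Tier = Tier.MID) -> Tier:
    """Pick a tier for a task type, falling back to the mid tier."""
    return ROUTES.get(task_type, default)


def average_cost(task_mix: dict[str, float], cost_per_run: dict[Tier, float]) -> float:
    """Blended cost per run, given the share of each task type."""
    return sum(share * cost_per_run[route(task)] for task, share in task_mix.items())
```

Comparing `average_cost` for a routed mix against sending everything to the frontier tier gives a quick estimate of what routing would save for a given workload.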
AI Slop
Code generated by AI agents that is technically correct (compiles, passes tests, delivers functionality) but introduces subtle quality degradation — unnecessary abstractions, verbose patterns, duplicate logic that doesn't warrant a shared abstraction, inconsistent conventions. AI slop passes code review because it looks intentional, but accumulates as a maintenance burden over time.
Full treatment: AI Coding Entropy guide
Agentic Loop
The execution cycle of an autonomous AI agent: perceive state → plan action → call tools → observe results → update understanding → repeat. The loop is what transforms a language model from a question-answering system into a system capable of multi-step autonomous work. The harness governs the loop — setting boundaries, requiring approvals, and enforcing policies on what actions the loop can take.
Full treatment: Agent Loops, Tokenomics, and the Harness
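The cycle described in the Agentic Loop entry, together with a harness-style policy check, can be sketched in a few lines. Everything here is hypothetical: `plan` stands in for the model call, `tools` is a plain dictionary of functions, and the policy is a simple allowlist plus a step limit.

```python
from typing import Callable, Optional

# An action is a tool name plus keyword arguments for that tool.
Action = tuple[str, dict]


def run_loop(plan: Callable[[dict, list], Optional[Action]],
             tools: dict[str, Callable],
             allowed_tools: set[str],
             state: dict,
             max_steps: int = 10) -> list:
    """Run plan -> act -> observe until the planner stops or the limit hits."""
    history = []
    for _ in range(max_steps):               # boundary: hard step limit
        action = plan(state, history)        # perceive state and plan
        if action is None:                   # planner decides the work is done
            break
        name, args = action
        if name not in allowed_tools:        # policy: tool allowlist
            history.append((name, "blocked by policy"))
            continue
        result = tools[name](**args)         # call tool
        history.append((name, result))       # observe and record the result
    return history
```

The step limit and the allowlist are the parts that belong to the harness: they bound what the loop can do regardless of what the planner proposes.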
Context Compaction
The process of summarising or restructuring earlier conversation turns in a multi-turn agent session to reduce token consumption while preserving relevant state. Without compaction, context grows linearly with session length; with compaction, older turns are replaced by compressed representations that cost a fraction of the original tokens.
Full treatment: Guide to Cutting AI Agent Token Costs
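A simple form of the compaction described in the Context Compaction entry keeps the most recent turns verbatim and replaces everything older with a summary. In this sketch `summarise` is a caller-supplied function (in practice often a call to a cheaper model), and the `[summary]` marker is an assumed convention.

```python
from typing import Callable


def compact(turns: list[str], keep_recent: int,
            summarise: Callable[[list[str]], str]) -> list[str]:
    """Replace all but the last `keep_recent` turns with one summary turn."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    split = len(turns) - keep_recent
    if split <= 0:
        return list(turns)  # nothing old enough to compact
    older, recent = turns[:split], turns[split:]
    return [f"[summary] {summarise(older)}"] + recent
```

How much state survives depends entirely on the quality of `summarise`, so decisions and open tasks from older turns should be preserved explicitly rather than left to chance.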
MCP (Model Context Protocol)
An open protocol developed by Anthropic that standardises how AI assistants connect to external tools, data sources, and services. MCP enables AI agents in tools like Claude Code and Cursor to query production systems, retrieve context, and trigger governed actions — using the same capabilities and permission model as a human operator in the same system.
Full treatment: Agents, Context, and Guardrails
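MCP messages are JSON-RPC 2.0, and invoking a tool uses the `tools/call` method with the tool's name and arguments. The sketch below only builds such a request as a dictionary; it does not open a connection, and the tool name `get_ticket` and its argument used in the usage note are made up for illustration.

```python
import itertools
import json

# Simple counter for JSON-RPC request ids within this process.
_request_ids = itertools.count(1)


def tool_call_request(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def to_wire(request: dict) -> str:
    """Serialise the request as JSON text for whichever transport is in use."""
    return json.dumps(request)
```

For example, `to_wire(tool_call_request("get_ticket", {"ticket_id": "T-1"}))` yields the JSON text a client would send; in practice an MCP client SDK handles this, along with initialisation and tool discovery.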