How to Use AI Coding Agents Effectively

Many people use coding agents for a bit, get frustrated, and conclude "they're not good."

That's like picking up a guitar, fumbling through a few chords, and concluding guitars don't work.

The people getting the most out of agents aren't using better tools.

They have built the skill to use them well.

Plan · Execute · Review · Verify

Different jobs. Different contexts. Different artifacts.

101
Model · Agent · Tools · Workspace · Checks

Know which layer you are changing before you blame the agent.

Models

Choose models by risk

  • Use stronger reasoning for ambiguous, unfamiliar, expensive work
  • Use faster models for cheap, reversible, mechanical work
  • Switch when the task changes
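
The bullets above can be read as a small routing rule. The sketch below is only an illustration: the `Task` fields and the tier names "strong" and "fast" are assumptions made for this example, not names from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    ambiguous: bool   # requirements or root cause are unclear
    unfamiliar: bool  # the codebase or domain is new to you
    expensive: bool   # mistakes are costly to detect or fix
    reversible: bool  # the change can be undone cheaply


def pick_model_tier(task: Task) -> str:
    """Return a hypothetical tier name: 'strong' or 'fast'."""
    if task.ambiguous or task.unfamiliar or task.expensive or not task.reversible:
        return "strong"
    return "fast"


# Two illustrative tasks: a mechanical rename and a risky migration.
rename = Task(ambiguous=False, unfamiliar=False, expensive=False, reversible=True)
migration = Task(ambiguous=True, unfamiliar=False, expensive=True, reversible=False)

print(pick_model_tier(rename))
print(pick_model_tier(migration))
```

Re-evaluating the rule whenever the task changes is what "switch when the task changes" amounts to in practice.
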
Surfaces

Choose the right surface

  • IDE: precise local edits with tight human steering
  • CLI: repo-wide changes, tests, scripts, and automation
  • Chat: planning and explanation when no workspace is needed
Context

Give the agent safe reach

  • MCP: tools and data without copy-paste
  • Search quality: prefer trusted pages over noisy meeting notes
  • Skills and memory: instructions that survive the chat
Workspace

Build real feedback loops

  • Local: spin up the real path and reproduce quickly
  • CI: verify the same checks before integration
  • Prod: use logs, metrics, and traces to confirm reality
Planning

Create a planning artifact

  • For non-trivial work, write the plan before code
  • Capture goal, scope, sequence, checks, risks, and rollback point
  • Record what proves each step worked
  • Make it readable by a fresh session or teammate
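
One way to make the planning artifact concrete is to give it a shape that can be checked for gaps. This is a minimal sketch: the `Plan` and `Step` classes, the `gaps` method, and the example content are all hypothetical, chosen only to mirror the fields listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str
    proof: str  # command or observation that proves the step worked


@dataclass
class Plan:
    goal: str
    scope: str
    steps: list[Step]
    risks: list[str] = field(default_factory=list)
    rollback_point: str = ""

    def gaps(self) -> list[str]:
        """List what a fresh session or teammate would find missing."""
        problems = []
        if not self.goal.strip():
            problems.append("goal is empty")
        if not self.scope.strip():
            problems.append("scope is empty")
        if not self.steps:
            problems.append("no steps")
        for i, step in enumerate(self.steps, start=1):
            if not step.proof.strip():
                problems.append(f"step {i} has no proof")
        if not self.rollback_point.strip():
            problems.append("no rollback point")
        return problems


# Illustrative plan with two deliberate gaps.
plan = Plan(
    goal="Reject expired sessions on one route",
    scope="One route handler and its test; no schema changes",
    steps=[
        Step("Add a failing test for an expired session", "the new test fails"),
        Step("Enforce expiry in the route's policy check", ""),
    ],
    risks=["Other routes share the policy helper"],
)
print(plan.gaps())
```

Here the check flags the second step, which has no proof, and the missing rollback point. Both are gaps that are cheaper to find before execution than during it.
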
Planning

Red-team the plan

  • Walk every branch before execution
  • What assumptions are you making?
  • What alternatives did you reject?
  • What edge cases are missing?
Planning

Chunk by steel thread

  • Don't chunk by layer
  • Database → API → UI creates an integration cliff
  • Pick one thin path through the real system
  • Verify it before widening it
Planning

A steel thread is agent-shaped

  • Migration: move one operation, leave the rest alone
  • Greenfield: one narrow happy path that persists
  • Review: one behavior, one diff, one risk surface
  • Verify: one real path the agent can run
Planning

Give context the agent can use

  • Symptom: failing command, error, screenshot, or log
  • Path: entry point, caller, state, and test seam
  • Pattern: one or two nearby implementations to copy
  • Bounds: what must not change, and what proves success
Example

Turn a vague ask into a steel thread

  • Bad: "Refactor auth"
  • Plan: inspect one route, policy check, and test seam
  • Build: change one behavior behind one reviewed diff
  • Prove: run the failing test, auth test, lint, and fresh review
Planning

Encode repeatable patterns

  • Some work repeats across many targets
  • Tracking table: what is done
  • Skill: how to do one target
  • The table tracks state. The skill preserves judgment.
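
A tracking table can be as simple as a CSV file that each session reads and updates. The sketch below assumes a hypothetical three-column layout (`target`, `status`, `notes`) and made-up file names. The skill, meaning the instructions for handling one target, would live in a separate file.

```python
import csv
import io

# Hypothetical table contents; in practice this would be a file in the repo.
TABLE = """target,status,notes
billing/api.py,done,migrated in first pass
orders/api.py,pending,
users/api.py,pending,
"""


def load_rows(text: str) -> list[dict]:
    """Parse the tracking table into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def next_target(rows: list[dict]):
    """Return the first target still pending, or None when all are done."""
    for row in rows:
        if row["status"] == "pending":
            return row["target"]
    return None


def mark_done(rows: list[dict], target: str, notes: str = "") -> None:
    """Record that one target is finished."""
    for row in rows:
        if row["target"] == target:
            row["status"] = "done"
            row["notes"] = notes


rows = load_rows(TABLE)
print(next_target(rows))
```

A fresh session only needs the table and the skill to pick up the next target, without replaying the earlier conversation.
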
Execution

Delegate narrow investigations

  • Use subagents for read-only reconnaissance
  • Ask for files, flows, seams, and risks
  • Keep implementation decisions in the main session
  • Treat agent output as evidence, not authority
Execution

Optimize for engineer time

The expensive part is not tokens. It is debugging, review, rework, and waiting.

Weak models can “solve” problems by deleting code you needed.

Execution

Do not fight polluted context

  • Compact before quality drops
  • Start fresh when the task changes
  • Bail when the agent repeats itself
  • Preserve clean context for judgment
Verification

Define done as evidence

  • Name the command, scenario, or dashboard
  • Require the observed result, not "it should work"
  • Prefer checks the agent can rerun
  • Escalate when proof is manual or indirect
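
"Evidence" here means a named command plus its observed result. As a rough sketch, a helper like the hypothetical `collect_evidence` below records both. The command shown is a stand-in; a real project would name its own test or build command.

```python
import subprocess
import sys


def collect_evidence(cmd: list[str]) -> dict:
    """Run a command and record what was actually observed."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "command": " ".join(cmd),
        "exit_code": result.returncode,
        # Keep only the tail so the record stays readable.
        "output_tail": (result.stdout + result.stderr)[-2000:],
        "passed": result.returncode == 0,
    }


# Stand-in command for illustration; replace with the real check.
evidence = collect_evidence([sys.executable, "-c", "print('2 passed')"])
print(evidence["command"], evidence["exit_code"])
```

Because the record carries the exact command, the agent or a reviewer can rerun it instead of taking "it should work" on trust.
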
Review

Separate writing from review

  • Do not make the writing session the only reviewer
  • Start a clean review session
  • Give it the diff, relevant files, and conventions
  • Ask for file, line, contract, and failing path
Review

Split review by job

  • Correctness
  • Failure modes
  • Security
  • Tests
  • Design

One reviewer, one lens.
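
One reviewer per lens can be set up by generating a separate prompt for each clean review session. The sketch below uses the five lenses listed above. The guiding question for each lens and the `build_review_prompts` helper are illustrative wording, not a prescribed format.

```python
# Lens names come from the list above; the questions are illustrative.
LENSES = {
    "correctness": "Does the change do what the description claims?",
    "failure modes": "What happens on timeouts, bad input, and partial failure?",
    "security": "Does the change add attack surface in auth, data access, or input handling?",
    "tests": "Do the tests fail when the behavior is broken?",
    "design": "Does the change respect existing boundaries and naming?",
}


def build_review_prompts(diff: str, conventions: str) -> dict[str, str]:
    """Build one prompt per lens, each meant for its own fresh session."""
    prompts = {}
    for lens, question in LENSES.items():
        prompts[lens] = (
            f"Review this diff for {lens} only.\n"
            f"{question}\n"
            "For each finding, give the file, line, violated contract, "
            "and failing path.\n\n"
            f"Conventions:\n{conventions}\n\n"
            f"Diff:\n{diff}\n"
        )
    return prompts


prompts = build_review_prompts("<diff goes here>", "<team conventions go here>")
print(sorted(prompts))
```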

Review

Passing tests do not prove the code is secure

  • Working code can still create attack surface
  • Add security as a separate review lens
  • Run a security scan on every PR
  • Treat auth, data access, and input handling as high-risk
Verification

A test that has never failed is a test you cannot trust.

A smoke detector that has never been near smoke tells you nothing about whether it works.
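
A cheap way to put a test "near smoke" is to run it against a deliberately broken implementation and confirm that it fails. The functions below are hypothetical and exist only to show the idea.

```python
def apply_discount(price: float, percent: float) -> float:
    """Correct implementation."""
    return price * (1 - percent / 100)


def broken_apply_discount(price: float, percent: float) -> float:
    """Deliberately broken: ignores the discount entirely."""
    return price


def check_discount(fn) -> bool:
    """The test under scrutiny: True when fn behaves correctly."""
    return abs(fn(200.0, 25) - 150.0) < 1e-9


# The test must pass on the real code and fail on the broken code.
assert check_discount(apply_discount)
assert not check_discount(broken_apply_discount)
print("test detects the broken implementation")
```

If the second assertion ever fails, the test is not checking the behavior it claims to check.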

Verification

Mechanize invariants

  • Instructions guide judgment
  • Checks enforce invariants
  • Lint, type checks, tests, security scans, branch protection
  • If a rule can be checked mechanically, put it in the dev loop
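
As an example of turning a written rule into a check, the sketch below flags imports of a forbidden module using Python's standard `ast` module. The rule itself and the module name `legacy_db` are assumptions for illustration. In many projects an existing linter rule would do this job.

```python
import ast

FORBIDDEN = {"legacy_db"}  # hypothetical module new code must not import


def forbidden_imports(source: str) -> list[str]:
    """Return the forbidden modules imported by the given source text."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN:
                    found.append(alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] in FORBIDDEN:
                found.append(node.module)
    return found


sample = "import legacy_db\nfrom legacy_db.models import User\nimport json\n"
print(sorted(forbidden_imports(sample)))
```

Wired into a pre-commit hook or CI, a check like this enforces the invariant on every change, whether or not the agent read the instruction.
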
Judgment

The human owns judgment

  • Decide what context matters
  • Name the behavior you want
  • Reject plausible nonsense
  • Protect design boundaries and domain language
Judgment

Slow down when verification is weak

  • Context spans multiple systems
  • Correctness depends on subtle behavior
  • The cost of failure is high
  • Tests do not cover the real risk

The agent does the typing. The human owns judgment.

Verification keeps both honest.

alexandersumer.com/blog/how-to-use-ai-coding-agents-effectively