How to Use AI Coding Agents Effectively

101

Model · Agent · Tools · Workspace · Checks

Know which layer you are changing before you blame the agent.

Models

Choose models by risk

Use stronger reasoning for ambiguous, unfamiliar, or expensive work
Use faster models for cheap, reversible, mechanical work
Switch when the task changes
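One way to make the rule above explicit is a small routing function. This is a minimal sketch: the Task fields and the tier names "strong-reasoning" and "fast" are hypothetical labels for illustration, not real model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    ambiguous: bool       # requirements or approach are unclear
    unfamiliar: bool      # code or domain you do not know well
    costly_to_undo: bool  # a mistake is expensive or hard to reverse


def pick_model_tier(task: Task) -> str:
    """Return a hypothetical tier label based on the risk of the task."""
    if task.ambiguous or task.unfamiliar or task.costly_to_undo:
        return "strong-reasoning"
    # Cheap, reversible, mechanical work can go to the faster tier.
    return "fast"


rename_variable = Task(ambiguous=False, unfamiliar=False, costly_to_undo=False)
schema_migration = Task(ambiguous=False, unfamiliar=True, costly_to_undo=True)
```

Re-run the decision whenever the task changes; a session that started as a rename can turn into a migration.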

Surfaces

Choose the right surface

IDE: direct edits with tight human steering
CLI: repo, shell, and Atlassian context
Chat: explanation without a real feedback loop

Context

Give the agent safe reach

MCP: tools and data without copy-paste
Skills and memory: instructions that survive the chat
AI Gateway: model access for internal AI features

Workspace

Run where checks are real

Local: fastest when setup already works
Volt: useful when environment setup is the hard part
Worktrees: isolate parallel attempts
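For the worktree option, a small helper can build the git command for each parallel attempt. This is a sketch under assumptions: the branch prefix "attempt/", the "../wt-" directory naming, and the default base branch "main" are hypothetical choices. Only `git worktree add -b <branch> <path> <commit-ish>` is the real git syntax.

```python
from typing import List


def worktree_add_command(attempt: str, base: str = "main") -> List[str]:
    """Build the argv for one isolated attempt (naming scheme is hypothetical)."""
    branch = f"attempt/{attempt}"  # new branch created for this attempt
    path = f"../wt-{attempt}"      # separate working directory beside the repo
    return ["git", "worktree", "add", "-b", branch, path, base]


commands = [worktree_add_command(name) for name in ("a", "b")]
```

Each command can be passed to `subprocess.run(cmd, check=True)` from inside the repository, after which each agent session works in its own directory and branch.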

Many people use coding agents for a bit, get frustrated, and conclude "they're not good."

That's like picking up a guitar, fumbling through a few chords, and concluding guitars don't work.

The people getting the most out of agents aren't using better tools.

They built the skill to use them well.

Plan · Execute · Review · Verify

Different jobs. Different contexts. Different artifacts.

Planning

Create a planning artifact

For non-trivial work, write the plan before code
Use your team's format: OpenSpec, proposal, design doc, task list
Capture goal, scope, sequence, checks, risks, rollback point
Make vague thinking visible before the agent edits anything

Planning

The plan is the handoff

A fresh session should know what to do next
Record decisions, scope, and open questions
Record what proves each step worked
Keep the artifact current as reality changes

Planning

Interrogate the plan

Walk every branch before execution
What assumptions are you making?
What alternatives did you reject?
What edge cases are missing?

Planning

Chunk from the plan

Do one reviewable chunk at a time
Ship behavior through a real path
Verify the chunk before starting the next one
Keep the plan current

Planning

Give concrete context

README, tests, nearby code
The problem reproduced locally
Sibling implementations
Good repo examples when a pattern matters

Planning

Encode repeatable patterns

Some work repeats across many targets
Tracking table: what is done
Skill: how to do one target
The table tracks state. The skill preserves judgment.
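The tracking table can be as simple as a list of records that any fresh session can read. A minimal sketch, assuming hypothetical target names and a two-state status field; the skill itself (how to do one target) lives elsewhere as written instructions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    name: str
    status: str = "todo"  # "todo" or "done"
    evidence: str = ""    # what proved this target worked


def next_target(table: List[Target]) -> Optional[Target]:
    """Return the first target that is not done, or None when finished."""
    return next((t for t in table if t.status != "done"), None)


def mark_done(table: List[Target], name: str, evidence: str) -> None:
    """Record completion together with the check that proved it."""
    for t in table:
        if t.name == name:
            t.status = "done"
            t.evidence = evidence
            return
    raise KeyError(name)


# Hypothetical targets for illustration only.
table = [Target("billing-service"), Target("search-service")]
mark_done(table, "billing-service", "unit tests passed")
```

Storing the evidence next to the status keeps the table honest: a row is only done when something proved it.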

Execution

Separate context by job

Keep the main session clean
Use subagents for narrow jobs
Review from fresh context
One concern per context

Execution

Do not optimize for token cost

A $0.50 prompt that gets it right is cheaper than five $0.05 prompts that each need fixing, once you count the time spent on the fixes.

Use strong models when mistakes are expensive.
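The arithmetic only works when human time is included. A worked sketch in integer cents, assuming a hypothetical engineer cost of $1.00 per minute and 5 minutes to fix each cheap attempt; both numbers are illustrative, not measured.

```python
def total_cost_cents(prompt_cents: int, prompts: int,
                     fix_minutes_each: int, engineer_cents_per_minute: int) -> int:
    """Token spend plus the human time spent fixing each result."""
    token_cost = prompt_cents * prompts
    fix_cost = prompts * fix_minutes_each * engineer_cents_per_minute
    return token_cost + fix_cost


# One $0.50 prompt that needs no fixing.
strong = total_cost_cents(50, 1, fix_minutes_each=0, engineer_cents_per_minute=100)

# Five $0.05 prompts, each needing an assumed 5 minutes of fixing.
cheap = total_cost_cents(5, 5, fix_minutes_each=5, engineer_cents_per_minute=100)

print(strong, cheap)  # 50 2525
```

On tokens alone the five cheap prompts cost $0.25, which is less than $0.50. The fixing time is what reverses the comparison.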

Execution

Do not fight polluted context

Compact before quality drops
Start fresh when the task changes
Bail when the agent repeats itself
Preserve clean context for judgment

Verification

Tell it how to verify

Do not just say "refactor this"
Ask for a check the agent can run
Make it run the check
Verification must be automated, deterministic, and runnable
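For a refactor, one check the agent can run is a comparison of old and new behavior on fixed inputs. A minimal sketch: `legacy_slugify` and `refactored_slugify` are hypothetical stand-ins for the code before and after the change, and the input list is an assumption about which cases matter.

```python
import re


def legacy_slugify(title: str) -> str:
    """Hypothetical original implementation, kept as the reference."""
    out = ""
    for ch in title.strip().lower():
        if ch.isalnum():
            out += ch
        elif out and out[-1] != "-":
            out += "-"
    return out.rstrip("-")


def refactored_slugify(title: str) -> str:
    """Hypothetical refactor that must preserve behavior on the cases below."""
    return re.sub(r"[^a-z0-9]+", "-", title.strip().lower()).strip("-")


# Fixed ASCII inputs keep the check deterministic.
CASES = ["Hello, World!", "  Plan  Execute ", "", "---", "A_b"]


def behavior_preserved() -> bool:
    return all(legacy_slugify(c) == refactored_slugify(c) for c in CASES)
```

The prompt can then say "refactor this, and `behavior_preserved()` must return True", which gives the agent something to run instead of something to claim.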

Review

Separate writing from review

Do not make the writing session the only reviewer
Start a clean review session
Give it the diff, relevant files, and conventions
Ask for file, line, contract, and failing path

Review

Split review by job

Correctness
Failure modes
Security
Tests
Design

One reviewer, one lens.

Review

Functional is not secure

AI code can pass tests and still create attack surface
Add security as a separate review lens
Scan every PR
Treat auth, data access, and input handling as high-risk

Verification

A test that has never failed is a test you cannot trust.

A smoke detector that has never been near smoke tells you nothing about whether it works.
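One way to put smoke near the detector is to run the test against a deliberately broken version and confirm it fails. A minimal sketch with a hypothetical `is_adult` function and a planted off-by-one bug.

```python
from typing import Callable


def is_adult(age: int) -> bool:
    return age >= 18


def is_adult_broken(age: int) -> bool:
    # Planted bug: the boundary is off by one.
    return age > 18


def boundary_check(fn: Callable[[int], bool]) -> bool:
    """The test under examination: does it pin down the boundary at 18?"""
    return fn(18) is True and fn(17) is False


passes_on_good = boundary_check(is_adult)
fails_on_broken = not boundary_check(is_adult_broken)
```

If the check also passed on the broken version, it would not be testing the behavior it claims to protect.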

Verification

Mechanize invariants

Instructions guide behavior
Checks constrain behavior
Lint, type checks, tests, security scans, branch protection
If a rule can be checked mechanically, put it in the dev loop
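A rule such as "no bare except clauses" can move from an instruction to a check. The sketch below uses Python's standard `ast` module; the rule itself is a hypothetical example of an invariant, not one the original names.

```python
import ast
from typing import List


def find_bare_excepts(source: str) -> List[int]:
    """Return the line numbers of `except:` clauses that name no exception type."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


sample = "try:\n    risky()\nexcept:\n    pass\n"
violations = find_bare_excepts(sample)
print(violations)  # [3]
```

Wired into a pre-commit hook or CI step, a non-empty result fails the build, so the agent cannot satisfy the rule by saying it did.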

Judgment

The human owns judgment

Decide what context matters
Name the behavior you want
Reject plausible nonsense
Protect design boundaries and domain language

Judgment

Slow down when verification is weak

Context spans multiple systems
Correctness depends on subtle behavior
The cost of failure is high
Tests do not cover the real risk

Read the full post

alexandersumer.com/blog/how-to-use-ai-coding-agents-effectively