How to Use AI Coding Agents Effectively

101
Model · Agent · Tools · Workspace · Checks

Know which layer you are changing before you blame the agent.

Models

Choose models by risk

  • Use stronger reasoning models for ambiguous, unfamiliar, or expensive work
  • Use faster models for cheap, reversible, mechanical work
  • Switch when the task changes
Surfaces

Choose the right surface

  • IDE: direct edits with tight human steering
  • CLI: repo, shell, and Atlassian context
  • Chat: explanation without a real feedback loop
Context

Give the agent safe reach

  • MCP: tools and data without copy-paste
  • Skills and memory: instructions that survive the chat
  • AI Gateway: model access for internal AI features
Workspace

Run where checks are real

  • Local: fastest when setup already works
  • Volt: useful when environment setup is the hard part
  • Worktrees: isolate parallel attempts
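Isolating parallel attempts can be sketched with `git worktree`. This is a minimal toy example using a throwaway repo; the branch and directory names are placeholders, and in practice you would run the `worktree` commands from your real project root.

```shell
set -e
# Throwaway repo so the sketch is self-contained.
repo=$(mktemp -d)/proj
git init -q "$repo" && cd "$repo"
git -c user.email=you@example.com -c user.name=you \
  commit -q --allow-empty -m "init"

# One branch and one directory per attempt; each gets its own checkout,
# so two agents can edit in parallel without clobbering each other.
git worktree add ../proj-attempt-a -b attempt-a
git worktree add ../proj-attempt-b -b attempt-b
git worktree list

# Discard the losing attempt without touching the winner.
git worktree remove ../proj-attempt-b
git branch -D attempt-b
```

Each worktree shares the same object store, so switching between attempts is cheap and the final merge is an ordinary branch merge.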
Many people use coding agents for a bit, get frustrated, and conclude "they're not good."

That's like picking up a guitar, fumbling through a few chords, and concluding guitars don't work.

The people getting the most out of agents aren't using better tools.

They've built the skill to use them well.

Plan · Execute · Review · Verify

Different jobs. Different contexts. Different artifacts.

Planning

Create a planning artifact

  • For non-trivial work, write the plan before code
  • Use your team's format: OpenSpec, proposal, design doc, task list
  • Capture goal, scope, sequence, checks, risks, rollback point
  • Make vague thinking visible before the agent edits anything
Planning

The plan is the handoff

  • A fresh session should know what to do next
  • Record decisions, scope, and open questions
  • Record what proves each step worked
  • Keep the artifact current as reality changes
Planning

Interrogate the plan

  • Walk every branch before execution
  • What assumptions are you making?
  • What alternatives did you reject?
  • What edge cases are missing?
Planning

Chunk from the plan

  • Do one reviewable chunk at a time
  • Ship behavior through a real path
  • Verify the chunk before starting the next one
  • Keep the plan current
Planning

Give concrete context

  • README, tests, nearby code
  • The problem reproduced locally
  • Sibling implementations
  • Good repo examples when a pattern matters
Planning

Encode repeatable patterns

  • Some work repeats across many targets
  • Tracking table: what is done
  • Skill: how to do one target
  • The table tracks state. The skill preserves judgment.
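One way to make the tracking table mechanical is a plain file the agent can query. A toy sketch, assuming nothing about your real format: the target names, the TSV layout, and the "todo"/"done" statuses here are all placeholders.

```shell
set -e
cd "$(mktemp -d)"

# Toy tracking table: one row per target, one status column.
printf 'target\tstatus\nservice-a\tdone\nservice-b\ttodo\n' > tracking.tsv

# "What is the next target?" becomes a query, not a judgment call.
awk -F'\t' '$2=="todo"{print $1; exit}' tracking.tsv
```

The skill document then only has to describe how to do one target; the table answers which target is next and which are finished.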
Execution

Separate context by job

  • Keep the main session clean
  • Use subagents for narrow jobs
  • Review from fresh context
  • One concern per context
Execution

Do not optimize for token cost

A $0.50 prompt that gets it right is cheaper than five $0.05 prompts that each need fixing.

Use strong models when mistakes are expensive.

Execution

Do not fight polluted context

  • Compact before quality drops
  • Start fresh when the task changes
  • Bail when the agent repeats itself
  • Preserve clean context for judgment
Verification

Tell it how to verify

  • Do not just say "refactor this"
  • Ask for a check the agent can run
  • Make it run the check
  • Verification must be automated, deterministic, and runnable
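A check the agent can run looks less like "make sure it's right" and more like a script with an exit code. A minimal sketch; the rule enforced here (no unresolved TODO markers in `src/`) is a stand-in for your project's real tests and lint.

```shell
set -e
cd "$(mktemp -d)"

# Deterministic, automated, runnable: passes or fails with an exit code,
# so the agent can run it in a loop until it is green.
check() {
  if grep -rn "TODO" src/ 2>/dev/null; then
    echo "check failed: unresolved TODOs"
    return 1
  fi
  echo "check passed"
}

# Toy fixture standing in for the refactored code.
mkdir -p src && printf 'x = 1\n' > src/app.py
check
```

"Refactor this, then run `check` and keep going until it passes" gives the agent a finish line; "refactor this" does not.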
Review

Separate writing from review

  • Do not make the writing session the only reviewer
  • Start a clean review session
  • Give it the diff, relevant files, and conventions
  • Ask for findings as file, line, the contract violated, and the failing path
Review

Split review by job

  • Correctness
  • Failure modes
  • Security
  • Tests
  • Design

One reviewer, one lens.

Review

Functional is not secure

  • AI code can pass tests and still create attack surface
  • Add security as a separate review lens
  • Scan every PR
  • Treat auth, data access, and input handling as high-risk
Verification

A test that has never failed is a test you cannot trust.

A smoke detector that has never been near smoke tells you nothing about whether it works.

Verification

Mechanize invariants

  • Instructions guide behavior
  • Checks constrain behavior
  • Lint, type checks, tests, security scans, branch protection
  • If a rule can be checked mechanically, put it in the dev loop
Judgment

The human owns judgment

  • Decide what context matters
  • Name the behavior you want
  • Reject plausible nonsense
  • Protect design boundaries and domain language
Judgment

Slow down when verification is weak

  • Context spans multiple systems
  • Correctness depends on subtle behavior
  • The cost of failure is high
  • Tests do not cover the real risk