- What Copilot "agent mode" actually is
- What it's great at (and where it still struggles)
- The safest agent mode workflow
- Copy/paste: A "Repo Instructions" file
- Example: Use agent mode to fix a real bug
- Model choice + cost considerations
There's a reason "agent mode" is the phrase you keep hearing: it's not a feature tweak—it's a shift in what the assistant is responsible for.
- Autocomplete makes suggestions
- Multi-file edits propose diffs
- Agents finish tasks
GitHub describes agent mode as something that can iterate on its own output, recognize errors, suggest commands, and keep going until the subtasks are complete—rather than stopping at the first response.
## What Copilot "agent mode" is (in plain terms)
Agent mode is best understood as a loop:
- Understand the goal
- Plan steps
- Edit code across files
- Validate by running / checking results
- Fix errors
- Repeat until done
- Hand you a reviewable diff
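The loop above can be pictured in code. This is a purely illustrative sketch, not Copilot's internals: `plan_fn`, `edit_fn`, `check_fn`, and `revise_fn` are hypothetical placeholders, and the "codebase" in the demo is just a counter.

```python
# Illustrative sketch of the agent-mode loop, not Copilot's implementation.
# plan_fn, edit_fn, check_fn, and revise_fn are hypothetical placeholders.

def agent_loop(goal, plan_fn, edit_fn, check_fn, revise_fn, max_iterations=5):
    """Plan, edit, validate, fix, and repeat until the checks pass."""
    plan = plan_fn(goal)                  # understand the goal, plan steps
    for _ in range(max_iterations):
        result = edit_fn(plan)            # edit code (here: mutate toy state)
        ok, error = check_fn(result)      # validate by running checks
        if ok:
            return result                 # hand back a reviewable result
        plan = revise_fn(plan, error)     # fix errors and repeat
    raise RuntimeError("goal not reached within the iteration budget")

# Toy demo: the "codebase" is a counter; the goal is to reach the value 3.
state = {"value": 0}

def edit(plan):
    state["value"] += 1                   # one small change per iteration
    return state["value"]

result = agent_loop(
    goal=3,
    plan_fn=lambda g: g,
    edit_fn=edit,
    check_fn=lambda v: (v == 3, None if v == 3 else "value too small"),
    revise_fn=lambda plan, error: plan,
)
```

The key property is the exit condition: the loop stops when validation passes, not after the first attempt.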
Unlike autocomplete, agent mode catches its own errors and iterates. It suggests terminal commands for you to run and keeps going until the task is actually complete.
## What it's great at (and where it still struggles)
| ✅ Great At | ⚠️ Still Risky At |
|---|---|
| Refactors across multiple files | Silent scope creep (touching unwanted files) |
| Migrations (old API → new API) | Security-sensitive edits (auth, crypto, payments) |
| Adding tests + iterating from failures | Big dependency upgrades without guardrails |
| Generating boring glue code | "Plausible" code that compiles but violates logic |
| Creating "first pass" implementation | — |
Let the agent do the work, but keep humans responsible for deciding what correct behavior is.
## Getting started: the safest agent mode workflow
### Step 1: Give the agent a contract (acceptance criteria)
Don't say: "Add OAuth."
Do say:
```
Add GitHub OAuth login.

Acceptance criteria:
- /login redirects to GitHub
- callback exchanges code for token
- user session persists for 7 days
- unit tests cover: happy path + invalid callback code
- no secrets committed; uses env vars

Constraints:
- Only edit files in /src and /tests
- Use existing session middleware
- Do not add new dependencies
```
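Acceptance criteria like these translate almost directly into tests. Here is a minimal sketch assuming nothing about the real app: `login_redirect_url` and `exchange_code` are hypothetical stand-ins for the handlers the agent would write, not functions from any real library.

```python
# Hypothetical stand-ins for the handlers the agent would implement;
# none of these names come from a real library or framework.
GITHUB_AUTHORIZE = "https://github.com/login/oauth/authorize"

def login_redirect_url(client_id):
    # acceptance criterion: /login redirects to GitHub
    return f"{GITHUB_AUTHORIZE}?client_id={client_id}"

def exchange_code(code, valid_codes):
    # acceptance criteria: callback exchanges the code for a token,
    # and an invalid callback code is rejected
    if code not in valid_codes:
        raise ValueError("invalid callback code")
    return {"access_token": f"token-for-{code}"}

# The criteria expressed as pytest-style tests:
def test_login_redirects_to_github():
    assert login_redirect_url("abc123").startswith(GITHUB_AUTHORIZE)

def test_invalid_callback_code_is_rejected():
    try:
        exchange_code("bogus", valid_codes={"good-code"})
        assert False, "expected ValueError for an invalid code"
    except ValueError:
        pass
```

Writing the criteria as tests first gives the agent something concrete to iterate against.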
### Step 2: Force a plan first
This is the best single trick for higher-quality agent work:
```
Before changing any code:
1) Summarize the current auth flow
2) Propose a step-by-step plan
3) List exact files you expect to touch

Wait for my confirmation.
```
### Step 3: Use a "working set" mindset
Even when a tool can see your whole repo, constrain the blast radius:
- "Only change these 5 files"
- "Do not touch formatting"
- "Do not change public APIs unless necessary"
## Copy/paste: a "Repo Instructions" file that makes agents behave
Create a file at `.github/copilot-instructions.md`:
```markdown
# Copilot instructions (project rules)

## General
- Prefer minimal diffs; avoid refactors unless requested
- Keep changes scoped to the files requested
- Do not add dependencies unless explicitly asked
- Always update/add tests for behavior changes
- Never commit secrets or .env files

## Coding style
- Use existing patterns in the repo
- Match formatting/lint rules
- Prefer early returns, no nested complexity

## When running commands
- Suggest commands, but do not assume they ran
- Prefer: npm test / pytest / cargo test based on repo
- If tests fail, propose the smallest fix
```
## Example: Use agent mode to fix a real bug
Here's a real-world example of how to prompt agent mode effectively:
```
Bug: navigation HTML is injected repeatedly into generated pages.

Task:
- Fix the generator so the navigation is inserted once
  at the {nav} placeholder only.
- Regenerate docs pages after the fix.
- Add a unit test that fails on repeated insertion.

Constraints:
- Only edit generate.py and tests under /tests
- No new dependencies

Definition of done:
- Unit tests pass
- Running generator outputs pages with exactly 1 nav section
```
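For a bug like this, the fix the agent should converge on is usually small. Here is a hedged sketch of what it could look like; the real `generate.py` will differ, and `NAV_HTML` is an illustrative stand-in.

```python
# Hypothetical sketch of the fix: insert the navigation at the {nav}
# placeholder exactly once, instead of appending it on every pass.
NAV_HTML = '<nav><a href="/">Home</a></nav>'

def insert_nav(template: str) -> str:
    """Replace the {nav} placeholder with the nav markup, at most once."""
    # count=1 caps the replacement at a single insertion; re-running the
    # generator on already-rendered output is a no-op, because the
    # placeholder is gone after the first pass
    return template.replace("{nav}", NAV_HTML, 1)

page = insert_nav("<body>{nav}<main>docs</main></body>")
```

The idempotency point is the one that matters: regenerating pages must not stack up extra nav sections.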
What "good agent behavior" looks like:
- It inspects where the replacement occurs
- It updates the replacement logic to target a placeholder token
- It adds a test: "nav count == 1"
- It suggests: `python -m pytest`
- It iterates if the test fails
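The "fails on repeated insertion" test can be tiny. An illustrative version, with inline stand-ins for the buggy and fixed generators (the real code lives in `generate.py`):

```python
# Hypothetical stand-ins for generate.py's behavior before and after the fix.
def buggy_generate(template, nav_html):
    # the bug: nav appended on every pass, ignoring the placeholder
    return template.replace("{nav}", "") + nav_html

def fixed_generate(template, nav_html):
    # the fix: placeholder replaced a single time
    return template.replace("{nav}", nav_html, 1)

# The test the agent adds: after two generation passes there must still
# be exactly one nav section in the output.
def test_nav_inserted_exactly_once(generate=fixed_generate):
    page = generate("<body>{nav}</body>", "<nav>n</nav>")
    page = generate(page, "<nav>n</nav>")   # simulate regenerating the docs
    assert page.count("<nav>") == 1
```

Run against `buggy_generate`, the test fails with two nav sections; against `fixed_generate`, it passes, which is exactly the signal the agent needs to iterate.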
## The underrated part: model choice + cost
GitHub's docs are clear: Copilot supports multiple models, and model choice affects quality, latency, and hallucinations. It also notes premium request multipliers that can change how fast you burn through usage.
| Task Type | Recommended Approach |
|---|---|
| Explain this file / Rename variables | Fast/cheap model |
| Debug failing tests | Stronger reasoning model |
| Agentic multi-file task | Only with clear acceptance criteria |
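If you build your own tooling around multiple models, that table can live in code as a routing policy. A hypothetical sketch; the tier names are made up for illustration, not real model identifiers.

```python
# Hypothetical model-routing policy; tier names are illustrative only.
ROUTING = {
    "explain": "fast-cheap",
    "rename": "fast-cheap",
    "debug": "strong-reasoning",
    "agentic": "strong-reasoning",
}

def pick_model(task_type: str, has_acceptance_criteria: bool = False) -> str:
    if task_type == "agentic" and not has_acceptance_criteria:
        # mirror the table: agentic work only with clear acceptance criteria
        raise ValueError("write acceptance criteria before an agentic run")
    return ROUTING.get(task_type, "fast-cheap")
```

Defaulting unknown tasks to the cheap tier keeps premium-request usage under control.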
## Where this is going: multi-agent development
VS Code's documentation on multi-agent orchestration describes:
- Using Copilot + custom agents together
- Background agents in isolated workspaces
- Custom agents defined in `.github/agents`
- Integrated agent sessions in the Chat view
That hints at the next default workflow:
- One agent writes code
- Another runs tests
- Another reviews diffs for security pitfalls
- You approve and merge
## Key takeaways

- Agent mode is a loop: plan → edit → validate → fix → repeat
- Always provide acceptance criteria and constraints
- Force a plan before any code changes
- Use a repo instructions file for consistent behavior
- Choose models based on task complexity vs cost