- What Copilot "agent mode" actually is
- What it's great at (and where it still struggles)
- The safest agent mode workflow
- Copy/paste: A "Repo Instructions" file
- Example: Use agent mode to fix a real bug
- Model choice + cost considerations
There's a reason "agent mode" is the phrase you keep hearing: it's not a feature tweak—it's a shift in what the assistant is responsible for.
- Autocomplete makes suggestions
- Multi-file edits propose diffs
- Agents finish tasks
GitHub describes agent mode as something that can iterate on its own output, recognize errors, suggest commands, and keep going until the subtasks are complete—rather than stopping at the first response.
## What Copilot "agent mode" is (in plain terms)
Agent mode is best understood as a loop:
- Understand the goal
- Plan steps
- Edit code across files
- Validate by running / checking results
- Fix errors
- Repeat until done
- Hand you a reviewable diff
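The loop above can be pictured in code. This is a purely illustrative sketch, not Copilot's internals: `plan_fn`, `edit_fn`, `check_fn`, and `revise_fn` are hypothetical placeholders, and the "codebase" in the demo is just a counter.

```python
# Illustrative sketch of the agent-mode loop, not Copilot's implementation.
# plan_fn, edit_fn, check_fn, and revise_fn are hypothetical placeholders.

def agent_loop(goal, plan_fn, edit_fn, check_fn, revise_fn, max_iterations=5):
    """Plan, edit, validate, fix, and repeat until the checks pass."""
    plan = plan_fn(goal)                  # understand the goal, plan steps
    for _ in range(max_iterations):
        result = edit_fn(plan)            # edit code (here: mutate toy state)
        ok, error = check_fn(result)      # validate by running checks
        if ok:
            return result                 # hand back a reviewable result
        plan = revise_fn(plan, error)     # fix errors and repeat
    raise RuntimeError("goal not reached within the iteration budget")

# Toy demo: the "codebase" is a counter; the goal is to reach the value 3.
state = {"value": 0}

def edit(plan):
    state["value"] += 1                   # one small change per iteration
    return state["value"]

result = agent_loop(
    goal=3,
    plan_fn=lambda g: g,
    edit_fn=edit,
    check_fn=lambda v: (v == 3, None if v == 3 else "value too small"),
    revise_fn=lambda plan, error: plan,
)
```

The key property is the exit condition: the loop stops when validation passes, not after the first attempt.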
Unlike autocomplete, agent mode catches its own errors and iterates. It suggests terminal commands for you to run and keeps going until the task is actually complete.
## What it's great at (and where it still struggles)
| ✅ Great At | ⚠️ Still Risky At |
|---|---|
| Refactors across multiple files | Silent scope creep (touching unwanted files) |
| Migrations (old API → new API) | Security-sensitive edits (auth, crypto, payments) |
| Adding tests + iterating from failures | Big dependency upgrades without guardrails |
| Generating boring glue code | "Plausible" code that compiles but violates logic |
| Creating "first pass" implementation | — |
Let the agent do the work, but keep humans responsible for deciding what correct behavior is.
## Getting started: the safest agent mode workflow
### Step 1: Give the agent a contract (acceptance criteria)
Don't say: "Add OAuth."
Do say:
```
Add GitHub OAuth login.

Acceptance criteria:
- /login redirects to GitHub
- callback exchanges code for token
- user session persists for 7 days
- unit tests cover: happy path + invalid callback code
- no secrets committed; uses env vars

Constraints:
- Only edit files in /src and /tests
- Use existing session middleware
- Do not add new dependencies
```
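Acceptance criteria like these translate almost directly into tests. Here is a minimal sketch assuming nothing about the real app: `login_redirect_url` and `exchange_code` are hypothetical stand-ins for the handlers the agent would write, not functions from any real library.

```python
# Hypothetical stand-ins for the handlers the agent would implement;
# none of these names come from a real library or framework.
GITHUB_AUTHORIZE = "https://github.com/login/oauth/authorize"

def login_redirect_url(client_id):
    # acceptance criterion: /login redirects to GitHub
    return f"{GITHUB_AUTHORIZE}?client_id={client_id}"

def exchange_code(code, valid_codes):
    # acceptance criteria: callback exchanges the code for a token,
    # and an invalid callback code is rejected
    if code not in valid_codes:
        raise ValueError("invalid callback code")
    return {"access_token": f"token-for-{code}"}

# The criteria expressed as pytest-style tests:
def test_login_redirects_to_github():
    assert login_redirect_url("abc123").startswith(GITHUB_AUTHORIZE)

def test_invalid_callback_code_is_rejected():
    try:
        exchange_code("bogus", valid_codes={"good-code"})
        assert False, "expected ValueError for an invalid code"
    except ValueError:
        pass
```

Writing the criteria as tests first gives the agent something concrete to iterate against.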
### Step 2: Force a plan first
This is the best single trick for higher-quality agent work:
```
Before changing any code:
1) Summarize the current auth flow
2) Propose a step-by-step plan
3) List exact files you expect to touch

Wait for my confirmation.
```
### Step 3: Use a "working set" mindset
Even when a tool can see your whole repo, constrain the blast radius:
- "Only change these 5 files"
- "Do not touch formatting"
- "Do not change public APIs unless necessary"
## Copy/paste: a "Repo Instructions" file that makes agents behave
Create a file at `.github/copilot-instructions.md`:
```markdown
# Copilot instructions (project rules)

## General
- Prefer minimal diffs; avoid refactors unless requested
- Keep changes scoped to the files requested
- Do not add dependencies unless explicitly asked
- Always update/add tests for behavior changes
- Never commit secrets or .env files

## Coding style
- Use existing patterns in the repo
- Match formatting/lint rules
- Prefer early returns, no nested complexity

## When running commands
- Suggest commands, but do not assume they ran
- Prefer: npm test / pytest / cargo test based on repo
- If tests fail, propose the smallest fix
```
## Example: Use agent mode to fix a real bug
Here's a real-world example of how to prompt agent mode effectively:
```
Bug: navigation HTML is injected repeatedly into generated pages.

Task:
- Fix the generator so the navigation is inserted once
  at the {nav} placeholder only.
- Regenerate docs pages after the fix.
- Add a unit test that fails on repeated insertion.

Constraints:
- Only edit generate.py and tests under /tests
- No new dependencies

Definition of done:
- Unit tests pass
- Running generator outputs pages with exactly 1 nav section
```
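For a bug like this, the fix the agent should converge on is usually small. Here is a hedged sketch of what it could look like; the real `generate.py` will differ, and `NAV_HTML` is an illustrative stand-in.

```python
# Hypothetical sketch of the fix: insert the navigation at the {nav}
# placeholder exactly once, instead of appending it on every pass.
NAV_HTML = '<nav><a href="/">Home</a></nav>'

def insert_nav(template: str) -> str:
    """Replace the {nav} placeholder with the nav markup, at most once."""
    # count=1 caps the replacement at a single insertion; re-running the
    # generator on already-rendered output is a no-op, because the
    # placeholder is gone after the first pass
    return template.replace("{nav}", NAV_HTML, 1)

page = insert_nav("<body>{nav}<main>docs</main></body>")
```

The idempotency point is the one that matters: regenerating pages must not stack up extra nav sections.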
What "good agent behavior" looks like:
- It inspects where the replacement occurs
- It updates the replacement logic to target a placeholder token
- It adds a test: "nav count == 1"
- It suggests: `python -m pytest`
- It iterates if the test fails
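The "fails on repeated insertion" test can be tiny. An illustrative version, with inline stand-ins for the buggy and fixed generators (the real code lives in `generate.py`):

```python
# Hypothetical stand-ins for generate.py's behavior before and after the fix.
def buggy_generate(template, nav_html):
    # the bug: nav appended on every pass, ignoring the placeholder
    return template.replace("{nav}", "") + nav_html

def fixed_generate(template, nav_html):
    # the fix: placeholder replaced a single time
    return template.replace("{nav}", nav_html, 1)

# The test the agent adds: after two generation passes there must still
# be exactly one nav section in the output.
def test_nav_inserted_exactly_once(generate=fixed_generate):
    page = generate("<body>{nav}</body>", "<nav>n</nav>")
    page = generate(page, "<nav>n</nav>")   # simulate regenerating the docs
    assert page.count("<nav>") == 1
```

Run against `buggy_generate`, the test fails with two nav sections; against `fixed_generate`, it passes, which is exactly the signal the agent needs to iterate.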
## The underrated part: model choice + cost
GitHub's docs are clear: Copilot supports multiple models, and model choice affects quality, latency, and hallucinations. It also notes premium request multipliers that can change how fast you burn through usage.
| Task Type | Recommended Approach |
|---|---|
| Explain this file / Rename variables | Fast/cheap model |
| Debug failing tests | Stronger reasoning model |
| Agentic multi-file task | Only with clear acceptance criteria |
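If you build your own tooling around multiple models, that table can live in code as a routing policy. A hypothetical sketch; the tier names are made up for illustration, not real model identifiers.

```python
# Hypothetical model-routing policy; tier names are illustrative only.
ROUTING = {
    "explain": "fast-cheap",
    "rename": "fast-cheap",
    "debug": "strong-reasoning",
    "agentic": "strong-reasoning",
}

def pick_model(task_type: str, has_acceptance_criteria: bool = False) -> str:
    if task_type == "agentic" and not has_acceptance_criteria:
        # mirror the table: agentic work only with clear acceptance criteria
        raise ValueError("write acceptance criteria before an agentic run")
    return ROUTING.get(task_type, "fast-cheap")
```

Defaulting unknown tasks to the cheap tier keeps premium-request usage under control.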
## Where this is going: multi-agent development
VS Code's documentation on multi-agent orchestration describes:
- Using Copilot + custom agents together
- Background agents in isolated workspaces
- Custom agents defined in `.github/agents`
- Integrated agent sessions in the Chat view
That hints at the next default workflow:
- One agent writes code
- Another runs tests
- Another reviews diffs for security pitfalls
- You approve and merge
## Key takeaways

- Agent mode is a loop: plan → edit → validate → fix → repeat
- Always provide acceptance criteria and constraints
- Force a plan before any code changes
- Use a repo instructions file for consistent behavior
- Choose models based on task complexity vs cost