Working with coding agents
What I have observed working well with AI coding agents during 2025 and practical tips to get the most out of them
Topics
- Prompting
- Context
- Self-verification
- Models (and selection)
- Background Agents
- Parallel Agents
Prompting
Everything begins with the prompt. This is the primary way you shape how an AI agent behaves. Prompting is how you communicate intent, share your mental model, and provide the state of the world as you see it. A good prompt does not just ask for an outcome, it frames the problem clearly and gives the agent enough signal to proceed in the right direction.
When writing prompts, consider the environment the agent is operating in. Does it have access to tools? Can it perform actions to gather additional information? If not, you are responsible for frontloading that context. For example, if an agent cannot run tests on its own, you will need to manually execute them and provide the results. But if the agent does have terminal access, it can handle the entire loop by itself, including running and iterating on test cases.
Take the function `add(a, b)` as a simple case. If tool access is unavailable, your prompt might be:
Write a test for a function `add(a, b)` that sums two positive numbers.
You would then manually run it and share the results. If tool access is available, you can say:
Write a test for `add(a, b)` that sums two positive numbers and run the test. Iterate until it passes.
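In either case, the agent's output is an ordinary test file. Here is a minimal sketch of what it might produce, assuming a TypeScript project using Jest and a hypothetical `math.ts` module that exports `add`:

```typescript
// math.test.ts: a hypothetical test the agent might produce
import { add } from "./math";

describe("add", () => {
  it("sums two positive numbers", () => {
    expect(add(2, 3)).toBe(5);
  });
});
```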
This small change in environment leads to a significant change in how you prompt and how much work you offload to the agent.
This principle applies to many environments, such as access to CI logs or browser runtimes. The more context the agent can independently gather, the more useful and capable it becomes.
Context
Every new agent session begins with minimal memory. Think of it as onboarding a new engineer to your team. If you want them to deliver high-quality work, you need to provide the right background. When you give an agent clear instructions, workflows, and edge cases, it behaves more autonomously and makes fewer mistakes.
Providing proper context reduces back-and-forth interactions and avoids unnecessary errors. For example, if your team always runs a linter or compiles code after changes, include that expectation in your prompt. This allows the agent to verify its own work and prevents small issues from slipping through.
Compare these two approaches:
Without context:
Add a new API endpoint for user authentication.
With context:
Add a new API endpoint for user authentication at `/api/auth/login`. Follow our existing pattern in `auth.controller.ts` where we use JWT tokens and return a 401 for invalid credentials. After implementation, run `npm run lint` and `npm test` to verify. All endpoints should have error handling middleware.
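For a rough idea of what the second prompt might yield, here is a hedged sketch: the route, JWT usage, and 401 behavior come from the prompt, while the `jsonwebtoken` call and the `verifyCredentials` helper are assumptions for illustration.

```typescript
// Hypothetical login handler following the pattern described in the prompt
import { Router, Request, Response, NextFunction } from "express";
import jwt from "jsonwebtoken";
import { verifyCredentials } from "./auth.service"; // assumed helper, not from the post

const router = Router();

router.post("/api/auth/login", async (req: Request, res: Response, next: NextFunction) => {
  try {
    const { email, password } = req.body;
    const user = await verifyCredentials(email, password);
    if (!user) {
      // 401 for invalid credentials, as the prompt specifies
      return res.status(401).json({ error: "Invalid credentials" });
    }
    const token = jwt.sign({ sub: user.id }, process.env.JWT_SECRET!, { expiresIn: "1h" });
    return res.json({ token });
  } catch (err) {
    next(err); // defer to the error-handling middleware the prompt requires
  }
});

export default router;
```

The point is not the code itself but that the prompt contains enough detail for the agent to land on something close to this on the first pass.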
Your prompts should vary based on the size of the task. For small tasks, a brief prompt such as “verify tests and fix failures” may be sufficient, especially if the surrounding code gives the agent enough to reason about. For larger efforts, more detailed prompts are necessary. These often take the form of structured plans that outline goals, steps, and constraints. Creating and reviewing such a plan with the agent can help establish a clear path before execution, allowing the agent to work more independently once the scope is aligned.
Self-verification
Allowing agents to verify their own work is one of the most effective ways to improve quality and reduce manual oversight. The simplest form of this is through unit tests. But it can also include compiling code, running integration tests in the browser, or checking lint rules. If the agent has access to these tools, you should instruct it to use them regularly.
Here’s an example prompt that includes verification steps:
Refactor the `UserService` class to use dependency injection. After making changes:
1. Run `npm run build` to ensure it compiles
2. Run `npm test` and fix any failing tests
3. Run `npm run lint:fix` to catch style issues
4. If any step fails, iterate until all checks pass
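To picture the refactor itself, here is a minimal sketch of constructor-based dependency injection for a hypothetical `UserService`; the repository interface and method names are illustrative, not taken from a real codebase.

```typescript
// Before: the service constructs its own dependency, which is hard to test
// class UserService {
//   private repo = new UserRepository();
// }

// After: the dependency is injected through the constructor
interface User {
  id: string;
  email: string;
}

interface UserRepository {
  findById(id: string): Promise<User | null>;
}

class UserService {
  constructor(private readonly repo: UserRepository) {}

  async getUser(id: string): Promise<User> {
    const user = await this.repo.findById(id);
    if (!user) {
      throw new Error(`User ${id} not found`);
    }
    return user;
  }
}
```

A test can now pass a stub repository instead of reaching for a real database, which is exactly what makes the verification steps above cheap for the agent to run repeatedly.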
It is also valuable to formalize the techniques that work well. Define rules that capture recurring expectations or common edge cases. These act as behavioral guidelines the agent can consistently follow. You can also define commands that automate routine flows. For example, a command like `/add-new-service` can set up boilerplate and provide instructions for adding a new microservice. By turning proven strategies into reusable prompts, you create a system that gets smarter and faster over time.
Models (and how to select)
Choosing the right model plays a big role in shaping the development experience. There are generally two categories to consider: one is slower but more intelligent, the other is faster but slightly less capable. I used to favor the more intelligent models, but I have found more satisfaction in faster iteration cycles.
This insight is closely related to something I wrote about in my post on avoiding the copilot pause. The moment you lose momentum because a model is taking too long, your attention takes a hit. This is why Tab feels so powerful. It keeps you in flow by offering rapid, context-aware completions without the wait.
If you do use a slower model, the key is to let it run in the background with detailed instructions. You need to give it a solid plan, clear validation steps, and enough autonomy so it can make progress without constant input. Once the agent has enough information, it can work uninterrupted for long stretches and return with higher-quality results.
Background Agents
Once you master detailed prompting, you can begin treating agents as background workers. This is where things become truly scalable. You provide the objective, describe the environment, explain how the agent should verify its work, and let it execute on its own.
Here’s an example of a prompt designed for autonomous background execution:
Implement pagination for the products API endpoint.
**Context:** We use cursor-based pagination across all APIs. See `orders.controller.ts` for reference implementation.
**Requirements:**
- Add `limit` (default 20, max 100) and `cursor` query parameters
- Return `nextCursor` in response when more results exist
- Update the existing `/api/products` endpoint
- Maintain backward compatibility for clients not using pagination
**Verification:**
1. Write integration tests covering edge cases (empty results, last page, invalid cursor)
2. Run full test suite and ensure all pass
3. Test manually using curl commands with various parameters
4. Update API documentation in `docs/api.md`
Work autonomously and only notify me when complete or if you encounter blockers.
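For illustration, here is a hedged sketch of what the handler might end up looking like, assuming Express and a hypothetical `productsRepo` data-access helper; the `limit`, `cursor`, and `nextCursor` names come from the prompt, everything else is an assumption.

```typescript
// Hypothetical Express handler for GET /api/products with cursor-based pagination
import { Request, Response, NextFunction } from "express";
import { productsRepo } from "./products.repo"; // assumed data-access helper

export async function listProducts(req: Request, res: Response, next: NextFunction) {
  try {
    // limit: default 20, capped at 100, per the prompt's requirements
    const limit = Math.min(Number(req.query.limit) || 20, 100);
    const cursor = typeof req.query.cursor === "string" ? req.query.cursor : undefined;

    // Fetch one extra row to know whether another page exists
    const rows = await productsRepo.findAfter(cursor, limit + 1);
    const hasMore = rows.length > limit;
    const items = hasMore ? rows.slice(0, limit) : rows;

    res.json({
      items,
      // Only include nextCursor when more results exist, per the prompt
      nextCursor: hasMore ? items[items.length - 1].id : undefined,
    });
  } catch (err) {
    next(err);
  }
}
```

Because the prompt spells out defaults, caps, and backward compatibility, the agent can write and verify something like this without checking in.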
This frees up your time to focus on higher-level decisions while the agent completes a scoped, well-defined task in parallel. The better your planning, the less hand-holding the agent requires. With strong prompts and scoped autonomy, the agent becomes a reliable part of your workflow instead of a tool that needs constant attention.
Parallel Agents
Running multiple agents at the same time is a powerful way to accelerate your output. However, it introduces a new challenge: context switching. You now have to track several threads of work in your head, just as you would with human teammates.
My personal sweet spot is running between one and three agents in parallel. This gives me a strong boost in throughput while still allowing me to keep track of what each one is doing. To make this work, each agent needs to be scoped to a clearly defined, non-overlapping task. If responsibilities are unclear or shared across agents, you will run into the same coordination issues you would with a team of engineers.
Here’s an example of how you might scope work across three parallel agents:
Agent 1:
Add rate limiting middleware to all API endpoints. Use `express-rate-limit` with Redis. Apply 100 requests per 15 minutes for authenticated users, 20 for anonymous. Test and document.
Agent 2:
Implement email notifications for password resets. Create email templates in `templates/email/`, integrate with SendGrid, add tests. Don't modify auth endpoints, just create the notification service.
Agent 3:
Update the frontend dashboard to show real-time user activity. Use websockets (we have socket.io set up). Only touch `dashboard.tsx` and create new hook `useActivityFeed.ts`. Backend endpoint already exists at `/api/activity/stream`.
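As a rough sketch of what Agent 1's middleware might look like: the tiers come from the prompt, while the `req.user` check is an assumption, and the default in-memory store stands in for the Redis-backed store the prompt asks for.

```typescript
// Hypothetical sketch of Agent 1's rate limiting middleware.
// The prompt asks for a Redis-backed store (e.g. via the rate-limit-redis package);
// the default in-memory store is used here to keep the sketch self-contained.
import rateLimit from "express-rate-limit";
import { Request, Response, NextFunction } from "express";

const authenticatedLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,                 // 100 requests per window for authenticated users
});

const anonymousLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 20,                  // 20 requests per window for anonymous users
});

// Pick a limiter based on whether an upstream auth middleware attached a user.
export function apiRateLimiter(req: Request, res: Response, next: NextFunction) {
  const limiter = (req as any).user ? authenticatedLimiter : anonymousLimiter;
  return limiter(req, res, next);
}
```

Details like the keying strategy and the Redis store are exactly the kind of thing the agent still needs to pin down, but the scope is narrow enough that it will not collide with the other two agents.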
Isolating work streams allows agents to operate independently and prevents conflicts. With good scoping, you get the benefits of parallelism without the overhead of micromanagement.
This can be done both remotely and locally!
Final thoughts
The real power comes from understanding what’s being abstracted away. When you grasp the underlying principles of how prompting shapes behavior, why context matters, and what verification actually accomplishes, you can approach any AI tool from first principles.
Let me know what you think!