User Guide
Installation and setup
The first thing to do when you open Runner is add a project. A project is defined by its absolute path, which must point to a valid git repository on your local machine.
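If you're not sure of a repo's absolute path, git can print it from anywhere inside the repo:

```
git rev-parse --show-toplevel   # prints the absolute path of the repo root
```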
Once you've created your first project, you'll need to set your API keys. While Runner is in beta, you'll need to provide your own keys for the LLM provider(s) you want to use. Your keys are stored locally and are never sent to Runner's servers. At a minimum, you'll need a Gemini API key: Gemini 2.5 Flash is used for the context-gathering sub-agent, and Flash-Lite is used for codebase indexing.
GPT-5 is the default and recommended model for both of the main agents, Planning and Coding; you'll need an OpenAI API key to use it. Claude 4 Sonnet is also a good option, but roughly twice as expensive as GPT-5. Gemini 2.5 Pro works well for the planning agent, but its tool calls are unreliable enough to be frustrating, and that same unreliability makes it a poor choice for the coding agent. You can select the models for the Planning and Coding agents in the Settings modal, in the same place you set your API keys.
If you don't already have a Gemini API key, you can get one here. If you use a Google account that you've never used with GCP, you'll get $300 in free credits to use over your first three months.
Repo context config
After adding a new project, you'll want to begin by hiding files and directories the agent doesn't need to see or edit. You can do this by clicking the Settings icon at the bottom of the left sidebar and then clicking on the Repo Context tab.
Some things are hidden by default, like node_modules. This keeps the context clean for the agent and reduces unnecessary token usage.
The maximum repo size currently supported by Runner (excluding hidden files) is 2M tokens, which is roughly 200k lines of code (about 10 tokens per line). If your repo is larger than this, you may still be able to make it work by hiding non-critical parts of the codebase. For example, if you work on a large monorepo but only do backend work, you could hide the frontend directory.
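If you're unsure whether your repo fits, a rough character count can give you a ballpark. This is just a sketch, assuming the common ~4 characters per token rule of thumb (Runner's actual tokenizer may count differently):

```
import os

SKIP_DIRS = {".git", "node_modules", "dist", "build"}  # adjust to match what you hide

total_chars = 0
for root, dirs, files in os.walk("."):
    dirs[:] = [d for d in dirs if d not in SKIP_DIRS]  # prune hidden directories
    for name in files:
        try:
            with open(os.path.join(root, name), encoding="utf-8") as f:
                total_chars += len(f.read())
        except (UnicodeDecodeError, OSError):
            continue  # skip binary and unreadable files

print(f"~{total_chars / 4 / 1_000_000:.2f}M tokens")
```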
Where to start
The planning agent is the main point of interaction between you and the AI. The planning agent can answer questions, make suggestions, create tasks, and edit project documentation.
To get a feel for what Runner can do, try asking the planning agent some questions about how your codebase works.
Next, you can try having it spec out a task for you. Pick a tricky bug you've been stuck on and see if the planning agent can help you figure it out.
Once you have a task specced out, click the three dots in the task header and then click Start Task. This sends it to the coding agent for implementation. Before doing this, make sure you have a clean working directory: the best defense against an agent going off the rails is being able to discard the changes and start over.
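If you prefer the terminal, the standard git commands for checking and resetting state are:

```
git status        # confirm the working tree is clean before starting a task
git restore .     # discard uncommitted changes to tracked files
git clean -fd     # remove untracked files and directories (use with care)
```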
Project documentation
Project documentation files are markdown files that can be created and edited by both you and the agent. These files live in the planning agent's context window at all times. A few suggested doc templates are created for each new project, but you can create and delete as many of these as you want.
There are two primary use cases for these files: documenting software architecture and design patterns for the agent, and serving as a scratchpad for planning out large features.
Spec-driven development
Runner is built around the idea of spec-driven development. This is an emerging workflow that is proving to be extremely effective when working with AI agents. The core idea is to plan out code changes collaboratively with the agent, in detail, prior to editing any code. Once you're happy with the plan, you let the coding agent implement the changes according to the spec, then you test the result to verify it all works. With proper planning like this, the coding agent can usually one-shot the change.
This is a more deliberate and controlled way of programming with AI than just directly prompting for code changes. It adds a little more time upfront, but dramatically reduces the likelihood of getting stuck in debugging spirals later.
Planning tasks
When planning out new tasks, it's best to start by clearing the conversation history, unless the task is very closely related to a previous task. Share as much detail as possible with the agent before asking it to create a task. If you already have an idea for how to implement it, be sure to include that.
Try to keep tasks manageable in size. Tasks that require ~200 lines of code changes or fewer tend to work best. For large refactors or features, plan them out with the planning agent in the chat first, then ask it to break the work up into multiple smaller tasks.
The agent can create tasks itself via its create_task tool, but you can also create one manually. This is helpful when you already have a good idea of what needs to be done: add as much detail as you can to the task spec, then ask the agent to finish it.
Reviewing code
Once the coding agent completes a task, you'll want to review the changes. A good first step is to ask the planning agent to review them and verify they match the spec. They usually do, but occasionally the planning agent will catch something the coding agent missed, especially on larger tasks. The task detail view has a toggle between the task spec and the diff viewer, so you can review diffs right there.
If you want to modify something, you can just directly ask the coding agent to do so.
Once you're happy with the changes, click the three dots in the task header bar and then click Approve Changes. This will automatically stage and commit the changes, using the task title as the commit message.
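This is equivalent to running the following yourself (shown here with a hypothetical task title as the message):

```
git add -A
git commit -m "Fix session cache race condition"   # hypothetical task title
```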
Managing context
Our general philosophy is that more context leads to better results. For that reason, we never provide partial files to the agent; partial file context is the cause of many failures in other agentic coding tools like Claude Code and Cursor. We use complete files, and at times entire directories.
In addition to the full content of all relevant files, we also provide the agent with complete interfaces (imports, function signatures, etc.) for the entire codebase. This gives the agent a bird's eye view of the codebase, which lets it see how everything is connected and gives it a ton of information about architectural and design patterns.
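Runner's indexer is internal, but as a loose illustration of what an interface summary like this looks like, here's a minimal Python sketch that pulls imports, top-level function signatures, and class names out of a single file using the standard ast module:

```
import ast

def interface_summary(path: str) -> list[str]:
    """Return imports plus top-level function/class signatures from one file."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read())
    lines = []
    for node in tree.body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            lines.append(ast.unparse(node))           # keep imports verbatim
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            lines.append(f"def {node.name}({ast.unparse(node.args)}): ...")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}: ...")   # body elided
    return lines

print("\n".join(interface_summary("app.py")))  # "app.py" is a placeholder path
```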
All project docs and non-archived tasks are provided to the agent with every request. The agent is also given the current UI context, so if you have a specific code file or task open, the agent will be aware of that.
Runner manages context automatically, but you can override this and manually add or delete files and directories from the agent's context by typing "@" in the chat input. Context resets when the conversation is cleared.
If your full codebase fits into the context window of the model you're using, you can put the entire thing into context by typing "@/" and then hitting Enter. While this generally isn't necessary, for certain tasks there's no substitute. Use it sparingly (because it's expensive), but don't be afraid to use it when you need it.
Expected costs
The planning agent will contribute the most to cost because it uses the most tokens (due to the extensive codebase context we give it). You can view your actual costs on the Usage tab of the settings modal to track your spending.
Creating and executing a task should cost around $1 on average with the default models (GPT-5 for the main agents, Gemini for supporting tasks). If you're using Runner consistently for an entire workday, expect $10-15/day in spend, i.e., roughly 10-15 completed tasks.
When to use Runner (and when not to)
Runner is designed for working on production codebases where you need to stay in control and keep a close eye on code quality while still moving fast. It's very good at identifying and following existing architectural patterns, and it shines on challenging debugging tasks thanks to the large amount of codebase context the agent has. That said, it's not the right choice for every coding task.
For quick changes that don't require a lot of codebase context, Claude Code tends to be faster. When you're working on a key file that you know like the back of your hand, manually editing (with the help of autocomplete via Cursor or similar) will give you more control than an agentic solution like Runner.