OpenClaw With Copilot CLI Best Practices

OpenClaw can run external coding harnesses through ACP Agents, and GitHub Copilot CLI has evolved from a terminal helper into a full agentic coding CLI with planning, delegation, review, autopilot, and subagents.

The reliable setup is not “turn everything on.” It is to separate integration mode, permissions, session boundaries, and authentication before you delegate real work.

Provider mode and ACP mode are not the same

There are at least two very different ways to involve GitHub Copilot in OpenClaw:

  1. Model provider mode: OpenClaw uses GitHub Copilot directly as a model provider through openclaw models auth login-github-copilot.
  2. ACP harness mode: OpenClaw starts an external Copilot CLI session through ACP, and Copilot CLI itself reads code, edits files, and runs commands.

This article is about the second path: OpenClaw orchestrating Copilot CLI through ACP.

The official docs are clear:

  • OpenClaw ACP Agents can run external coding harnesses, including Copilot.
  • The acpx project currently includes a built-in copilot alias that maps to copilot --acp --stdio.

In this architecture, OpenClaw is the orchestrator and Copilot CLI is the execution layer.

Baseline first: make sure ACP is healthy

Before delegating anything, walk through the OpenClaw ACP flow in this order:

  1. confirm ACP is enabled
  2. run /acp doctor
  3. spawn a specific harness
  4. choose whether to bind it to the current chat or a dedicated thread

Typical commands:

/acp doctor
/acp spawn copilot --bind here

Or:

/acp spawn copilot --mode persistent --thread auto

For remote or longer-running work, --thread auto is usually the better default.

Authenticate Copilot CLI itself

If you use the ACP harness path, the thing that must be authenticated is Copilot CLI itself, not only OpenClaw’s GitHub Copilot provider integration.

The current GitHub command reference documents two common paths:

  • run copilot login
  • provide a token via COPILOT_GITHUB_TOKEN, GH_TOKEN, or GITHUB_TOKEN

GitHub also documents several important constraints:

  • fine-grained PATs are supported if they include Copilot Requests
  • Copilot CLI app OAuth tokens are supported
  • gh app OAuth tokens are supported
  • classic ghp_ PATs are not supported

For remote or headless OpenClaw deployments, environment-based authentication is often more reliable than interactive device login.
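
For headless hosts, a small preflight check can catch an unusable token before a session is spawned. The sketch below is illustrative only: the function name pick_copilot_token is hypothetical, the lookup order of the three variables is an assumption, and the only validation it performs is the ghp_ prefix rule listed above.

```python
# Variable names come from the list above; checking them in this order is an assumption.
TOKEN_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def pick_copilot_token(env):
    """Return (variable_name, token) for the first non-empty token, or None.

    Raises ValueError when that token is a classic PAT (ghp_ prefix),
    which is listed above as unsupported.
    """
    for name in TOKEN_VARS:
        token = env.get(name, "").strip()
        if not token:
            continue  # variable unset or blank, try the next one
        if token.startswith("ghp_"):
            raise ValueError(f"{name} holds a classic PAT (ghp_), which is not supported")
        return name, token
    return None  # nothing set: fall back to interactive `copilot login`
```

Calling pick_copilot_token(os.environ) in a deployment script, after importing os, gives an early and readable failure instead of an authentication error in the middle of a delegated task.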

OpenClaw should orchestrate, Copilot CLI should execute

A practical split:

  • OpenClaw: chat entry point, thread routing, remote follow-up, session persistence, background orchestration
  • Copilot CLI: repository exploration, planning, coding, testing, review

That also changes how you should prompt. Do not say “fix this” and hope for the best. Send tasks in the structured, executable style GitHub recommends.

Example:

Read the files related to payment retry first. Do not edit yet.
Then propose a minimal plan, the files you expect to change,
the files you will not change, and the tests you plan to run.
Wait for approval before implementing.
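
If the same planning request is sent often, it can be templated so the wording stays consistent. This is a minimal sketch; planning_prompt is a hypothetical helper, not part of OpenClaw or Copilot CLI, and it only reproduces the example text above with the topic substituted.

```python
def planning_prompt(topic):
    """Build the read-first, plan-first request for a given topic."""
    topic = topic.strip()
    if not topic:
        raise ValueError("topic must not be empty")
    return "\n".join([
        f"Read the files related to {topic} first. Do not edit yet.",
        "Then propose a minimal plan, the files you expect to change,",
        "the files you will not change, and the tests you plan to run.",
        "Wait for approval before implementing.",
    ])
```

The value is in the fixed structure (scope, exclusions, tests, approval gate) rather than in the helper itself.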

Default to planning first

GitHub now treats /plan as a core Copilot CLI workflow and explicitly recommends an explore -> plan -> code -> commit path.

A practical default flow:

  1. spawn a Copilot CLI session
  2. first message: read code and explain the current state
  3. second message: produce a plan
  4. approve boundaries
  5. then implement
  6. end with tests, diff summary, and risks
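
One way to keep this order honest in your own tooling is to track the phase explicitly and refuse to skip steps. The sketch below is an assumption-laden illustration: Phase and PlanFirstSession are hypothetical names, and nothing here talks to OpenClaw or Copilot CLI. It only models the gate between planning and editing.

```python
from enum import Enum


class Phase(Enum):
    EXPLORE = 1    # read code, explain current state
    PLAN = 2       # produce a plan
    APPROVED = 3   # human has approved the boundaries
    IMPLEMENT = 4  # edits are allowed from here on
    VERIFY = 5     # tests, diff summary, risks


class PlanFirstSession:
    """Tracks which step of the plan-first flow a session has reached."""

    def __init__(self):
        self.phase = Phase.EXPLORE

    def advance(self, target):
        # Only the immediately following phase is a legal move.
        if target.value != self.phase.value + 1:
            raise RuntimeError(f"cannot go from {self.phase.name} to {target.name}")
        self.phase = target

    def may_edit_files(self):
        # Editing is off-limits until the plan has been approved and implementation begins.
        return self.phase.value >= Phase.IMPLEMENT.value
```

A wrapper that sends messages to the session could consult may_edit_files() before forwarding any instruction that asks for code changes.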

Rubber Duck belongs at review checkpoints

GitHub Docs describes Rubber Duck as a built-in critic agent for Copilot CLI. The main agent can hand its current plan, design, implementation, or tests to Rubber Duck for review; Rubber Duck looks for blind spots, design flaws, and substantive issues, then returns actionable feedback to the main agent.

GitHub’s April 6, 2026 blog post explains the design as a cross-family second-opinion mechanism: the critic model is deliberately different from the model driving the main session.

GitHub’s May 7, 2026 changelog then expanded the feature:

  • Claude sessions can use GPT-5.5 as the Rubber Duck reviewer
  • GPT sessions can also get a Claude reviewer
  • the feature is enabled through /experimental on

Rubber Duck is not another coding agent that edits files directly. It reviews proposed changes; the main session agent decides whether and how to act on the critique.

In OpenClaw + Copilot CLI workflows, the best times to use it are:

  1. right after planning
  2. after multi-file implementation work
  3. after tests are written
  4. before final review on high-risk changes

Long sessions are fine, but task threads should stay narrow

GitHub documents “infinite sessions” with automatic context compression, but the same docs also recommend keeping sessions focused and using /clear or /new for unrelated work.

Recommended discipline:

  • one repository, one feature, or one bug per ACP thread
  • do not mix CI fixes, docs work, and architecture refactors in one bound session
  • use --thread auto for clearly separate tasks
  • reuse a bound session only when the task truly continues the same line of work

Permission strategy: conservative by default

OpenClaw’s ACP setup docs are explicit:

  • permissionMode controls whether the harness can auto-run writes and shell commands
  • the default is approve-reads
  • the default nonInteractivePermissions value is fail

If you want --yolo-like behavior, do it inside a disposable environment, container, or dedicated dev box, not on your main machine.
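
To reason about what these defaults mean for a background session, it can help to model the decision. The sketch below is a simplified mental model, not OpenClaw's implementation: resolve_permission is a hypothetical function, the values approve-all, deny-all, and deny are assumptions about the other available settings, and only approve-reads and fail are taken from the defaults named above.

```python
def resolve_permission(action, permission_mode="approve-reads",
                       interactive=True, non_interactive="fail"):
    """Return 'allow', 'prompt', 'deny', or 'fail' for one harness request.

    action is 'read', 'write', or 'exec'.
    """
    if action not in ("read", "write", "exec"):
        raise ValueError(f"unknown action: {action}")
    if permission_mode == "approve-all":   # assumed value: everything auto-approved
        return "allow"
    if permission_mode == "deny-all":      # assumed value: everything refused
        return "deny"
    if permission_mode != "approve-reads":
        raise ValueError(f"unknown permission mode: {permission_mode}")
    if action == "read":
        return "allow"                     # reads pass without a prompt
    if interactive:
        return "prompt"                    # a human can approve the write or command
    return non_interactive                 # nobody to ask: 'fail' by default
```

Under this model, a headless session with the defaults stops at the first write or shell command, which is why non-interactive permissions need a deliberate decision before unattended work is delegated.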

Under cost and quota pressure, split the models by role

This is the pattern that holds up in production once cost and quota start to matter.

Copilot CLI should not be your default conversational model for everything:

  • it has usage and premium request constraints: GitHub Docs currently says each prompt to Copilot CLI consumes one premium request on the default model, and prompts to other models are multiplied by that model’s rate
  • its value is highest on coding execution
  • spending Copilot quota on routine conversation wastes the expensive layer

There is also a near-term billing change to account for: GitHub’s Copilot plan and usage docs state that Copilot is moving from request-based billing to usage-based billing on June 1, 2026. The recommendation should therefore not depend on a fixed allowance number. The durable rule is to reserve the expensive coding agent for high-value coding checkpoints.
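
Under the request-based model described above, the arithmetic is simple enough to sketch. The function name premium_requests and the multiplier of 3.0 used in the note below are hypothetical, chosen only for illustration; real multipliers come from GitHub's documentation, and the calculation no longer applies once usage-based billing takes over.

```python
def premium_requests(prompts, multiplier=1.0):
    """Premium requests consumed by `prompts` prompts at a given model multiplier.

    multiplier=1.0 represents the default model (one request per prompt).
    """
    if prompts < 0 or multiplier < 0:
        raise ValueError("prompts and multiplier must be non-negative")
    return prompts * multiplier
```

For example, 40 routine chat prompts plus 10 coding prompts sent to Copilot CLI on the default model come to premium_requests(50), or 50 requests. Routing the 40 chat prompts to another model leaves premium_requests(10), or 10 requests. With a hypothetical 3.0 multiplier, those 10 coding prompts would count as 30.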

OpenClaw’s multi-agent model maps naturally to a better design because each agent has its own workspace, auth profile, model registry, and session store.

The strongest practical pattern is:

  1. everyday chat agent: use a cost-efficient model such as MiniMax, Qwen, or DeepSeek, or let OpenRouter choose via openrouter/auto
  2. coding execution agent: invoke Copilot CLI only through ACP when actual repository work starts
  3. high-value checkpoints: let Copilot CLI handle planning, implementation, testing, and review, not all conversation

In short:

  • OpenClaw’s main chat model handles requirements, summaries, triage, and follow-up
  • Copilot CLI handles repository exploration, plans, code changes, tests, and review

Let Copilot CLI review its own work, then do human review

A stable workflow is:

  1. OpenClaw routes the task into a Copilot CLI session
  2. Copilot CLI implements the work
  3. you ask it to summarize changed files, surface risks, run tests, and review its own output
  4. then you do human review

--bind here vs --thread auto

Use --bind here when the task is short and you want tight back-and-forth in the current chat.

Prefer --thread auto when the task will run for a while, when you plan to come back later, or when you want to separate requirement discussion from execution.
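
The choice can be written down as a small rule. This is a sketch under stated assumptions: spawn_command and its three flags are hypothetical names, and the two returned strings are the commands shown earlier in this article, not an exhaustive list of spawn options.

```python
def spawn_command(long_running=False, resume_later=False, separate_discussion=False):
    """Pick the /acp spawn command for a task based on the guidance above."""
    if long_running or resume_later or separate_discussion:
        # Dedicated thread: keeps execution apart from the requirement discussion.
        return "/acp spawn copilot --mode persistent --thread auto"
    # Short task with tight back-and-forth in the current chat.
    return "/acp spawn copilot --bind here"
```

Any one of the three conditions is enough to prefer a dedicated thread; --bind here is the fallback only when none of them apply.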

A stable runbook

1. Prepare the host

copilot login
copilot

2. Check ACP health

/acp doctor

3. Create a session

Short task:

/acp spawn copilot --bind here

Longer task:

/acp spawn copilot --mode persistent --thread auto

4. First message: exploration and planning only

Read the relevant files and explain the current implementation first.
Do not change code yet. Then propose a minimal plan, expected scope,
and validation commands.

5. Second message: implementation

Implement the approved plan.
Afterward, run the relevant tests and summarize the diff, risks,
and follow-up suggestions.

6. Finish with explicit review

Before you close the loop, ask for changed files, unverified areas, and a self-review or /review.

Key takeaways

  1. Distinguish provider mode from ACP + Copilot CLI.
  2. Let OpenClaw orchestrate and let Copilot CLI execute.
  3. Default to explore, then plan, then implement.
  4. Use Rubber Duck at high-value review checkpoints.
  5. Keep ACP threads narrow even when sessions are long.
  6. Design non-interactive ACP permissions deliberately.
  7. Use cheaper models for routine chat and Copilot for coding.
  8. Reserve high autonomy for isolated environments.
  9. Require tests and review before human acceptance.

References