OpenClaw Best Practices

Why This Article Was Rewritten

OpenClaw is impressive, but the hard part of running it as a stable, always-on, remotely reachable agent is rarely “how many channels does it support?” The real challenges are operational:

  1. Are you using the official install and upgrade path?
  2. Who can actually reach your Gateway?
  3. Are your skills, plugins, browser, and nodes expanding the blast radius?
  4. Can you validate and roll back quickly after changes?

This article does two things:

  1. It distills a deployment and operations baseline from the official documentation.
  2. It adds sanitized lessons from real deployments, focusing on reusable failure modes and fixes.

The Short Version

If you only remember one page, make it this one.

  1. In production, prefer the official install flow plus openclaw onboard --install-daemon. Do not use ad-hoc source builds as your everyday upgrade path.
  2. One Gateway should map to one trust boundary. Personal assistants, team assistants, and public-facing bots should not share one high-privilege runtime.
  3. Keep gateway.bind: "loopback" by default. For remote access, prefer SSH tunnels or Tailscale Serve instead of directly listening on a public or LAN interface.
  4. Use dmPolicy: "pairing" for DMs, requireMention: true for groups, and session.dmScope: "per-channel-peer" when multiple people can message the bot.
  5. Treat skills and plugins as code execution surfaces. Use allowlists where possible, and pin versions when possible.
  6. Prefer openclaw update for upgrades, and always follow with openclaw doctor, openclaw gateway restart, and openclaw health.
  7. Health checks must go beyond “the process is alive.” Validate channel ingress, end-to-end messaging, log redaction, and session/state file permissions.

The Official Deployment Baseline

1. Installation and onboarding: start with the official path

The official docs are clear about the recommended bring-up flow:

curl -fsSL https://openclaw.ai/install.sh | bash
openclaw onboard --install-daemon
openclaw gateway status
openclaw dashboard

Three details matter here:

  1. Node 24 is the preferred runtime. Node 22.16+ is supported, but 24 remains the documented recommendation.
  2. openclaw onboard --install-daemon is the preferred setup path because it wires together models, gateway config, and common auth flows.
  3. A source checkout is much better suited for development and debugging than for routine production upgrades.

If your goal is a stable OpenClaw deployment rather than upstream development, avoid turning “a source build that happens to work” into your long-term operational baseline.

2. Define the trust boundary before you scale to multiple agents

OpenClaw’s official security model is not “hostile multi-tenant SaaS.” It is a personal-assistant model with a single trusted operator boundary.

That means:

  1. One Gateway is best used for one trusted operator boundary.
  2. If you have a personal assistant, a team assistant, and a public-facing assistant, split them into separate Gateways, or at minimum separate OS users and hosts.
  3. sessionKey, session labels, and channel bindings are routing mechanisms, not authorization boundaries.

This is where many deployments go wrong early. Per-user sessions help with context isolation, but they do not magically isolate high-impact tools, credentials, browser state, or host access.

3. Network exposure: default to loopback, then tunnel

The safest default in the docs is simple: keep the Gateway on loopback and expose it to yourself through a tunnel or tailnet.

A reasonable production starting point looks like this:

{
  "gateway": {
    "mode": "local",
    "bind": "loopback",
    "port": 18789,
    "auth": { "mode": "token", "token": "${OPENCLAW_GATEWAY_TOKEN}" }
  },
  "session": {
    "dmScope": "per-channel-peer"
  },
  "channels": {
    "telegram": {
      "dmPolicy": "pairing",
      "groups": {
        "*": { "requireMention": true }
      }
    }
  }
}

The important parts:

  1. gateway.bind: "loopback" is the default hardened posture.
  2. Explicitly set gateway.auth.mode: "token" or "password". Do not assume localhost is enough.
  3. If you use a non-loopback bind, you also need real auth and real firewalling.
  4. If you put the Gateway behind a reverse proxy, configure gateway.trustedProxies correctly and make sure the proxy overwrites forwarding headers instead of appending untrusted ones.
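To make point 4 concrete, here is a hedged sketch of a same-host reverse-proxy setup: the Gateway stays on loopback, and the proxy address is listed so forwarded headers are honored. The gateway.trustedProxies key comes from the docs, but the exact value shape here is an assumption; verify it against the Configuration schema.

```json
{
  "gateway": {
    "bind": "loopback",
    "auth": { "mode": "password" },
    "trustedProxies": ["127.0.0.1"]
  }
}
```

With this shape, the proxy terminates TLS publicly and forwards to loopback, so the Gateway itself never listens on a public interface.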

For remote use from another machine, the simplest SSH tunnel remains:

ssh -N -L 18789:127.0.0.1:18789 user@host

Then keep your local CLI or UI pointed at ws://127.0.0.1:18789.

One subtle but important note from the docs: gateway.remote.token and gateway.remote.password are client-side credential sources for connecting to a remote Gateway. They do not configure server-side auth by themselves.

4. DMs, groups, and context isolation: start conservative

If your bot is not exclusively for yourself, resist the urge to start with everything open.

The official guidance is conservative for good reason:

  1. Use dmPolicy: "pairing" by default.
  2. Use requireMention: true by default in groups.
  3. If more than one person can DM the bot, use session.dmScope: "per-channel-peer".
  4. For stricter setups, add allowlists instead of letting trust depend on habit.

This is not only about security. It also improves debuggability. Many “the bot stopped replying” incidents turn out to be pairing, allowlist, or mention-gate behavior doing exactly what it was configured to do.
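As a sketch, a stricter per-channel policy might look like the fragment below. The dmPolicy, requireMention, and groups keys appear earlier in this article; the allowFrom key name and value shape are pure assumptions for illustration, so check the Configuration docs before relying on them.

```json
{
  "channels": {
    "telegram": {
      "dmPolicy": "pairing",
      "allowFrom": ["alice_handle", "bob_handle"],
      "groups": {
        "*": { "requireMention": true }
      }
    }
  }
}
```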

5. Skills, plugins, browser control, and nodes are all high-impact surfaces

OpenClaw is powerful, and power expands risk.

Several official security conclusions are worth treating as default assumptions:

  1. Skills and plugins should both be treated as code execution surfaces.
  2. Browser control is effectively operator access to whatever that browser profile can reach.
  3. A paired node with system.run is remote execution.
  4. Prompt guardrails are soft. The hard boundary comes from allowlists, tool policy, sandboxing, exec approvals, and host isolation.

In multi-agent deployments, it is worth explicitly assigning different access levels by agent:

  1. Personal high-privilege agent: private only, sandbox optional if you truly own the full trust boundary.
  2. Team or family agent: sandboxed, read-only workspace, reduced tools.
  3. Public-facing agent: no filesystem, no shell, no browser, no cron, no gateway tool.
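One way to record those tiers is directly in config. The fragment below is purely illustrative: apart from the top-level agents key (which appears elsewhere in this article as agents.defaults.model.primary), every key name here is a hypothetical placeholder for whatever your version's schema actually exposes.

```json
{
  "agents": {
    "public-bot": {
      "sandbox": { "enabled": true },
      "tools": {
        "deny": ["filesystem", "shell", "browser", "cron", "gateway"]
      }
    }
  }
}
```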

6. Config management: strict validation is a feature

OpenClaw config is schema-validated strictly. Unknown keys, wrong types, and invalid values can prevent the Gateway from starting.

Operationally, this is a good thing. It prevents “config was saved but only half of it actually worked.”

Recommended habits:

  1. When adding new config, check the official Configuration docs and schema.
  2. Run openclaw doctor after config changes.
  3. Know which changes hot-apply and which require a Gateway restart. The default reload mode is currently hybrid.
  4. Avoid spreading long-lived configuration across ad-hoc shell flags.
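Strict schema validation only surfaces when the Gateway or doctor actually parses the file, so a cheap syntax pre-check shortens the feedback loop. A minimal sketch, assuming the default config path and a local python3:

```shell
# Fail fast on JSON syntax errors before handing the file to openclaw doctor.
# Pass an explicit path to override the assumed default location.
json_ok() {
  python3 -m json.tool "${1:-$HOME/.openclaw/openclaw.json}" >/dev/null 2>&1
}
```

Usage, roughly: json_ok && openclaw doctor. Schema errors still belong to doctor; this only catches outright broken JSON before you touch the running Gateway.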

Sanitized Lessons from Real Deployments

The following issues all came from real deployments, but only the reusable parts are preserved here.

1. Do not confuse the published bundle with your source tree

This was one of the most expensive mistakes operationally.

The pattern usually looks like this:

  1. A published release has a runtime bug.
  2. You inspect the source tree and the fix looks obvious.
  3. You push a local source build or tarball into production, and now you have new bundle-layout drift, missing runtime dependencies, or behavior that no longer matches the published release.

The root cause is simple: production runs the published artifact, not your working source tree. Bundle names, packaging boundaries, embedded dependencies, and plugin layout can differ.

The safer pattern is:

  1. If the issue is local to a published bundle, patch the working published artifact minimally, reversibly, and with a clear record.
  2. Move back to the official upgrade path once an upstream release contains the fix.
  3. Only adopt a source-checkout workflow if you intentionally choose to maintain that machine as a self-managed build channel.

In short: when fixing production, target what production is actually running, not the most convenient source directory on your laptop.

2. A slash command can register successfully and still fail at runtime

Another recurring failure mode: a skill appears in the slash-command surface, but invoking the command behaves as if the skill were missing.

The most common cause is a mismatch between command visibility and model visibility:

  1. user-invocable: true exposes the skill as a slash command.
  2. disable-model-invocation: true removes it from the model prompt.

If the command still depends on the model reading the SKILL.md instructions, you get a command that is visible but unreliable.

The more stable approaches are:

  1. If the command is really a direct tool invocation, use command-dispatch: tool.
  2. If it still needs the model to read the skill instructions, do not hide it from the model prompt.
  3. Add end-to-end smoke tests for critical slash commands instead of only verifying that they appear in the command list.
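For the direct-dispatch case, the relevant SKILL.md frontmatter (using the field names described above) might look like the sketch below; the skill name is a hypothetical example.

```yaml
---
name: deploy-status        # hypothetical skill name
user-invocable: true       # expose as a slash command
command-dispatch: tool     # dispatch straight to the tool; no model read required
---
```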

3. A workspace skill being discovered does not mean the model received the full instructions

Some versions and custom runtimes exhibit a second-order version of the same problem: the skill is visible in the registry, but the runtime only gives the model a weak hint such as “use this skill” instead of reliably injecting the actual SKILL.md body.

At that point, execution quality depends too heavily on model improvisation.

The practical lessons:

  1. Do not rely too much on “the model will probably read it later” for critical skills.
  2. If a skill must be deterministic, either dispatch directly to a tool or ensure the runtime really injects the skill content into the prompt.
  3. After changing a skill, prefer a new session or at least a reset instead of assuming the active session always picked up the new snapshot.

The official docs mention session snapshots and skill watchers. Hot reload exists, but for critical paths, “open a new session and verify once” is still the safer habit.

4. Containerized deployments drift quietly

A container starting successfully is only step one. The harder problem is configuration and runtime drift.

Typical issues I have seen:

  1. Host-fallback scenarios and sandbox scenarios get mixed into one compose file until the file becomes impossible to reason about.
  2. Plugin paths point at source directories instead of compiled output directories, creating duplicate dependency or runtime conflicts.
  3. Bootstrap scripts do not pin the default model explicitly, so a stale .env or .env.example ends up winning.

The more stable pattern is:

  1. Keep the base docker-compose.yml as generic and sandbox-compatible as possible.
  2. Put host-specific differences into separate override files.
  3. In containers, point bundled plugins at compiled output, not source trees.
  4. Explicitly write critical defaults such as agents.defaults.model.primary instead of assuming template env files are always current.
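A hedged sketch of that override layering. The service name, paths, and volume mapping below are placeholders, not OpenClaw's actual compose layout:

```yaml
# docker-compose.host.yml — host-specific override layered on the generic base
services:
  openclaw:
    volumes:
      # Point the bundled plugin at compiled output, not the source tree
      - ./plugins/my-plugin/dist:/opt/openclaw/plugins/my-plugin:ro
```

Launching with docker compose -f docker-compose.yml -f docker-compose.host.yml up -d then merges both files, keeping the base file sandbox-compatible and generic.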

5. Skill dependencies must be verified on both the host and the sandbox

The official Skills documentation contains an important nuance: requires.bins is evaluated on the host at load time, but if the agent actually runs inside a sandbox, the same binary must also exist inside the container.

That means:

  1. “The host has the command” does not mean “the sandboxed skill will work.”
  2. Missing container binaries often show up as skills that are visible but fail at execution time.
  3. The stable fix is to install required binaries through setupCommand or a custom sandbox image.

If a skill depends on both external binaries and network access, you should also validate sandbox network policy, filesystem writeability, and runtime user permissions.
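A small probe sketch for that host-versus-sandbox gap. The have_bin helper is generic shell; the docker exec line assumes a container named openclaw-sandbox, which is a placeholder for however your sandbox is actually named.

```shell
# Does this binary exist in the current environment?
have_bin() { command -v "$1" >/dev/null 2>&1; }

# Probe each required binary on the host AND inside the (assumed) sandbox
# container, since requires.bins only checks the host at load time.
check_host_and_sandbox() {
  for bin in "$@"; do
    have_bin "$bin" && echo "host ok: $bin" || echo "host MISSING: $bin"
    if docker exec openclaw-sandbox sh -c "command -v $bin" >/dev/null 2>&1; then
      echo "sandbox ok: $bin"
    else
      echo "sandbox MISSING: $bin"
    fi
  done
}
```

Running check_host_and_sandbox ffmpeg jq, for example, makes the "visible on host, missing in container" failure mode obvious before a skill hits it at execution time.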

6. Health checks cannot stop at “the Gateway process is alive”

This is another common misdiagnosis.

Seeing a green health endpoint, a healthy container, or a running daemon only proves one thing: the Gateway process exists.

For OpenClaw, a more trustworthy health model has four layers:

  1. Process layer: openclaw gateway status.
  2. Config layer: openclaw doctor.
  3. Channel layer: openclaw channels status --probe or the equivalent probe path.
  4. Business layer: send a real test message through a real channel and confirm routing plus reply path end to end.

If you only do the first two, you will miss the failures that matter most in practice: expired tokens, broken channel sessions, allowlist mistakes, or mention-gate mistakes.
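The four layers lend themselves to a small wrapper that stops at the first failing layer. The wrapper itself is generic shell; the openclaw subcommands in the usage note are the ones named above.

```shell
# Run one health layer; print its status and fail fast on error.
run_layer() {
  desc=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "ok: $desc"
  else
    echo "FAIL: $desc"
    return 1
  fi
}
```

Usage, roughly: run_layer "process" openclaw gateway status && run_layer "config" openclaw doctor && run_layer "channels" openclaw channels status --probe, with the business layer (a real round-trip message) still done by hand or by your own channel-specific script.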

7. Logs, sessions, and state directories should all be treated as sensitive

The official docs are very explicit here: ~/.openclaw/ contains much more than “just config.” It may include auth profiles, pairing allowlists, session transcripts, channel credentials, secrets payloads, and sandbox workspaces.

At minimum:

  1. Keep ~/.openclaw at 700.
  2. Keep ~/.openclaw/openclaw.json at 600.
  3. Keep logging.redactSensitive: "tools" enabled.
  4. When sharing diagnostics, prefer sanitized outputs such as openclaw status --all instead of raw logs.
  5. On a shared machine, give OpenClaw its own OS user whenever possible.
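A minimal permission probe for the first two items, assuming GNU or BSD stat is available:

```shell
# Print a file's octal permission mode; try GNU stat, then BSD stat.
mode_of() {
  stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1" 2>/dev/null
}

# Compare the actual mode against the recommended one (700 / 600).
check_mode() {
  actual=$(mode_of "$1")
  if [ "$actual" = "$2" ]; then
    echo "ok: $1 is $2"
  else
    echo "FAIL: $1 is ${actual:-unreadable}, want $2"
    return 1
  fi
}
```

Usage, roughly: check_mode "$HOME/.openclaw" 700 && check_mode "$HOME/.openclaw/openclaw.json" 600, which fits naturally into the post-update checklist later in this article.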

Long-Term Skills and Plugin Maintenance

This is the layer most likely to devolve into chaos over time.

Be explicit about skill locations and precedence

The current official skill precedence is:

  1. <workspace>/skills
  2. <workspace>/.agents/skills
  3. ~/.agents/skills
  4. ~/.openclaw/skills
  5. bundled skills
  6. skills.load.extraDirs

My recommendation:

  1. Put project-specific skills in <workspace>/skills.
  2. Put machine-wide shared skills in ~/.agents/skills or ~/.openclaw/skills, but avoid duplicating the same skill name across both.
  3. Maintain clear ownership for critical skills so “same name, different copy” does not become an investigation nightmare.
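The precedence list behaves like a first-match-wins lookup, which can be sketched as:

```shell
# Resolve a skill name against directories passed highest-precedence first;
# the first directory containing the name wins, mirroring the list above.
resolve_skill() {
  name=$1; shift
  for dir in "$@"; do
    if [ -e "$dir/$name" ]; then
      echo "$dir/$name"
      return 0
    fi
  done
  return 1
}
```

Called as resolve_skill my-skill "$PWD/skills" "$PWD/.agents/skills" "$HOME/.agents/skills" "$HOME/.openclaw/skills", it also makes the "same name, different copy" shadowing problem easy to demonstrate: a workspace copy silently wins over a machine-wide one.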

Use agent allowlists to control visibility

A skill being discoverable is different from a skill being available to a given agent.

For public-facing or lower-trust agents, use explicit skill allowlists instead of letting them inherit everything by default.

Plugins deserve more caution than skills

Plugins have a larger blast radius than ordinary skills because they run in-process with the Gateway.

At minimum:

  1. Install only from trusted sources.
  2. Prefer exact version pinning over drifting ranges.
  3. Restart the Gateway after plugin installs or upgrades.
  4. Run openclaw security audit periodically to review plugin scanning and allowlist posture.

Upgrades, Rollbacks, and Change Management

The currently recommended upgrade path is:

openclaw update

It attempts to detect your install type, update accordingly, and run post-update checks. A more explicit human-driven workflow can be:

openclaw update --dry-run
openclaw update
openclaw doctor
openclaw gateway restart
openclaw health

Useful details:

  1. --dry-run is worth using when you want to inspect the change before applying it.
  2. --channel beta and --tag beta are not the same. The former is a channel preference, the latter is closer to forcing a specific dist-tag.
  3. Auto-update is off by default. Whether to enable it in production depends on how tightly you control maintenance windows.

Plan rollback before you need it

If you installed through npm, the official rollback path is straightforward:

npm i -g openclaw@<version>
openclaw doctor
openclaw gateway restart

Pinning to a specific source commit is only meaningful for source-checkout deployments. In practice that means:

  1. npm or install-script deployments are a better fit for stable production plus version rollback.
  2. Source-checkout deployments are a better fit for self-managed development channels.
  3. Avoid switching back and forth between those two models on the same production host.

An Operations Checklist You Can Actually Use

Here is the checklist I find most practical.

Before launch

  1. Confirm the install path is officially supported, not an improvised source build.
  2. Confirm gateway.bind is still loopback, or that token/password auth, firewalling, and trusted proxy settings are correct.
  3. Confirm DMs are not wide open and group mention gating is enabled.
  4. Confirm skill and plugin sources are known and critical skills have smoke tests.
  5. Confirm browser automation uses a dedicated profile, not your personal daily browser.
  6. Run openclaw security audit.

After every update

  1. Run openclaw doctor.
  2. Run openclaw gateway restart.
  3. Run openclaw health.
  4. Run a channel probe.
  5. Send a real message through a real chat surface and verify the full round trip.

When something breaks

  1. Decide whether this is a process problem, config problem, channel problem, or application-logic problem.
  2. Do not blindly overwrite the live runtime. First diagnose the actual artifact that production is running.
  3. Use sanitized outputs when sharing diagnostics. Do not paste raw transcripts or credential paths by default.
  4. If exposure is suspected, return to loopback, tighten DM/group policy, and only then continue the investigation.

Further Reading

If you plan to run OpenClaw seriously over time, these official references are worth bookmarking:

  1. Getting Started
  2. Configuration
  3. Remote Access
  4. Security
  5. Skills
  6. Updating
  7. OpenClaw GitHub README

Closing Thoughts

The real value of OpenClaw is not just that it connects an LLM to Telegram or WhatsApp. It is that it brings models, tools, channels, state, and automation into a runtime you control.

That is also why it must be operated like a long-lived system instead of a toy chat integration:

  1. Hold the official install and update path steady.
  2. Hold the hard boundaries steady: loopback, pairing, mention gating, tool policy.
  3. Then expand into higher-capability surfaces such as skills, plugins, browser control, and nodes.

If you get those three layers right, OpenClaw starts to feel like a dependable self-hosted agent platform rather than an impressive but fragile experiment.