OpenClaw Security Report


The rapid adoption of OpenClaw, a popular open-source autonomous AI agent framework, reflects a broader shift toward AI-driven assistants. However, the widespread integration of this framework has historically introduced critical security risks that may lead to unauthorized actions, data exposure, and system compromise.

This report reviews the representative security issues that emerged during OpenClaw's development and rapid adoption, and distills actionable security insights for the AI agent industry. It aims to give developers building similar agent systems a security design reference, and to give end users clear risk awareness and mitigation guidance, through recommendations from both development and deployment perspectives.

We present a comprehensive security analysis of OpenClaw’s architecture and core components, encompassing ingress categories, internal modules, supply chain inputs, and external dependencies. By examining the detailed workflows, the assessment identifies inherent security weaknesses and attack surfaces. It evaluates the specific risks associated with each major component by analyzing representative vulnerabilities, common attack techniques, and underlying threat patterns.

This report is based on data and analysis available before March 18, 2026. Given the extremely rapid evolution of OpenClaw-style agent systems, their architectures, attack methods, and vulnerabilities are constantly shifting and have not yet reached a stable phase. Readers are advised to follow our subsequent analysis updates for the latest information.

Key Takeaways

OpenClaw's explosive growth from a side project to 300,000+ GitHub stars created massive security debt. Originally assuming a trusted local environment, its security model was rapidly outpaced by real-world deployment complexity, accumulating 280+ GitHub Security Advisories and 100+ CVEs between November 2025 and March 2026.
Historical analysis shows that the Gateway treated local network access as proof of identity, bypassing authentication checks that should have been required. Localhost origin, URL parameters, and OS app boundaries were each exploited to gain full orchestration authority - shell execution, filesystem access, browser automation, and multi-device control - making the blast radius effectively unbounded for most self-hosted deployments.
Identity binding across 20+ messaging platforms proved structurally fragile, historically producing more than 60 allowlist bypass issues. Mutable attributes used for authorization, privilege-level conflation across interaction modes, and absent webhook verification created recurring bypass paths that granted attackers access to the full execution pipeline.
Disclosed vulnerabilities repeatedly revealed divergence between policy validation and actual execution. Flag abbreviations bypassed exact-match deny lists, approved commands were not bound to file paths, and sandbox restrictions failed to propagate to child sessions or secondary endpoints - showing that enforcement must validate the final resolved form across all code paths.
Locally stored credentials, session histories, and agent memory were exposed through several disclosed vulnerabilities caused by inconsistent boundary checks. Path traversal and sandbox gaps appeared independently across multiple modules because each implemented its own validation logic rather than sharing a common boundary enforcement mechanism.
The extension ecosystem became a primary supply chain attack vector at scale. Hundreds of malicious skills were found on ClawHub, alongside fake installers and lookalike npm packages. Unlike conventional supply chain attacks, agent skills can influence behavior through natural language, making them resistant to traditional scanning.
Deployment misconfiguration posed risks equal to or greater than code-level bugs, with 135,000+ internet-exposed instances found across 82 countries. Disabled sandboxes, overly broad tool policies, and shared gateways across trust boundaries require no buggy code to exploit - a correctly functioning but carelessly deployed agent is indistinguishable from a compromised machine.
Prompt injection poses a persistent, long-term threat that is hard to fully resolve, with techniques spanning indirect injection, marker spoofing, state poisoning, and agent-to-agent exploitation. It cannot be addressed at the model level alone and requires layered system-level defenses, strict capability controls, and protection of persistent memory as an attack surface.
For developers, security in OpenClaw-style agent systems must be a first-class design concern from day one - not a retrofit after growth. This means establishing formal threat models before building, hardening the control plane as an admin API rather than a convenience layer, enforcing immutable privilege inheritance for all spawned subprocesses, applying layered prompt injection defenses, and ensuring sandbox enforcement covers every execution path - not just the primary one.
For deployers, managing an OpenClaw-style agent is closer to managing a privileged employee than installing a set-and-forget tool - it demands continuous oversight, periodic audits, and strict access governance. Operators should bind to loopback, run under dedicated non-root accounts in isolated environments, enforce authentication and allow lists on all channels, enable sandboxing with strict tool policies, regularly audit agent state and configurations, and treat third-party extensions with the same scrutiny as untrusted executable code. For non-technical users, the safer choice is to wait for more mature, hardened versions rather than granting an autonomous agent broad access to personal or enterprise accounts today.
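
The takeaway about policy validation diverging from execution can be illustrated with a minimal sketch. It assumes a tool that, like many GNU-style command-line programs, accepts unambiguous prefixes of long options, so `--exe` runs as `--exec`. The option table, deny list, and function names are hypothetical and are not taken from OpenClaw's code. The idea is to resolve each flag to the form the tool would actually execute, then apply the deny list to that resolved form and fail closed on anything unknown or ambiguous.

```python
from typing import List, Tuple

# Hypothetical option table and deny list for an illustrative tool.
KNOWN_OPTIONS: Tuple[str, ...] = ("--exec", "--output", "--verbose")
DENIED_OPTIONS = frozenset({"--exec"})


def resolve_option(arg: str, known: Tuple[str, ...] = KNOWN_OPTIONS) -> str:
    """Expand a possibly abbreviated long option to the one option it resolves to."""
    name = arg.split("=", 1)[0]  # drop an inline "=value" part
    if name in known:
        return name
    matches = [opt for opt in known if opt.startswith(name)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous option: {name}")
    return matches[0]


def is_allowed(argv: List[str]) -> bool:
    """Check the deny list against resolved options, failing closed on errors."""
    for arg in argv:
        if not arg.startswith("--"):
            continue  # positional arguments and short flags are out of scope here
        try:
            resolved = resolve_option(arg)
        except ValueError:
            return False  # anything we cannot resolve is rejected
        if resolved in DENIED_OPTIONS:
            return False
    return True
```

An exact-match check would let `--exe=/bin/sh` through because that string is not literally in the deny list; `is_allowed(["--exe=/bin/sh"])` rejects it because the resolved form is `--exec`. A real policy layer would also need to cover short flags, combined flags, and the tool's own parsing quirks.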

FAQs

What is OpenClaw and why is it raising security concerns?

OpenClaw is an open-source autonomous AI agent framework designed to act on behalf of users across local systems and external services. Its rapid adoption (growing to more than 300,000 GitHub stars) has outpaced its original security assumptions, which were built for trusted local environments. As deployments became more complex, significant risks emerged, including unauthorized actions, data exposure, and full system compromise.

What are the most critical vulnerabilities identified in the report?

Our analysis highlights several systemic issues: weak authentication in the Gateway control plane, fragile identity binding across messaging platforms, inconsistent enforcement between policy and execution, and insecure filesystem boundaries. In addition, the extension ecosystem introduces large-scale supply chain risks, and prompt injection remains a persistent, unresolved threat that can manipulate agent behavior across workflows.
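
As a rough illustration of the filesystem boundary issue, the sketch below shows a single shared helper that every module could call instead of writing its own path validation. The function name and the workspace-root concept are assumptions made for this example, not OpenClaw APIs. It resolves the requested path (including `..` segments and existing symlinks) and rejects anything that ends up outside the root.

```python
from pathlib import Path


def resolve_inside(root: Path, user_path: str) -> Path:
    """Return the resolved path if it stays under root, else raise PermissionError."""
    root_resolved = root.resolve()
    # Joining with an absolute user_path discards root, so the check below
    # also catches inputs such as "/etc/passwd".
    candidate = (root_resolved / user_path).resolve()
    if not candidate.is_relative_to(root_resolved):  # Python 3.9+
        raise PermissionError(f"path escapes workspace: {user_path}")
    return candidate
```

Centralizing the check means a fix lands in one place rather than in each module. This sketch does not address races between the check and the later file access, which a production implementation would need to consider.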

How do deployment practices impact OpenClaw’s security risk?

Deployment misconfigurations are a major risk driver, often equal to or greater than code-level vulnerabilities. Over 135,000 internet-exposed instances were identified, many with disabled safeguards, overly broad permissions, or shared access across trust boundaries. Even without exploitable bugs, poorly configured agents can behave like fully compromised systems.
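
A deployment review along these lines could be partly automated. The following is only a sketch: the configuration keys (`bind_host`, `sandbox_enabled`, `run_as_user`, `auth_token`) are hypothetical placeholders and do not correspond to real OpenClaw settings. Missing keys are treated as the unsafe value so that an incomplete configuration is flagged rather than silently passed.

```python
import ipaddress
from typing import Dict, List


def audit_deployment(config: Dict[str, object]) -> List[str]:
    """Return a list of findings for a hypothetical agent deployment config."""
    findings: List[str] = []

    host = str(config.get("bind_host", "0.0.0.0"))
    try:
        loopback = ipaddress.ip_address(host).is_loopback
    except ValueError:
        loopback = host == "localhost"  # not an IP literal
    if not loopback:
        findings.append(f"gateway binds to non-loopback address {host}")

    if not config.get("sandbox_enabled", False):
        findings.append("sandbox is disabled")

    if config.get("run_as_user", "root") == "root":
        findings.append("agent runs as root")

    if not config.get("auth_token"):
        findings.append("no authentication token configured")

    return findings
```

An empty result does not mean a deployment is safe; it only means none of these specific misconfigurations were detected.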

Why are third-party extensions and “skills” a major attack vector?

OpenClaw’s extension ecosystem allows agents to expand functionality, but it also introduces supply chain vulnerabilities. Malicious skills, fake installers, and lookalike packages can infiltrate systems at scale. Unlike traditional software, these extensions can influence agent behavior through natural language inputs, making them harder to detect with conventional security tools.

What steps should developers and operators take to mitigate these risks?

Developers should treat security as a foundational design requirement by implementing formal threat models, hardening control planes, enforcing strict privilege boundaries, and applying layered defenses against prompt injection. Operators should manage agents with the same rigor as privileged users, restricting access, enabling sandboxing, auditing configurations regularly, and scrutinizing all third-party integrations. For non-technical users, delaying adoption until the ecosystem matures is the safer approach.
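
One way to read "hardening control planes" is that a request's network origin should never stand in for identity. The sketch below assumes a simple shared-token scheme, which is a choice made for illustration and not a description of OpenClaw's Gateway. The function name and parameters are hypothetical. The remote address is accepted but deliberately ignored, so a loopback caller gets no special treatment.

```python
import hmac
from typing import Optional


def is_authorized(remote_addr: str, presented_token: Optional[str], expected_token: str) -> bool:
    """Authorize a control-plane request by token only.

    remote_addr is intentionally unused: coming from 127.0.0.1 is not
    treated as proof of identity.
    """
    if not presented_token or not expected_token:
        return False  # fail closed when either side has no token
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())
```

With this shape, `is_authorized("127.0.0.1", None, "s3cret")` is rejected just like an unauthenticated remote request would be.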

Read the full report here.