prompt-injection - TrustStrike Labs

vulnerability

2026-04-21

Google patches Antigravity IDE prompt injection RCE - and Claude GitHub Actions can be tricked by spoofed Git metadata

Two related stories show AI-powered developer tools becoming a fresh attack surface. First, Pillar Security disclosed a now-patched vulnerability in Google's agentic IDE Antigravity that allowed prompt injection to escape the Strict Mode sandbox and achieve arbitrary code execution. The flaw combined Antigravity's file-creation capability with insufficient input sanitization in its find_by_name tool: injecting the -X (exec-batch) flag via the Pattern parameter forced the underlying fd utility to execute arbitrary binaries against workspace files. An attacker could stage a malicious script then trigger it through a seemingly legitimate search - no user interaction needed once the prompt injection lands. The attack can be delivered via indirect prompt injection: a user pulls a harmless-looking file from an untrusted source containing hidden comments that instruct the AI agent to stage and trigger the exploit. Google patched on February 28. Second, Manifold Security researchers showed a Claude-powered GitHub Actions workflow (claude-code-action) can be tricked into approving and merging malicious pull requests by setting Git's user.name and user.email to match a trusted developer (in the demo: Andrej Karpathy). On first submission Claude flagged for manual review. On resubmission, Claude approved it - the AI overrode its own earlier judgment on retry. The common thread: AI agents cannot treat attacker-controllable metadata as a trust signal, and non-determinism across retries means you cannot build a security control on an AI that changes its mind.

google-antigravity claude-code-action prompt-injection ai-supply-chain ide-security indirect-injection

Check: If your team uses AI coding agents (Antigravity, Cursor with autonomous modes, Claude Code, claude-code-action, or similar), audit what those agents can do without human approval - and tighten the boundaries.
Affected: Development teams using Google Antigravity before February 28 patch. Repositories using Claude's claude-code-action or similar AI code review automation, especially if author-identity metadata influences review decisions. Any AI-agent workflow that auto-approves or auto-merges based on perceived author trust. Codebases that pull external content into AI agent context (READMEs, docs, dependencies) without treating it as untrusted input.
Fix: For Antigravity, confirm you're on the patched February 28+ build. For claude-code-action and similar workflows, configure them to never auto-merge based on author identity signals - require human review for every merge to protected branches regardless of PR author. Treat Git author metadata as user-controllable and untrusted in any AI agent prompt context. For AI agents that might retry or re-evaluate the same decision, pin the first response rather than accepting an optimistic retry (don't let an agent 'change its mind' in favor of the attacker). Review every input channel your AI agents consume - PR descriptions, commit messages, external dependencies, documentation - and assume each can contain hidden instructions.