Deep Dive · Security
Is That Skill Safe?
We install AI agent skills the way we used to npm install a random package — except now they carry instructions your agent will actually follow. A skill isn't a library. It's a set of directives you're handing to something that acts on your behalf.
The numbers nobody's looking at
Most teams building skills don't have an SDLC for them yet. Skills get built, dropped in a repo, and installed on implicit trust — no review, no agreed structure, no security evals. The discipline we spent years building for application code isn't being applied to the artifacts our agents execute.
How a skill scanner actually works
A scanner ingests the skill (repo, zip, directory, or a single SKILL.md), then runs two passes. Stage 1 is fast, deterministic pattern-matching. Stage 2 is an optional LLM that reasons about intent to cut false positives. Then it scores the risk and makes an install / don't-install call.
Pattern detection
- Prompt injection
- Data exfiltration
- Privilege escalation
- Excessive agency
- Tool misuse
- Trigger abuse
- Memory poisoning
Code & behavior
- AST inspection
- exec() / eval() detection
- subprocess use
- Dynamic imports
- Dangerous exec chains
Supply chain
- Dependency inspection
- OSV vuln lookup
- Typosquatting
- Unpinned packages
- License risk
- Intent vs. actual behavior
- Hidden instructions & triggers
- Tool & memory poisoning
- Excessive permissions
- Agentic risk & impact
- Prunes false positives
● Install with caution
● Safe to install
The reference engine here is NVIDIA SkillSpector (open-source, Apache-2.0). Static analysis won't catch runtime behavior — but it gives teams a starting pattern, which is more than most have today.
But a scanner can cry wolf — here's a real one
We ran SkillSpector (static-only) against clipify, a perfectly benign video-clipping skill. The raw engine flagged it CRITICAL, 100/100, DO NOT INSTALL. Six of its seven findings were false positives. This is exactly why a raw scanner score is a starting point, not a verdict.
• 4 HIGH = regex over the raw bytes of a PNG (binary scanned as text)
• 1 = boilerplate language in the LICENSE file
Scan a skill yourself — right here
Paste a SKILL.md (or any skill prompt) below. This runs a handful of the same pattern categories a real scanner uses — entirely in your browser, nothing is uploaded. It's a teaching toy, not a production gate, but you'll feel the problem in about three seconds.
Runs 100% client-side — your text never leaves the page. Detects ~8 illustrative patterns; a real engine like SkillSpector runs 64 across 16 categories plus an LLM pass. Don't make trust decisions on this widget.
This is a software-development discipline, not a checkbox
A scanner is the easy 20%. The hard 80% is the program around it: a policy that fits your risk tolerance, a CI gate that blocks the right things, custom rules for your secrets and endpoints, and the review habit that makes it stick. That's what we build with teams.
Assess your agent-stack readiness →