Deep Dive · Security

Is That Skill Safe?

We install AI agent skills the way we used to npm install a random package — except now they carry instructions your agent will actually follow. A skill isn't a library. It's a set of directives you're handing to something that acts on your behalf.

The numbers nobody's looking at

Most teams building skills don't have an SDLC for them yet. Skills get built, dropped in a repo, and installed on implicit trust — no review, no agreed structure, no security evals. The discipline we spent years building for application code isn't being applied to the artifacts our agents execute.

42,000+ skills analyzed across popular marketplaces (NVIDIA, 2026)

26% contained at least one vulnerability (NVIDIA, 2026)

5% showed likely malicious intent (NVIDIA, 2026)

2× more vulnerable when the skill ships executable scripts (NVIDIA, 2026)

If you're using skills today — and you should be — someone on your team owns that risk whether they know it or not.

How a skill scanner actually works

A scanner ingests the skill (repo, zip, directory, or a single SKILL.md), then runs two analysis passes. Stage 1 is fast, deterministic pattern-matching. Stage 2 is an optional LLM pass that reasons about intent to cut false positives. The scanner then scores the risk from 0 to 100 and recommends one of three outcomes: safe to install, install with caution, or install not recommended.

1. Input

Skills come from anywhere.

🐙Git repo 🌐URL 🗜️ZIP archive 📁Local directory 📄SKILL.md file

↓

2. Ingestion & Normalization

Understand and structure the skill.

Parse structure: read the skill layout

Extract: scripts, metadata, prompts

Discover deps: map dependencies

Collect assets: resources & files
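As a rough illustration of the ingestion step, the sketch below walks a local skill directory and sorts files into prompts, scripts, dependency manifests, and assets. The `SkillInventory` type, the suffix lists, and the NUL-byte binary heuristic are all assumptions made for this example, not SkillSpector's actual data model. Separating binary assets at this stage matters later, because it lets the text-based rules skip them.

```python
from dataclasses import dataclass, field
from pathlib import Path

# Assumed file classifications for this sketch only.
SCRIPT_SUFFIXES = {".py", ".sh", ".js", ".ts"}
DEP_FILES = {"requirements.txt", "package.json", "pyproject.toml"}


@dataclass
class SkillInventory:
    prompts: list[str] = field(default_factory=list)
    scripts: list[str] = field(default_factory=list)
    dependency_files: list[str] = field(default_factory=list)
    binary_assets: list[str] = field(default_factory=list)
    other_text: list[str] = field(default_factory=list)


def looks_binary(head: bytes) -> bool:
    """Heuristic: a NUL byte near the start suggests binary content."""
    return b"\x00" in head


def inventory_skill(root: Path) -> SkillInventory:
    """Classify every file under root by its role in the skill."""
    inv = SkillInventory()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        with path.open("rb") as fh:
            head = fh.read(1024)
        if path.name in DEP_FILES:
            inv.dependency_files.append(rel)
        elif path.suffix in SCRIPT_SUFFIXES:
            inv.scripts.append(rel)
        elif looks_binary(head):
            inv.binary_assets.append(rel)
        elif path.suffix == ".md":
            inv.prompts.append(rel)
        else:
            inv.other_text.append(rel)
    return inv
```

Calling `inventory_skill(Path("path/to/skill"))` returns the grouped relative paths, which later stages can use to decide which rules apply to which files.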

↓

3. Analysis

Deep multi-layer analysis of code, deps & behavior.

Pattern detection

Prompt injection
Data exfiltration
Privilege escalation
Excessive agency
Tool misuse
Trigger abuse
Memory poisoning
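To make the deterministic pattern pass concrete, here is a minimal sketch of regex rules applied line by line to a skill's text. The three rules and their severities are hypothetical and written for this example. They are not SkillSpector's rule set, and a real engine would carry far more rules with better context handling.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    category: str
    severity: str
    line: int
    excerpt: str


# Hypothetical illustrative rules: (category, severity, pattern).
RULES = [
    ("Prompt Injection", "HIGH",
     re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I)),
    ("Data Exfiltration", "HIGH",
     re.compile(r"\b(send|upload|post)\b.*(api[_ ]?key|token|credentials|\.env)", re.I)),
    ("Excessive Agency", "MED",
     re.compile(r"without (asking|confirmation|telling the user)", re.I)),
]


def scan_text(text: str) -> list[Finding]:
    """Return one finding per (rule, line) match, in document order."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, severity, pattern in RULES:
            if pattern.search(line):
                findings.append(Finding(category, severity, lineno, line.strip()))
    return findings
```

Rules like these are fast and repeatable, but they match on wording alone, which is why such a pass tends to produce false positives on boilerplate and non-text files.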

Code & behavior

AST inspection
exec() / eval() detection
subprocess use
Dynamic imports
Dangerous exec chains
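The code checks listed above can be approximated with Python's standard `ast` module. The sketch below flags direct calls to `eval`, `exec`, and `__import__`, plus any call made on the `subprocess` or `importlib` modules. The watch lists are assumptions for illustration, and this simple version would miss aliased imports or calls such as `os.system`.

```python
import ast

# Assumed watch lists for this sketch only.
DANGEROUS_BUILTINS = {"eval", "exec", "__import__"}
DANGEROUS_MODULES = {"subprocess", "importlib"}


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, call name) pairs for risky calls in the source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in DANGEROUS_BUILTINS:
            hits.append((node.lineno, func.id))
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in DANGEROUS_MODULES
        ):
            hits.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(hits)
```

Because this works on the parsed syntax tree rather than raw text, a mention of `eval(` inside a comment or string is not reported.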

Supply chain

Dependency inspection
OSV vuln lookup
Typosquatting
Unpinned packages
License risk
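One of the supply-chain checks, unpinned packages, is simple enough to sketch directly. The function below treats any `requirements.txt` entry without an exact `==` version as unpinned. It is a simplification for illustration: it skips option lines such as `-r other.txt` and does not handle URL requirements or hash pinning.

```python
import re

# An entry counts as pinned only if it uses an exact "==" version.
PINNED = re.compile(r"^[A-Za-z0-9._\-\[\]]+\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

A scanner would typically report each returned line as a low or medium finding, since an unpinned dependency can change underneath an already-reviewed skill.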

LLM reasoning (optional)

Intent vs. actual behavior
Hidden instructions & triggers
Tool & memory poisoning
Excessive permissions
Agentic risk & impact
Prunes false positives

↓

4. Findings Aggregation

Consolidate and prioritize what matters.

Aggregate: deduplicate · correlate · normalize

Score & classify: severity · confidence · exploitability
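A minimal sketch of the aggregation step might deduplicate identical findings and order the rest by severity. The `Finding` fields and the severity ranking below are assumptions for this example. Correlation, confidence, and exploitability scoring are left out to keep it short.

```python
from dataclasses import dataclass

# Assumed ranking: lower number sorts first.
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MED": 2, "LOW": 3}


@dataclass(frozen=True)
class Finding:
    rule: str
    severity: str
    path: str
    line: int


def aggregate(findings: list[Finding]) -> list[Finding]:
    """Drop exact duplicates, then sort by severity, path, line, and rule."""
    unique = dict.fromkeys(findings)  # frozen dataclasses are hashable
    return sorted(
        unique,
        key=lambda f: (SEVERITY_RANK[f.severity], f.path, f.line, f.rule),
    )
```

Deduplicating on the full tuple is deliberately conservative: two findings from the same rule at different lines are kept as separate entries.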

↓

5. Risk Scoring

Quantify overall risk, 0–100.

● Install not recommended
● Install with caution
● Safe to install
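Mapping a 0 to 100 score onto the three recommendations can be as small as the sketch below. The cut-offs of 30 and 70 are hypothetical values chosen for illustration. The article does not state the thresholds a real engine uses, and teams would normally tune them to their own risk tolerance.

```python
# Hypothetical thresholds, not taken from any real scanner.
CAUTION_THRESHOLD = 30
BLOCK_THRESHOLD = 70


def recommendation(score: int) -> str:
    """Translate a 0-100 risk score into one of three install recommendations."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= BLOCK_THRESHOLD:
        return "Install not recommended"
    if score >= CAUTION_THRESHOLD:
        return "Install with caution"
    return "Safe to install"
```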

↓

6. Outputs & Governance

Actionable results in multiple formats.

⌨️Terminal 📋JSON 📝Markdown 🔗SARIF for CI/CD

Governance decision: use the results to allow, block, or review the skill before it installs.
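For the CI/CD output, a SARIF 2.1.0 log is plain JSON with a small required skeleton. The sketch below builds one from a list of finding dictionaries. The input keys (`rule`, `severity`, `message`, `path`, `line`), the tool name, and the severity-to-level mapping are assumptions for this example and do not reflect SkillSpector's actual output.

```python
import json

# Assumed mapping from scanner severities to SARIF result levels.
SARIF_LEVEL = {"HIGH": "error", "MED": "warning", "LOW": "note"}


def to_sarif(findings: list[dict], tool_name: str = "skill-scanner-sketch") -> dict:
    """Build a minimal SARIF 2.1.0 log from hypothetical finding dicts."""
    rule_ids = sorted({f["rule"] for f in findings})
    results = [
        {
            "ruleId": f["rule"],
            "level": SARIF_LEVEL[f["severity"]],
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        }
        for f in findings
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name,
                                "rules": [{"id": r} for r in rule_ids]}},
            "results": results,
        }],
    }


def write_sarif(findings: list[dict], path: str) -> None:
    """Serialize the SARIF log to a file that a CI system can upload."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(to_sarif(findings), fh, indent=2)
```

Emitting SARIF means the findings can show up alongside other static-analysis results in code-review tooling, rather than living in a separate report.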

The reference engine here is NVIDIA SkillSpector (open-source, Apache-2.0). Static analysis won't catch runtime behavior — but it gives teams a starting pattern, which is more than most have today.

But a scanner can cry wolf — here's a real one

We ran SkillSpector (static-only) against clipify, a perfectly benign video-clipping skill. The raw engine flagged it CRITICAL, 100/100, DO NOT INSTALL. Six of its seven findings were false positives. This is exactly why a raw scanner score is a starting point, not a verdict.

Raw engine

100 · CRITICAL · DO NOT INSTALL

MED · MCP Least Privilege: no permissions field
MED · Rogue Agent: matched README prose
LOW · Excessive Agency: "not limited to" in LICENSE
HIGH · Prompt Injection: bytes of preview.png
HIGH · Tool Misuse: bytes of preview.png
HIGH · Tool Misuse: bytes of preview.png
HIGH · Tool Misuse: bytes of preview.png

After a policy gate

40 · MEDIUM · PASS (review)

MED · MCP Least Privilege: real, add a permissions field
MED · Rogue Agent: worth a human glance

✓ 5 false positives suppressed:
• 4 HIGH = regex over the raw bytes of a PNG (binary scanned as text)
• 1 LOW = boilerplate language in the LICENSE file

The lesson isn't "scanners are bad." It's that the raw score is the input, not the decision. The value is in the gate on top: suppress the structural noise, encode your threat model, and make the call a human can stand behind.
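The gate described above can be sketched as a filter that runs before scoring. In the example below, findings on binary assets and license boilerplate are suppressed, and the score is recomputed from what remains. The severity weights are hypothetical, picked only so that the clipify findings reproduce the 100 and 40 shown earlier. They are not SkillSpector's scoring formula, and the `SKILL.md` and `README.md` paths are assumed for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str
    severity: str
    path: str


# Hypothetical weights and suppression lists for this sketch only.
WEIGHTS = {"HIGH": 25, "MED": 20, "LOW": 5}
BINARY_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".zip")
BOILERPLATE_FILES = {"LICENSE", "LICENSE.txt", "LICENSE.md"}


def score(findings: list[Finding]) -> int:
    """Sum severity weights and cap the result at 100."""
    return min(100, sum(WEIGHTS[f.severity] for f in findings))


def gate(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (kept, suppressed) using structural rules."""
    kept, suppressed = [], []
    for f in findings:
        name = f.path.rsplit("/", 1)[-1]
        if name.lower().endswith(BINARY_SUFFIXES) or name in BOILERPLATE_FILES:
            suppressed.append(f)
        else:
            kept.append(f)
    return kept, suppressed


# The seven clipify findings from the article; file paths partly assumed.
CLIPIFY = [
    Finding("MCP Least Privilege", "MED", "SKILL.md"),
    Finding("Rogue Agent", "MED", "README.md"),
    Finding("Excessive Agency", "LOW", "LICENSE"),
    Finding("Prompt Injection", "HIGH", "preview.png"),
    Finding("Tool Misuse", "HIGH", "preview.png"),
    Finding("Tool Misuse", "HIGH", "preview.png"),
    Finding("Tool Misuse", "HIGH", "preview.png"),
]

kept, suppressed = gate(CLIPIFY)
print(score(CLIPIFY), score(kept), len(suppressed))
```

Keeping the suppressed list, rather than discarding it, leaves an audit trail: a reviewer can see what the gate ignored and why, and can tighten the rules if a suppression ever hides something real.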

Scan a skill yourself — right here

Paste a SKILL.md (or any skill prompt) below. This runs a handful of the same pattern categories a real scanner uses — entirely in your browser, nothing is uploaded. It's a teaching toy, not a production gate, but you'll feel the problem in about three seconds.

0 · Safe | 50 · Caution | 100 · Critical

Runs 100% client-side — your text never leaves the page. Detects ~8 illustrative patterns; a real engine like SkillSpector runs 64 across 16 categories plus an LLM pass. Don't make trust decisions on this widget.

This is a software-development discipline, not a checkbox

A scanner is the easy 20%. The hard 80% is the program around it: a policy that fits your risk tolerance, a CI gate that blocks the right things, custom rules for your secrets and endpoints, and the review habit that makes it stick. That's what we build with teams.

Assess your agent-stack readiness →

Skill security is one layer of the bigger picture — see AI Governance: Ship Fast Without Shipping Risk.