protectskills/MaliciousAgentSkillsBench
53 stars · Last commit 2026-05-30
A Security Benchmark for Claude Code Agent Skills
README preview
# "Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills

This repository contains a security benchmark dataset and evaluation framework for Claude Code Agent Skills. The paper reports a **three-tiered, nested** dataset:

- **98,380 skills** (Tier 1): the full ecosystem snapshot, collected from two major platforms (skills.rest and skillsmp.com).
- **4,287 statically-flagged suspicious candidates** (Tier 2): a subset of the snapshot.
- **157 behaviorally-confirmed malicious skills** (Tier 3): a verified subset of the 4,287 candidates, not a separate group.

## Project Structure

```
MaliciousAgentSkillsBench/
├── data/                        # Benchmark datasets
│   ├── malicious_skills.csv     # 157 malicious skill samples
│   └── skills_dataset.csv       # Ecosystem snapshot; see Data section
├── code/                        # Security analysis framework
│   ├── helper.py                # Interactive reproduction CLI (main entry point)
│   ├── analyzer/                # Optional LLM-assisted triage
│   ├── crawler/                 # Multi-platform data crawler (registry crawler)
│   ├── executor/                # Dynamic execution in Docker sandbox (behavioral verification harness)
│   ├── scanner/                 # Static rule-based security scanner (static analysis rules)
│   ├── analysis/                # RQ2 statistics: taxonomy counts + co-occurrence + hypothesis tests
```
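
Because the three tiers are nested, each count can be read as a fraction of the tier that contains it. The sketch below is only an illustration of that arithmetic using the counts quoted in the README text; the constant and function names are hypothetical and are not part of the repository's code.

```python
# Tier sizes as quoted in the README text above.
# The names below are illustrative assumptions, not identifiers from the repo.
TIER1_SNAPSHOT = 98_380    # all crawled skills
TIER2_CANDIDATES = 4_287   # statically flagged as suspicious (subset of Tier 1)
TIER3_CONFIRMED = 157      # behaviorally confirmed malicious (subset of Tier 2)


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rejecting non-nested counts."""
    if whole <= 0 or not 0 <= part <= whole:
        raise ValueError("nested tiers require 0 <= part <= whole and whole > 0")
    return 100.0 * part / whole


# Fraction of the snapshot that static rules flag.
flag_rate = share(TIER2_CANDIDATES, TIER1_SNAPSHOT)
# Fraction of flagged candidates confirmed by dynamic execution.
confirm_rate = share(TIER3_CONFIRMED, TIER2_CANDIDATES)
# Fraction of the whole snapshot that is confirmed malicious.
prevalence = share(TIER3_CONFIRMED, TIER1_SNAPSHOT)

print(f"flagged / snapshot:   {flag_rate:.2f}%")
print(f"confirmed / flagged:  {confirm_rate:.2f}%")
print(f"confirmed / snapshot: {prevalence:.2f}%")
```

With these counts, roughly 4.36% of the snapshot is flagged, about 3.66% of flagged candidates are confirmed, and confirmed malicious skills make up about 0.16% of the snapshot.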