Overview

A practical method for deciding when repeated agent instructions should become a skill, how to structure one, and how to test it without treating skills as safety guarantees.

When to use this

You keep pasting the same context, checklist, templates, or workflow into an agent and want the behavior to be reusable across sessions.

Create The Smallest Skill Folder

Start with one concrete skeleton. SKILL.md holds the metadata, trigger description, workflow, output expectations, and verification steps. Use references/ for long background material the agent should load only when needed. Use scripts/ for deterministic checks, conversion, or validation that should not be improvised.

  • SKILL.md: frontmatter, description, workflow, constraints, examples, and final check.
  • references/: source notes, style guides, policy details, or long examples.
  • scripts/: small documented commands with predictable inputs, outputs, and dry-run behavior.
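
The skeleton above can be sketched as a small scaffolding helper. This is a minimal illustration, not an official generator: the function name create_skill_skeleton, the template headings, and the frontmatter fields (name, description) are assumptions made for this example, so check your agent platform's documentation for the schema it actually expects.

```python
from pathlib import Path

# Assumed frontmatter fields and section headings, for illustration only.
SKILL_TEMPLATE = """\
---
name: {name}
description: Use this skill when {trigger}. Does not handle {out_of_scope}.
---

## Workflow

1. (first step)

## Output expectations

(required format)

## Final check

- (verification item)
"""


def create_skill_skeleton(root: Path, name: str, trigger: str, out_of_scope: str) -> Path:
    """Create <root>/<name>/ containing SKILL.md, references/, and scripts/."""
    skill_dir = root / name
    (skill_dir / "references").mkdir(parents=True, exist_ok=True)
    (skill_dir / "scripts").mkdir(exist_ok=True)

    skill_md = skill_dir / "SKILL.md"
    if not skill_md.exists():  # never overwrite an existing skill file
        skill_md.write_text(
            SKILL_TEMPLATE.format(name=name, trigger=trigger, out_of_scope=out_of_scope),
            encoding="utf-8",
        )
    return skill_dir
```

Calling create_skill_skeleton(Path("skills"), "release-notes", "the user asks for release notes", "changelog hosting") would produce a folder you can then fill in by hand. The helper refuses to overwrite an existing SKILL.md so it stays safe to rerun.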

Write The Trigger Test First

The description is the routing layer: the agent reads it to decide whether to load the skill. Before polishing the workflow, write a trigger test: three requests that should load the skill, three paraphrases that should still load it, and three nearby requests that should not. This catches vaguely scoped skills before they misfire in real sessions.

  • Start with 'Use this skill when...'.
  • Include common trigger phrases.
  • State what the skill does not handle.
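
One way to keep the trigger test repeatable is to store the nine requests as data and compare them against whatever decides to load the skill. The sketch below is hypothetical throughout: the "release-notes" example requests are invented, and keyword_router is only a naive stand-in for the real agent, which routes on the description in ways a keyword match cannot reproduce.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TriggerCase:
    request: str
    should_load: bool


def run_trigger_test(cases: list[TriggerCase], loads_skill: Callable[[str], bool]) -> list[TriggerCase]:
    """Return the cases where the routing decision differs from the expectation."""
    return [case for case in cases if loads_skill(case.request) != case.should_load]


# Invented cases for an imaginary "release-notes" skill.
CASES = [
    # Three requests that should load the skill.
    TriggerCase("Write release notes for v2.3", True),
    TriggerCase("Draft the release notes from these merged PRs", True),
    TriggerCase("Update the changelog for this release", True),
    # Three paraphrases that should still load it.
    TriggerCase("Summarize what changed since the last version for customers", True),
    TriggerCase("Turn these commits into a what's-new post", True),
    TriggerCase("What should we tell users about this update?", True),
    # Three nearby requests that should not.
    TriggerCase("Write a commit message for this diff", False),
    TriggerCase("Review this pull request", False),
    TriggerCase("Explain what a changelog is", False),
]


def keyword_router(request: str) -> bool:
    """Stand-in for the real agent: a naive keyword match, for demonstration only."""
    text = request.lower()
    return "release notes" in text or "changelog" in text


failures = run_trigger_test(CASES, keyword_router)
for case in failures:
    print(f"mismatch (expected load={case.should_load}): {case.request}")
```

With this stand-in, the three paraphrases are missed and "Explain what a changelog is" loads the skill when it should not. Those are the two kinds of mismatch the trigger test exists to surface. In practice, replace keyword_router with a manual or automated check against your actual agent.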

Keep Long Material Out Of SKILL.md

Good skills practice progressive disclosure. Put the operating workflow in SKILL.md, then link to references, templates, and scripts so the agent loads them only when it needs them. This keeps context focused and makes large skill packages maintainable.

  • Keep workflow instructions in the main file.
  • Move long factual references into references/.
  • Use scripts/ for deterministic checks that agents repeat badly.
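
A small check can flag a SKILL.md that has grown too long or that points at files which do not exist. The following is a sketch under stated assumptions: the 200-line budget is an arbitrary number picked for this example rather than a documented limit, and the pattern only recognizes plain relative paths under references/, scripts/, or assets/.

```python
import re
from pathlib import Path

# Matches plain relative paths such as references/style.md or scripts/check.py.
LINK_PATTERN = re.compile(r"\b((?:references|scripts|assets)/[\w./-]+)")
MAX_LINES = 200  # arbitrary budget for this sketch, not a documented limit


def check_disclosure(skill_dir: Path, max_lines: int = MAX_LINES) -> list[str]:
    """Return human-readable problems found in <skill_dir>/SKILL.md."""
    problems = []
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")

    line_count = len(text.splitlines())
    if line_count > max_lines:
        problems.append(f"SKILL.md has {line_count} lines (budget {max_lines})")

    for match in LINK_PATTERN.finditer(text):
        relative = match.group(1).rstrip(".,")  # drop sentence punctuation
        if not (skill_dir / relative).exists():
            problems.append(f"missing file: {relative}")
    return problems
```

An empty list means the main file is within budget and every path it mentions resolves inside the skill folder.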

Run The Safety Checklist

Skills can contain executable scripts, file instructions, and network-facing guidance. Run the safety checklist before enabling or sharing a skill: no secrets in the folder, no hidden network calls, no destructive defaults, dry-run where possible, and approval required before files, accounts, money, or production systems change.

  • Never store secrets in a skill directory.
  • Require approval for destructive or account-changing actions.
  • Audit third-party skills before installation.
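
Parts of the checklist can be pre-screened mechanically before a human reads the folder. The sketch below is a rough heuristic only: the patterns in CHECKS are illustrative assumptions that will produce both false positives and false negatives, so a finding is a prompt for review and a clean result is not proof of safety.

```python
import re
from pathlib import Path

# Illustrative patterns only; tune them for your own environment.
CHECKS = {
    "possible secret": re.compile(
        r"\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+", re.IGNORECASE
    ),
    "network call": re.compile(
        r"\b(?:curl|wget|requests\.(?:get|post)|urllib\.request)\b|https?://"
    ),
    "destructive command": re.compile(
        r"\brm\s+-rf?\b|\bgit\s+push\s+--force\b|\bdrop\s+table\b", re.IGNORECASE
    ),
}


def scan_skill(skill_dir: Path) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, label) for each line that needs review."""
    findings = []
    for path in sorted(skill_dir.rglob("*")):
        if not path.is_file():
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except UnicodeDecodeError:
            continue  # skip binary files; review those by hand
        for number, line in enumerate(lines, start=1):
            for label, pattern in CHECKS.items():
                if pattern.search(line):
                    findings.append((path.relative_to(skill_dir).as_posix(), number, label))
    return findings
```

Running scan_skill on a third-party skill before installation gives a short list of lines to read first. It does not replace reading the scripts themselves.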

Method

  1. Write the repeated job in one sentence and decide whether the missing piece is procedural knowledge rather than a new tool.
  2. Create the smallest folder: SKILL.md, optional references/ for long context, optional scripts/ for deterministic helpers, and optional assets/ for templates.
  3. Define the trigger in user language so the agent knows when to load the skill and when to ignore it.
  4. Write a trigger test with obvious matches, paraphrased matches, and nearby non-matches.
  5. Move long references, templates, and helper scripts outside the main instructions so the agent can load them only when needed.
  6. Run a safety checklist for scripts, network access, credentials, destructive actions, and untrusted external content.

Before you start

What to collect first

  • Repeated prompt: Start from instructions you already reuse so the skill solves a real recurring need.
  • Representative tasks: Use examples that show when the skill should load and what output it should produce.
  • Known failure modes: List ways the agent currently overdoes, misses, or misroutes the work.
  • Reference material: Separate long examples, templates, and source notes from the main SKILL.md workflow.
  • Safe test case: Pick a small task that proves the skill works without touching sensitive files or accounts.

Useful files and checks

  • Skill folder: Create a small folder that can hold SKILL.md, references, scripts, and assets clearly.
  • Reference files: Keep long context outside the main instructions so the agent loads it only when needed.
  • Validation checklist: Check triggers, outputs, scripts, secrets, and approval rules before sharing the skill.

Decision points

Should this become a skill or stay a prompt?
Create a skill when the workflow repeats, has stable triggers, needs examples or references, and has a clear verification step. Keep it as a prompt when the instruction is one-off or short-lived.
Should the workflow become a skill or a tool?
Choose a tool when the agent lacks executable capability. Choose a skill when the agent has capability but needs a reliable method for using it.
Should scripts be included?
Include scripts for deterministic parsing, validation, conversion, or repetitive commands. Keep scripts non-interactive, documented, and safe to run in a test case.
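
As an illustration of a script that is non-interactive, documented, and safe to run in a test case, here is a minimal sketch of a file-renaming helper that only prints its plan unless --apply is passed. The task, the flag name, and the function names are all hypothetical choices for this example.

```python
import argparse
import sys
from pathlib import Path


def plan_renames(folder: Path) -> list[tuple[Path, Path]]:
    """Deterministically map each file to a lowercase, hyphenated name."""
    plan = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        target = path.with_name(path.name.lower().replace(" ", "-"))
        if target != path:
            plan.append((path, target))
    return plan


def main(argv: list[str]) -> int:
    parser = argparse.ArgumentParser(description="Normalize file names in a folder.")
    parser.add_argument("folder", type=Path, help="folder whose files should be renamed")
    parser.add_argument(
        "--apply",
        action="store_true",
        help="perform the renames; without this flag the script only prints the plan",
    )
    args = parser.parse_args(argv)

    if not args.folder.is_dir():
        print(f"error: {args.folder} is not a directory", file=sys.stderr)
        return 2

    for source, target in plan_renames(args.folder):
        if target.exists():
            print(f"error: {target.name} already exists; skipping", file=sys.stderr)
            continue
        verb = "rename" if args.apply else "would rename"
        print(f"{verb}: {source.name} -> {target.name}")
        if args.apply:
            source.rename(target)
    return 0
```

To use it as a command, call sys.exit(main(sys.argv[1:])) from the script's entry point. Making dry-run the default means an agent that forgets a flag cannot change anything, and the explicit exit codes and stderr messages give it clear errors to act on.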

Common mistakes

  • Writing a long essay about a topic instead of an agent-actionable workflow.
  • Creating one broad skill that covers several unrelated jobs.
  • Omitting negative triggers, so the skill loads when another skill or simple tool would be better.
  • Shipping scripts without dry-run behavior, clear errors, or permission notes.

Troubleshooting

The skill does not trigger for real user requests.
Rewrite the description around user intent and add two or three phrases the user is likely to say.
The skill triggers too often.
Narrow the description, add out-of-scope language, and split broad workflows into smaller skills.
The agent follows the skill but produces inconsistent output.
Add a required output format, a small example, and a verification checklist the agent must complete before the final response.

Sources

This playbook is authored from multiple references. Open the originals to inspect details, examples, and current guidance before adapting it.

Notes

Agent skills can steer tools, scripts, and external context. Review permissions and outputs; a skill is guidance, not a trust guarantee.
