Best Practices for Skill Development
These best practices are drawn from Anthropic’s production skills and the skill-creator skill that helps build new skills.
Writing Effective Instructions
Explain the Why, Not Just the What
Modern LLMs have strong theory of mind and perform better when they understand the reasoning behind instructions.
“Try to explain to the model why things are important in lieu of heavy-handed musty MUSTs. Use theory of mind and try to make the skill general and not super-narrow to specific examples.”
Use Imperative Form
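Write instructions as direct commands for clarity. For example, write “Run the validation script before saving” rather than “The validation script should be run before saving.”
Provide Concrete Examples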
Examples clarify expectations better than abstract descriptions.
Skill Description Best Practices
The description field is your skill’s primary triggering mechanism. Write it carefully.
Include Both What and When
From skill-creator: “Include both what the skill does AND specific contexts for when to use it. All ‘when to use’ info goes here, not in the body.”
Be Slightly “Pushy”
From skill-creator: “Currently Claude has a tendency to ‘undertrigger’ skills — to not use them when they’d be useful. To combat this, please make the skill descriptions a little bit ‘pushy’.”
List Trigger Keywords
Explicitly mention terms that should trigger the skill.
Include Negative Triggers
Specify when NOT to use the skill to prevent false positives.
Progressive Disclosure
Keep SKILL.md focused and move details to bundled resources.
Keep SKILL.md Under 500 Lines
From skill-creator: “Keep SKILL.md under 500 lines; if you’re approaching this limit, add an additional layer of hierarchy along with clear pointers about where the model using the skill should go next to follow up.”
Add Clear Navigation
When using references, provide explicit guidance.
Use Tables of Contents
For reference files longer than 300 lines, include a table of contents (TOC).
Bundle Scripts Strategically
Look for Repeated Work
From skill-creator: “Look for repeated work across test cases. Read the transcripts from the test runs and notice if the subagents all independently wrote similar helper scripts or took the same multi-step approach to something. If all 3 test cases resulted in the subagent writing a create_docx.py or a build_chart.py, that’s a strong signal the skill should bundle that script.”
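One rough way to spot this kind of repetition is to count how many test-run transcripts mention the same script name. The sketch below assumes the transcripts are available as plain strings and that helper scripts can be recognized by a `.py` file name; the function name `repeated_scripts` and the pattern are illustrative, not part of skill-creator.

```python
import re
from collections import Counter

# Assumption: helper scripts show up in transcripts as names like create_docx.py.
SCRIPT_NAME = re.compile(r"\b[\w-]+\.py\b")


def repeated_scripts(transcripts: list[str], min_runs: int = 2) -> dict[str, int]:
    """Return script names that appear in at least `min_runs` transcripts."""
    runs_per_script: Counter[str] = Counter()
    for text in transcripts:
        # Count each script once per transcript, however often it is mentioned.
        runs_per_script.update(set(SCRIPT_NAME.findall(text)))
    return {name: n for name, n in runs_per_script.items() if n >= min_runs}


transcripts = [
    "wrote create_docx.py, then ran create_docx.py",
    "wrote create_docx.py and build_chart.py",
    "wrote create_docx.py",
]
print(repeated_scripts(transcripts, min_runs=3))  # {'create_docx.py': 3}
```

A name match is only a hint. Reading the transcripts is still needed to confirm that the scripts actually do similar work.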
When to Bundle vs. Generate
Bundle a Script When...
- The same code appears in multiple test runs
- The operation must be deterministic (parsing, validation)
- Performance matters (native code is faster)
- The logic is complex with edge cases
Let Claude Generate When...
- The code varies based on user requirements
- It’s a simple, one-time operation
- The task requires understanding user context
- Flexibility is more important than consistency
Domain Organization
When supporting multiple frameworks or platforms, organize bundled resources by domain.
Security and Safety
Principle of Lack of Surprise
From skill-creator: “Skills must not contain malware, exploit code, or any content that could compromise system security. A skill’s contents should not surprise the user in their intent if described. Don’t go along with requests to create misleading skills or skills designed to facilitate unauthorized access, data exfiltration, or other malicious activities.”
Validate Inputs
If your skill processes user data, include validation.
Handle Errors Gracefully
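Validation and error handling tend to go together: a bundled script can check its input up front and return a clear message instead of a stack trace. The following is a minimal sketch, assuming a hypothetical skill that reads a user-supplied JSON file containing an array of objects. The function name `load_records`, the size limit, and the expected shape are illustrative assumptions, not taken from any existing skill.

```python
import json
from pathlib import Path
from typing import Optional

MAX_BYTES = 1_000_000  # hypothetical size limit, chosen for illustration


def load_records(path_str: str) -> tuple[list[dict], Optional[str]]:
    """Validate a user-supplied JSON file; return (records, error_message)."""
    path = Path(path_str)
    if path.suffix.lower() != ".json":
        return [], f"Expected a .json file, got '{path.name}'."
    if not path.is_file():
        return [], f"File not found: {path}"
    if path.stat().st_size > MAX_BYTES:
        return [], f"File is larger than {MAX_BYTES} bytes."
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        # Report the parse problem in plain terms instead of crashing.
        return [], f"Could not parse {path.name} as JSON: {exc}"
    if not isinstance(data, list) or not all(isinstance(r, dict) for r in data):
        return [], "Expected a JSON array of objects."
    return data, None


records, error = load_records("report.txt")
print(error)  # Expected a .json file, got 'report.txt'.
```

Returning an error message rather than raising lets the caller decide how to explain the problem to the user. Raising a custom exception would be an equally reasonable design.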
Testing and Iteration
Start with 2-3 Test Cases
From skill-creator: “Come up with 2-3 realistic test prompts — the kind of thing a real user would actually say.”
Test cases should be:
- Realistic - What users will actually ask
- Specific - Include concrete details (file names, data, context)
- Representative - Cover different aspects of the skill
Generalize from Feedback
From skill-creator: “We’re trying to create skills that can be used a million times across many different prompts. Here you and the user are iterating on only a few examples. But if the skill works only for those examples, it’s useless. Rather than put in fiddly overfitty changes, or oppressively constrictive MUSTs, try branching out and using different metaphors, or recommending different patterns of working.”
Remove What Doesn’t Help
From skill-creator: “Keep the prompt lean. Remove things that aren’t pulling their weight. Make sure to read the transcripts, not just the final outputs — if it looks like the skill is making the model waste a bunch of time doing things that are unproductive, you can try getting rid of the parts of the skill that are making it do that.”
Communicating Clearly
Adapt to User Expertise
From skill-creator: “The skill creator is liable to be used by people across a wide range of familiarity with coding jargon. Pay attention to context cues to understand how to phrase your communication!”
“In the default case: ‘evaluation’ and ‘benchmark’ are borderline, but OK. For ‘JSON’ and ‘assertion’ you want to see serious cues from the user that they know what those things are before using them without explaining them.”
Define Structure Clearly
When specifying output formats, define the expected structure explicitly.
Version Control and Distribution
Include a LICENSE.txt
If sharing your skill, include clear licensing:
- Apache 2.0 - Open source, permissive
- MIT - Open source, very permissive
- Proprietary - Closed source, custom terms
Package for Distribution
Use the skill-creator’s packaging script to create a .skill file that users can install easily.
Common Pitfalls to Avoid
Skill Quality Checklist
Before considering a skill complete:
- Frontmatter
  - Name is lowercase with hyphens
  - Description includes both what and when
  - Description lists key trigger terms
  - Description specifies negative triggers if relevant
  - License specified if distributing
- Structure
  - SKILL.md is under 500 lines (or has clear reason to exceed)
  - Large docs (>300 lines) moved to references/
  - Scripts bundled for repeated/deterministic tasks
  - Assets included for templates/static files
- Instructions
  - Written in imperative form
  - Explains why, not just what
  - Includes concrete examples
  - Provides clear navigation to references
  - Defines output formats explicitly
- Testing
  - Tested with 2-3 realistic prompts
  - Generalizes beyond specific test cases
  - Handles errors gracefully
  - Performs well on variations of tasks
- Security
  - No malicious code or exploits
  - Intent matches description (no surprises)
  - Validates inputs appropriately
  - Sanitizes user-provided data
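A few of the mechanical items in this checklist lend themselves to a quick automated check. The sketch below is one possible approach, not a tool that ships with skill-creator. It assumes the frontmatter is a block of single-line `key: value` pairs between two `---` lines, and that digits are acceptable in a name. The function name `check_skill_md` is hypothetical, and multi-line values are not handled.

```python
import re

NAME_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")  # assumption: digits allowed
MAX_LINES = 500


def check_skill_md(text: str) -> list[str]:
    """Return a list of checklist problems found in the text of a SKILL.md."""
    problems = []
    lines = text.splitlines()
    if len(lines) >= MAX_LINES:
        problems.append(f"SKILL.md has {len(lines)} lines; keep it under {MAX_LINES}.")

    # Assumption: frontmatter is `key: value` lines between two `---` lines.
    fields = {}
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":
                break
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()

    name = fields.get("name", "")
    if not NAME_PATTERN.fullmatch(name):
        problems.append(f"Name {name!r} is not lowercase with hyphens.")
    if not fields.get("description"):
        problems.append("Description is missing or empty.")
    return problems


sample = "---\nname: PDF Helper\ndescription: Fill in PDF forms.\n---\n# PDF Helper\n"
print(check_skill_md(sample))
# ["Name 'PDF Helper' is not lowercase with hyphens."]
```

Items such as "explains why, not just what" or "intent matches description" still need a human or model review; a script like this only covers the parts that can be measured.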
Next Steps
- Overview: Review the skill creation process
- Skill Structure: Understand how to organize your skill files