Prerequisites
Make sure you have Python 3.10+ and Skill Lab installed.
Your First Evaluation
Let’s evaluate a skill using Skill Lab’s static analysis.
Create a sample skill
Create a directory with a SKILL.md file, then create SKILL.md with this content:
Usage
Use this skill when you need to demonstrate basic functionality.
After you run the evaluation, you’ll see a rich-formatted report with:
- Quality score (0-100)
- Check results by dimension
- Passed and failed checks
- Suggestions for improvement
Common Commands
Now that you’ve run your first evaluation, try these commands:
Understanding the Output
Here’s what the evaluation report includes:
Quality Score
A weighted 0-100 score based on check results:
- 90-100: Excellent quality
- 70-89: Good quality, minor improvements needed
- 50-69: Fair quality, several issues to address
- Below 50: Needs significant improvement
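The bands above can be expressed as a small lookup. This is a minimal sketch for interpreting a score in your own scripts, not part of Skill Lab itself; the function name `score_band` and the short band labels are illustrative assumptions.

```python
def score_band(score: float) -> str:
    """Map a 0-100 quality score to the band names listed above.

    Hypothetical helper: the thresholds come from the list above,
    the function itself is not a Skill Lab API.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Fair"
    return "Needs significant improvement"


if __name__ == "__main__":
    for value in (95, 72, 50, 12):
        print(value, score_band(value))
```

A helper like this can be handy when you want to gate a release on a minimum band rather than on a raw number.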
Check Dimensions
Checks are organized into 4 dimensions:
- Structure: File existence, frontmatter format, standard fields
- Naming: Skill name format, directory matching
- Description: Required fields, max length, non-empty
- Content: Examples, line budget, reference validation
Severity Levels
Each check has a severity:
- Error: Must fix for spec compliance
- Warning: Important quality issues
- Info: Suggestions for best practices
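One way to act on these severities is to treat only Error-level failures as blocking. The sketch below assumes failed checks are available as `(check_name, severity)` pairs; that shape, the check names, and the helper names are hypothetical and do not reflect Skill Lab’s actual output format.

```python
from collections import Counter
from enum import Enum


class Severity(Enum):
    """The three severity levels described above."""
    ERROR = "error"
    WARNING = "warning"
    INFO = "info"


def summarize_failures(failed):
    """Count failed checks per severity.

    `failed` is a list of (check_name, Severity) pairs (assumed shape).
    """
    counts = Counter(severity for _, severity in failed)
    return {severity: counts.get(severity, 0) for severity in Severity}


def is_spec_compliant(failed):
    """A skill is treated as compliant when no Error-level check failed."""
    return all(severity is not Severity.ERROR for _, severity in failed)


if __name__ == "__main__":
    # Hypothetical check names, for illustration only.
    failed = [
        ("frontmatter-valid", Severity.ERROR),
        ("has-examples", Severity.INFO),
    ]
    print(summarize_failures(failed))
    print("compliant:", is_spec_compliant(failed))
```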
Advanced: Generate Trigger Tests
Generate LLM-powered trigger test cases:
Generate test cases
This creates .skill-lab/tests/triggers.yaml with ~13 test cases across 4 trigger types.
Review the tests
Open .skill-lab/tests/triggers.yaml to see:
- Explicit triggers (direct skill name)
- Implicit triggers (need description)
- Contextual triggers (realistic prompts)
- Negative triggers (should NOT activate)
Run the tests (optional)
Requires Claude CLI to be installed.
Practical Examples
- CI/CD Validation
- Quality Tracking
- Export for Agents
- Batch Processing
Use in GitHub Actions or other CI:
.github/workflows/validate-skills.yml
Next Steps
Now that you’ve run your first evaluation, explore more features:
Static Analysis Guide
Deep dive into the 19 static checks
Trigger Testing Guide
Learn about trigger testing in detail
Quality Scoring
Understand how scores are calculated
Output Formats
Master console and JSON output
Skill Format
Learn the SKILL.md format specification
Check Catalog
Browse all available checks
Common Issues
No SKILL.md found
Make sure you’re in a directory with a SKILL.md file, or specify the path:
Frontmatter parsing errors
Ensure your frontmatter is valid YAML:
Common issues:
- Missing closing ---
- Invalid YAML syntax
- Non-string field values
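To catch these problems before running an evaluation, a rough pre-check can look for the `---` delimiters and for obviously non-string values. This is a minimal sketch using only the standard library. It handles flat `key: value` lines only, is not a YAML parser, and is not how Skill Lab itself validates frontmatter. The function name and the sample field names are illustrative assumptions.

```python
import re

# Matches plain integers and decimals such as 42 or -1.5.
_NUMBER = re.compile(r"[-+]?\d+(\.\d+)?")


def frontmatter_problems(text: str) -> list[str]:
    """Return a list of likely frontmatter problems (empty if none found)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening ---"]
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1)
            if line.strip() == "---"
        )
    except StopIteration:
        return ["missing closing ---"]

    problems = []
    for number, line in enumerate(lines[1:end], start=2):
        # Skip blanks, comments, and indented continuation lines.
        if not line.strip() or line[0].isspace() or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            problems.append(f"line {number}: expected 'key: value'")
            continue
        value = value.strip()
        looks_non_string = (
            value.startswith(("[", "{"))
            or value.lower() in {"true", "false", "null"}
            or _NUMBER.fullmatch(value) is not None
        )
        if looks_non_string:
            problems.append(
                f"line {number}: '{key.strip()}' has a non-string value"
            )
    return problems


if __name__ == "__main__":
    # Field names here are examples only.
    sample = "---\nname: my-skill\ndescription: 42\n---\n# My Skill\n"
    print(frontmatter_problems(sample))
```

Quoting a value (for example `"42"`) makes it a string in YAML, which is usually the simplest fix for the third issue.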
Low quality score
Run with --verbose to see all checks:
This shows both passing and failing checks, helping you identify improvements.