The `predict.py` script uses your trained model to predict folder assignments for unsorted videos, with configurable confidence thresholds and automatic file organization.
## How Predictions Work
The inference pipeline:

- Load the trained model from `artifacts/model.pt` (or `.pkl`)
- Load unlabeled embeddings from `artifacts/unlabeled_embeddings.pt`
- Generate predictions with confidence scores (a probability distribution)
- Display the top-k predictions per video
- Optionally move files to predicted folders
- Save detailed predictions to `artifacts/predictions.json`
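In outline, these steps amount to a matrix multiply, a softmax, and a top-k sort. A self-contained sketch with synthetic data and a stand-in linear model (the real script instead loads its trained weights from `artifacts/model.pt` and embeddings from `artifacts/unlabeled_embeddings.pt`):

```python
# Illustrative only: synthetic embeddings and random weights stand in for
# the trained model and real video embeddings.
import numpy as np

rng = np.random.default_rng(0)
classes = ["Cooking", "Travel", "Comedy"]       # hypothetical categories
embeddings = rng.normal(size=(5, 8))            # 5 videos, 8-dim embeddings
weights = rng.normal(size=(8, len(classes)))    # stand-in for trained weights

logits = embeddings @ weights
# Softmax turns each row of logits into a probability distribution
exps = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exps / exps.sum(axis=1, keepdims=True)

top_k = 3
predictions = []
for i, row in enumerate(probs):
    order = np.argsort(row)[::-1][:top_k]       # best class first
    predictions.append({
        "video": f"video_{i}.mp4",
        "top_k": [(classes[j], float(row[j])) for j in order],
        "confidence": float(row[order[0]]),     # top-1 probability
    })
```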
## Running Predictions
### Ensure Prerequisites
You need:

- `artifacts/model.pt` and `artifacts/model_config.json` from training
- `artifacts/unlabeled_embeddings.pt` from feature extraction
### Command-Line Arguments
#### `--move`

Actually move files to predicted folders (default: false).

#### `--threshold`

Minimum confidence (0.0 to 1.0) required to auto-assign a folder.

- `--threshold 0.5`: only auto-sort videos the model is reasonably confident about
- `--threshold 0.8`: very conservative; only high-confidence predictions
- `--threshold 0.0`: sort everything (default)
#### `--top-k`

Number of predictions to show per video (default: 3).

## Understanding Confidence Scores
Confidence scores are softmax probabilities, so they sum to 1.0 across all classes.

### Interpreting Confidence
| Confidence | Interpretation | Action |
|---|---|---|
| 90-100% | Very confident | Trust the prediction |
| 70-90% | Confident | Usually correct, verify if important |
| 50-70% | Uncertain | Review manually, might be ambiguous |
| <50% | Very uncertain | Definitely review, likely wrong or ambiguous |
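The "sum to 1.0" property is just softmax normalization of the model's raw scores; a minimal illustration in plain Python:

```python
# Softmax turns raw logits into a probability distribution over categories.
import math

logits = [2.0, 0.5, -1.0]          # raw scores for three categories
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]
confidence = max(probs)            # the reported confidence is the top probability
```

The largest logit always gets the highest probability, and pushing logits further apart pushes the top probability toward 1.0.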
### High Confidence (>90%)
A high-confidence prediction is almost always safe to trust.

### Moderate Confidence (50-70%)

A moderate-confidence video might:

- Be a funny TikTok (overlap between categories)
- Have features of both categories
- Be mislabeled or genuinely ambiguous

Use `--threshold` to skip auto-sorting these videos.
### Confused Predictions (Close Split)
A near-even split between two categories usually points to one of:

- A video that genuinely spans categories (e.g. cooking while traveling)
- A weak category definition
- Insufficient training data for this edge case
## Predictions Output File
All predictions are saved to `artifacts/predictions.json` for review. Use this file to:
- Audit predictions before moving files
- Find low-confidence videos for manual review
- Analyze which categories the model confuses
### Finding Low-Confidence Predictions
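A small sketch of scanning the output for uncertain videos. The field names (`file`, `predicted`, `confidence`) are assumptions about the JSON schema; adjust them to match your `predictions.json`:

```python
import json

# Inline sample standing in for the contents of artifacts/predictions.json;
# the field names here are assumed, not guaranteed by the script.
entries = json.loads("""[
  {"file": "a.mp4", "predicted": "Cooking", "confidence": 0.95},
  {"file": "b.mp4", "predicted": "Travel",  "confidence": 0.55},
  {"file": "c.mp4", "predicted": "Comedy",  "confidence": 0.42}
]""")

# Anything under 70% falls in the "review manually" band from the table above
low_confidence = [e for e in entries if e["confidence"] < 0.7]
for e in low_confidence:
    print(f"{e['file']}: {e['predicted']} ({e['confidence']:.0%})")
```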
### Finding Confused Categories
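To see which category pairs the model confuses, count predictions where the top two probabilities are close. The per-class probability field below is an assumed schema detail, and the 0.15 gap is an arbitrary "close split" cutoff:

```python
from collections import Counter

# Stand-in entries with an assumed per-class probability dict
sample = [
    {"file": "a.mp4", "probs": {"Cooking": 0.48, "Travel": 0.45, "Comedy": 0.07}},
    {"file": "b.mp4", "probs": {"Cooking": 0.90, "Travel": 0.06, "Comedy": 0.04}},
    {"file": "c.mp4", "probs": {"Travel": 0.51, "Cooking": 0.44, "Comedy": 0.05}},
]

confused_pairs = Counter()
for e in sample:
    ranked = sorted(e["probs"].items(), key=lambda kv: kv[1], reverse=True)
    (c1, p1), (c2, p2) = ranked[0], ranked[1]
    if p1 - p2 < 0.15:  # "close split": top two probabilities nearly tied
        confused_pairs[tuple(sorted((c1, c2)))] += 1

# confused_pairs now counts each frequently-confused category pair
```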
## Confidence Threshold Strategy
### Conservative Strategy (High Precision)

Use a high threshold (e.g. `--threshold 0.8`):
- High precision (few errors)
- Many videos skipped (low recall)
- Manual labeling required for uncertain cases
### Balanced Strategy

Use a moderate threshold (e.g. `--threshold 0.5`):
- Good precision (~85-90%)
- High recall (~70-80% sorted)
- Occasional errors on ambiguous videos
### Aggressive Strategy (High Recall)

Use `--threshold 0.0` (the default) to sort everything:
- Lower precision (~80-90%)
- 100% recall (all videos sorted)
- Requires post-sorting review
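Whichever strategy you pick, the mechanism is a single comparison per video; a sketch with stand-in data:

```python
# Splitting predictions by a confidence threshold (hypothetical filenames)
preds = [("a.mp4", 0.95), ("b.mp4", 0.72), ("c.mp4", 0.41)]
threshold = 0.8  # conservative setting

auto_sorted = [f for f, conf in preds if conf >= threshold]
skipped = [f for f, conf in preds if conf < threshold]
# auto_sorted == ["a.mp4"]; skipped == ["b.mp4", "c.mp4"]
```

Lowering `threshold` moves files from `skipped` into `auto_sorted`, trading precision for recall.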
## Active Learning Workflow
1. Initial sorting with a high threshold
2. Review the skipped (low-confidence) videos
3. Manually label uncertain videos via the labeling interface
4. Re-extract features and retrain with the new labels
5. Repeat until all videos are sorted or accuracy plateaus
Each iteration improves the model by teaching it edge cases. After 2-3 cycles, you typically reach 90%+ accuracy.
## Troubleshooting
### `FileNotFoundError: artifacts/unlabeled_embeddings.pt`
No unlabeled videos were found during feature extraction. This happens when all videos are already in category folders.

Solution: Move some videos back to the `data/Favorites/videos/` root.
### All predictions go to one category

The model is biased toward the majority class. Possible causes:
- Severe class imbalance: One category has 90%+ of data
- Weak features: Categories aren’t visually/audibly distinct
- Training issue: Class weighting didn’t work
Solutions:

- Balance your dataset (add more examples to minority classes)
- Verify categories have distinct content
- Check training metrics for signs of mode collapse
### Video already exists in target folder
The video was already sorted (manually or by a previous prediction run). This is not an error: the script skips the file to avoid overwriting. If you want to re-sort it, move it back to the unsorted root first.
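A sketch of moving a sorted video back so the next run reconsiders it. The directory layout follows `data/Favorites/videos/` from above, and the helper name is made up for illustration:

```python
import shutil
from pathlib import Path

def unsort(video: Path, root: Path) -> Path:
    """Move a sorted video back to the unsorted root so a later
    prediction run will pick it up again."""
    target = root / video.name
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(video), str(target))
    return target

# Usage (hypothetical paths):
# unsort(Path("data/Favorites/videos/Cooking/clip.mp4"),
#        Path("data/Favorites/videos"))
```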
### Predictions seem random (all ~same confidence)

The model hasn't learned meaningful patterns. Check:
- Training accuracy: Was it >60%? If not, model didn’t learn
- Data quality: Are videos correctly labeled?
- Feature extraction: Did it complete successfully?
## Batch Processing Large Collections
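One way to keep review manageable is to slice the prediction list into fixed-size batches; the entry list and batch size below are stand-ins:

```python
# Stand-in for entries loaded from artifacts/predictions.json
entries = [{"file": f"video_{i}.mp4", "confidence": 0.5} for i in range(120)]

batch_size = 50  # arbitrary; pick whatever you can review in one sitting
batches = [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]
# 120 entries -> batches of 50, 50, 20
```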
For collections of 500+ unlabeled videos, processing in batches keeps each review session manageable.

## Next Steps
### Labeling Interface

Use the interactive web UI to review predictions and manually label videos.