Why You Need a Token
The default docker-compose.yml configuration uses Llama 3.2 1B Instruct as the LLM, which requires only 16GB of GPU memory. This model is freely available but is gated, meaning you need to:
- Accept the model’s usage conditions
- Authenticate with a Hugging Face token
Setup Steps
Create a Hugging Face Account
If you don’t already have one, sign up for Hugging Face.
Accept Model Conditions
Visit the model page and accept the usage conditions:
- Llama 3.2 1B Instruct (default in docker-compose.yml)
- Or Mistral Small 3.2 24B (alternative, better quality)
- Or Gemma 3 12B (alternative)
Create an Access Token
- Go to Settings → Access Tokens
- Click “New token”
- Select Fine-grained token type
- Grant permission: “Read access to contents of all public gated repos you can access”
- Copy the token (starts with hf_)
Add Token to Environment
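A minimal sketch of this step, assuming a Bash-compatible shell and the variable name HUGGING_FACE_HUB_TOKEN (the name checked in the troubleshooting steps). The value hf_your_token_here is a placeholder, not a real token:

```shell
# Line to add to ~/.bashrc, ~/.zshrc, or equivalent.
# hf_your_token_here is a placeholder; paste your own token instead.
export HUGGING_FACE_HUB_TOKEN="hf_your_token_here"
```

After saving the file, running source ~/.bashrc (or opening a new terminal) should make the variable available in the current session.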
Add the token to your shell configuration file (~/.bashrc, ~/.zshrc, or equivalent) by exporting it as HUGGING_FACE_HUB_TOKEN, then reload your shell configuration.
Using the Token
Docker Compose
The docker-compose.yml file automatically passes the token to services that need it, so once the variable is set in your environment you can run docker compose up.
Dockerless Deployment
For dockerless setups, the token is automatically read from your environment when you run the startup scripts.
Troubleshooting
Token not found error
If you see errors about missing authentication:
- Verify the token is set: echo $HUGGING_FACE_HUB_TOKEN
- Make sure you’ve reloaded your shell after adding it to .bashrc
- For Docker, ensure the token was set before running docker compose up
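The first check above can be made slightly stricter. The sketch below assumes a POSIX shell; check_hf_token is a hypothetical helper name, not part of this project. It reports whether the variable is set and whether it has the expected hf_ prefix, without printing the token itself:

```shell
# Hypothetical helper: report the state of HUGGING_FACE_HUB_TOKEN
# without echoing the secret value.
check_hf_token() {
  if [ -z "${HUGGING_FACE_HUB_TOKEN:-}" ]; then
    echo "HUGGING_FACE_HUB_TOKEN is not set"
    return 1
  fi
  case "$HUGGING_FACE_HUB_TOKEN" in
    hf_*) echo "token is set and starts with hf_" ;;
    *)    echo "token is set but does not start with hf_"; return 2 ;;
  esac
}
```

Run check_hf_token in the same shell you use to start the services. A non-zero exit status suggests the variable was not exported or the shell configuration was not reloaded.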
Access denied to gated model
If you get permission errors:
- Confirm you’ve accepted the model conditions on Hugging Face
- Wait a few minutes for the permissions to propagate
- Verify your token has “Read access to contents of all public gated repos”
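To see which of these causes applies, you can request a small file from the gated repository and inspect the HTTP status code. This is a sketch under stated assumptions: hf_gated_status is a hypothetical helper, curl is installed, and the repository contains a config.json file on its main branch.

```shell
# Hypothetical helper: fetch a small file from a gated repo and print
# only the HTTP status code. Requires curl and HUGGING_FACE_HUB_TOKEN.
hf_gated_status() {
  repo="$1"   # an "owner/model" repo ID copied from the model page
  curl -s -L -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${HUGGING_FACE_HUB_TOKEN}" \
    "https://huggingface.co/${repo}/resolve/main/config.json"
}
```

For example, hf_gated_status meta-llama/Llama-3.2-1B-Instruct (the repo ID here is an assumption; copy the exact ID from the model page). A 200 suggests access works, a 401 typically points to a missing or invalid token, and a 403 typically means the conditions have not been accepted or the token lacks gated-repo read access.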
Token with write permissions
If you accidentally created a token with write access:
- Go to Settings → Access Tokens
- Revoke the old token
- Create a new fine-grained token with read-only access
- Update your environment variable
Security Best Practices
- Use fine-grained tokens with minimal permissions
- Create separate tokens for development and production
- Rotate tokens periodically
- Revoke tokens immediately if compromised
- Never use write-access tokens for public deployments