Prerequisites
- LM Studio installed on your system
- At least one model downloaded in LM Studio
- LM Studio’s local server enabled
Installation
If you haven’t installed LM Studio yet:
Download LM Studio
Visit lmstudio.ai and download the installer for your operating system.
Download a Model
Open LM Studio and browse the model library. Download a model suitable for your hardware:
- Small (3B-7B): Good for 8-16GB RAM
- Medium (13B): Requires 16-32GB RAM
- Large (30B+): Requires 32GB+ RAM
Popular starting points include:
- Llama 3.2 (1B or 3B)
- Mistral 7B
- Phi-3 (3.8B)
Enabling the API Server
LM Studio must have its local server running for Page Assist to connect:
Open Local Server Tab
In LM Studio, navigate to the “Local Server” or “Developer” tab (location varies by version).
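Once the server is started, you can verify it is reachable before involving Page Assist at all. The sketch below assumes LM Studio’s default address (`http://localhost:1234/v1`) and its OpenAI-compatible `/v1/models` route; it returns `None` rather than raising if the server isn’t up:

```python
import json
import urllib.request
import urllib.error

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default server address

def list_loaded_models(base_url=BASE_URL, timeout=3.0):
    """Return the model ids the server reports, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["id"] for m in data.get("data", [])]
    except (urllib.error.URLError, OSError):
        return None  # server not started, wrong port, or blocked by a firewall

print("LM Studio reachable:", list_loaded_models() is not None)
```

If this prints `False`, fix the server first; Page Assist will not see anything the script can’t.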
Connecting to Page Assist
Once LM Studio’s server is running, connect it to Page Assist:
Open Page Assist Settings
Click the Page Assist icon in your browser toolbar, then click the Settings icon. Add LM Studio as an OpenAI-compatible provider and set the base URL to the address shown in LM Studio’s server tab (http://localhost:1234/v1 by default).
Model Management
Automatic Model Detection
Page Assist automatically detects models that are currently loaded in LM Studio. The model must be loaded in LM Studio before it appears in Page Assist.
Switching Models
To use a different model:
- In LM Studio, stop the current model
- Load the desired model
- Refresh Page Assist or restart the chat
- The new model should appear in the model selector
Model Not Appearing
If a model doesn’t appear in Page Assist:
- Ensure the model is loaded in LM Studio (not just downloaded)
- Verify the API server is running
- Check the connection URL in Page Assist settings
- Try refreshing the Page Assist interface
Configuration Options
Custom Server Port
To change LM Studio’s server port:
- In LM Studio’s server settings, change the port number
- Update the URL in Page Assist to match the new port (e.g. http://localhost:1235/v1)
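The URL Page Assist needs is just the host and port plus the `/v1` prefix. A tiny illustrative helper makes the mapping explicit:

```python
def lmstudio_base_url(host="localhost", port=1234):
    """Build the base URL Page Assist should point at for a given LM Studio port."""
    return f"http://{host}:{port}/v1"

print(lmstudio_base_url())           # default install -> http://localhost:1234/v1
print(lmstudio_base_url(port=1235))  # after changing the port in LM Studio
```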
Server Authentication
LM Studio’s local server typically doesn’t require authentication. If you’ve configured API keys:
- In Page Assist provider settings, expand advanced options
- Enter your API key in the authentication field
- Save the configuration
Custom Headers
For advanced configurations requiring custom headers:
- In provider settings, find “Custom Headers” option
- Add required headers as key-value pairs
- Save your configuration
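Both the API key and any custom headers simply travel as HTTP headers on every request Page Assist makes to the server. A sketch of the resulting request, using a made-up key and a made-up header name for illustration:

```python
import urllib.request

def build_request(url, api_key=None, extra_headers=None):
    """Assemble a request carrying the auth and custom headers described above."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # API keys ride in a standard Bearer Authorization header
        headers["Authorization"] = f"Bearer {api_key}"
    headers.update(extra_headers or {})
    return urllib.request.Request(url, headers=headers)

req = build_request(
    "http://localhost:1234/v1/models",
    api_key="sk-example-key",                   # hypothetical key
    extra_headers={"X-Client": "page-assist"},  # hypothetical custom header
)
print(req.get_header("Authorization"))
```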
Troubleshooting
Connection Failed
If Page Assist can’t reach LM Studio, first confirm the server is actually running, then:
Test URL in Browser
Navigate to http://localhost:1234/v1/models in your browser. You should see a JSON response with model information.
Model Not Loaded
If you see error messages like “No model loaded”:
- Load a model in LM Studio by clicking on it
- Wait for the model to fully load into memory
- Verify “Model Loaded” appears in LM Studio
- Try sending a message again
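Once LM Studio shows the model as loaded, a direct chat-completion call is the quickest end-to-end test. This sketch only builds the request body; the model id is a placeholder (use one reported by /v1/models), and you would POST the result to http://localhost:1234/v1/chat/completions:

```python
import json

def chat_body(model, prompt, max_tokens=64):
    """Serialize a minimal OpenAI-style chat completion payload."""
    return json.dumps({
        "model": model,  # placeholder id: substitute a model listed by /v1/models
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

print(chat_body("llama-3.2-3b-instruct", "Reply with the single word: ready"))
```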
Slow Response Times
If responses are slow:
- Use a smaller model: Switch to a 3B or 7B model
- Adjust GPU layers: In LM Studio, increase GPU layer count if you have a GPU
- Check system resources: Close other applications to free up RAM
- Use quantized models: GGUF Q4 or Q5 quantizations are faster
Out of Memory Errors
- Use a smaller model
- Close other applications
- Reduce context length in LM Studio settings
- Restart LM Studio to clear memory
Performance Optimization
GPU Acceleration
If you have a compatible GPU:
- In LM Studio, go to settings
- Enable GPU acceleration
- Set GPU layer count (higher = more GPU usage)
- Reload your model
Context Length
Adjust context window size based on your needs:
- Shorter contexts (2048-4096): Faster, less memory
- Longer contexts (8192+): Slower, more memory, better for long conversations
Batch Size
In LM Studio settings:
- Increase batch size for faster generation (requires more memory)
- Decrease if you encounter memory issues
Multiple LM Studio Instances
You can run multiple LM Studio instances on different ports:
- Start first instance on default port 1234
- Start second instance on a different port (e.g., 1235)
- Add each as a separate provider in Page Assist
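In Page Assist, each instance then becomes its own provider entry, differing only in port. The entries below are purely illustrative (the names and field labels are not Page Assist’s actual configuration schema):

```python
# Two hypothetical provider entries, one per LM Studio instance.
providers = [
    {"name": "LM Studio - chat model", "base_url": "http://localhost:1234/v1"},
    {"name": "LM Studio - code model", "base_url": "http://localhost:1235/v1"},
]
for p in providers:
    print(f'{p["name"]}: {p["base_url"]}')
```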
Comparison with Ollama
| Feature | LM Studio | Ollama |
|---|---|---|
| UI | Graphical | Command-line |
| Model Management | Built-in browser | CLI commands |
| Ease of Use | Very Easy | Moderate |
| Model Format | GGUF | GGUF |
| API | OpenAI-compatible | Ollama & OpenAI-compatible |
| Platform | Windows, Mac, Linux | Windows, Mac, Linux |
Best Practices
- Keep LM Studio Updated: Regular updates improve performance and compatibility
- Monitor Memory: Watch RAM usage and choose models accordingly
- One Model at a Time: Load only one model at a time to save resources
- Persistent Server: Keep the API server running while using Page Assist
- Model Selection: Test different models to find the best balance of quality and speed
Next Steps
- Learn about Configuration Settings
- Explore Knowledge Base features
- Set up additional providers