LM Studio is a desktop application that lets you run large language models locally with an easy-to-use interface. Page Assist can connect to LM Studio through its OpenAI-compatible API server.

Prerequisites

  • LM Studio installed on your system
  • At least one model downloaded in LM Studio
  • LM Studio’s local server enabled

Installation

If you haven’t installed LM Studio yet:
1. Download LM Studio: Visit lmstudio.ai and download the installer for your operating system.

2. Install LM Studio: Run the installer and follow the installation instructions.

3. Download a Model: Open LM Studio and browse the model library. Download a model suitable for your hardware:
  • Small (3B-7B): Good for 8-16GB RAM
  • Medium (13B): Requires 16-32GB RAM
  • Large (30B+): Requires 32GB+ RAM
Popular choices:
  • Llama 3.2 (3B or 8B)
  • Mistral 7B
  • Phi-3 (3.8B)

4. Load the Model: In LM Studio, click a downloaded model to load it into memory.

Enabling the API Server

LM Studio must have its local server running for Page Assist to connect:
1. Open the Local Server Tab: In LM Studio, navigate to the “Local Server” or “Developer” tab (the location varies by version).

2. Start the Server: Click “Start Server”. The default address is:
  http://localhost:1234/v1

3. Verify the Server is Running: You should see a confirmation message that the server is running and listening on port 1234.
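Beyond the confirmation message in the UI, you can confirm the server is reachable from a script. Here is a minimal Python sketch, assuming the default port 1234, that queries the OpenAI-compatible `/v1/models` endpoint:

```python
import json
import urllib.error
import urllib.request

# Default LM Studio endpoint; change the port if you reconfigured the server.
BASE_URL = "http://localhost:1234/v1"

def list_models(base_url=BASE_URL):
    """Return the model IDs the server reports, or [] if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
            data = json.load(resp)
        return [m["id"] for m in data.get("data", [])]
    except (urllib.error.URLError, OSError):
        return []

if __name__ == "__main__":
    models = list_models()
    if models:
        print("Server is up; loaded models:", models)
    else:
        print("No response; is the LM Studio server running?")
```

An empty list means either the server is down or no model is loaded yet, which mirrors what Page Assist will see when it tries to fetch models.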

Connecting to Page Assist

Once LM Studio’s server is running, connect it to Page Assist:
1. Open Page Assist Settings: Click the Page Assist icon in your browser toolbar, then click the Settings icon.

2. Navigate to the OpenAI Compatible API Tab: Go to the “OpenAI Compatible API” tab.

3. Add Provider: Click the “Add Provider” button.

4. Select LM Studio: Choose “LMStudio” from the provider dropdown menu.

5. Enter the API URL: Enter the LM Studio API URL. The default is:
  http://localhost:1234/v1

6. Save the Configuration: Click “Save”. Page Assist will automatically fetch available models from LM Studio.
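Once the provider is saved, Page Assist talks to LM Studio over the standard OpenAI chat-completions API. You can exercise the same endpoint yourself; the sketch below assumes the default port, and the model name `"local-model"` is a placeholder (LM Studio versions differ in whether they ignore or validate the model field, so check yours):

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"

def build_chat_payload(prompt, model="local-model"):
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "model": model,  # placeholder; LM Studio routes to the loaded model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def chat(prompt, base_url=BASE_URL):
    """Send the prompt to LM Studio and return the assistant's reply."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

If this call works from a terminal but Page Assist still fails, the problem is in the extension's provider settings rather than the server.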

Model Management

Automatic Model Detection

Page Assist automatically detects models that are currently loaded in LM Studio; a model that is only downloaded, but not loaded into memory, will not appear.

Switching Models

To use a different model:
  1. In LM Studio, stop the current model
  2. Load the desired model
  3. Refresh Page Assist or restart the chat
  4. The new model should appear in the model selector

Model Not Appearing

If a model doesn’t appear in Page Assist:
  1. Ensure the model is loaded in LM Studio (not just downloaded)
  2. Verify the API server is running
  3. Check the connection URL in Page Assist settings
  4. Try refreshing the Page Assist interface

Configuration Options

Custom Server Port

To change LM Studio’s server port:
  1. In LM Studio’s server settings, change the port number
  2. Update the URL in Page Assist to match:
    http://localhost:[NEW_PORT]/v1
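The URL you paste into Page Assist is just host, port, and the `/v1` suffix; a tiny helper makes the pattern explicit (a sketch, nothing LM Studio-specific beyond the default port):

```python
def lmstudio_url(port=1234, host="localhost"):
    """Build the base URL Page Assist expects for a given LM Studio port."""
    return f"http://{host}:{port}/v1"

print(lmstudio_url(1235))  # http://localhost:1235/v1
```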
    

Server Authentication

LM Studio’s local server typically doesn’t require authentication. If you’ve configured API keys:
  1. In Page Assist provider settings, expand advanced options
  2. Enter your API key in the authentication field
  3. Save the configuration

Custom Headers

For advanced configurations requiring custom headers:
  1. In provider settings, find “Custom Headers” option
  2. Add required headers as key-value pairs
  3. Save your configuration
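If your setup does use an API key or extra headers, the same key-value pairs apply to direct API calls. Here is a sketch of attaching them to a request; the bearer-token form is the common OpenAI convention, not something LM Studio requires by default, and the header names are illustrative:

```python
import json
import urllib.request

def authed_request(url, payload, api_key=None, extra_headers=None):
    """Build a JSON POST request with optional bearer auth and custom headers."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # OpenAI-style bearer token; only needed if you configured a key.
        headers["Authorization"] = f"Bearer {api_key}"
    if extra_headers:
        headers.update(extra_headers)  # arbitrary key-value pairs
    return urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), headers=headers
    )
```

Page Assist sends the equivalent headers on every request once they are saved in the provider settings.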

Troubleshooting

Connection Failed

1. Verify the Server is Running: Check that LM Studio’s local server shows as “Running”.

2. Check the Port Number: Ensure the port in Page Assist matches LM Studio’s server port (default: 1234).

3. Test the URL in a Browser: Navigate to http://localhost:1234/v1/models in your browser. You should see a JSON response with model information.

4. Check Your Firewall: Ensure your firewall isn’t blocking localhost connections on port 1234.

5. Restart the LM Studio Server: Stop and restart the server in LM Studio.
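The first two checks can be scripted. This small sketch only tests whether anything is listening on the expected TCP port; it does not verify that the listener is actually LM Studio:

```python
import socket

def port_open(host="localhost", port=1234):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(2)
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    if port_open():
        print("Something is listening on port 1234.")
    else:
        print("Port 1234 is closed; start the server or check the port number.")
```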

Model Not Loaded

If you see error messages like “No model loaded”:
  1. Load a model in LM Studio by clicking on it
  2. Wait for the model to fully load into memory
  3. Verify “Model Loaded” appears in LM Studio
  4. Try sending a message again

Slow Response Times

If responses are slow:
  • Use a smaller model: Switch to a 3B or 7B model
  • Adjust GPU layers: In LM Studio, increase GPU layer count if you have a GPU
  • Check system resources: Close other applications to free up RAM
  • Use quantized models: GGUF Q4 or Q5 quantizations are faster

Out of Memory Errors

  1. Use a smaller model
  2. Close other applications
  3. Reduce context length in LM Studio settings
  4. Restart LM Studio to clear memory

Performance Optimization

GPU Acceleration

If you have a compatible GPU:
  1. In LM Studio, go to settings
  2. Enable GPU acceleration
  3. Set GPU layer count (higher = more GPU usage)
  4. Reload your model

Context Length

Adjust context window size based on your needs:
  • Shorter contexts (2048-4096): Faster, less memory
  • Longer contexts (8192+): Slower, more memory, better for long conversations

Batch Size

In LM Studio settings:
  • Increase batch size for faster generation (requires more memory)
  • Decrease if you encounter memory issues

Multiple LM Studio Instances

You can run multiple LM Studio instances on different ports:
  1. Start first instance on default port 1234
  2. Start second instance on a different port (e.g., 1235)
  3. Add each as a separate provider in Page Assist
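With several instances up, you can check which ports are serving models before adding them as providers. A sketch that probes a list of candidate ports via the `/v1/models` endpoint on each:

```python
import json
import urllib.error
import urllib.request

def probe_instances(ports=(1234, 1235)):
    """Map each responding port to the model IDs its server reports."""
    found = {}
    for port in ports:
        url = f"http://localhost:{port}/v1/models"
        try:
            with urllib.request.urlopen(url, timeout=3) as resp:
                data = json.load(resp)
            found[port] = [m["id"] for m in data.get("data", [])]
        except (urllib.error.URLError, OSError):
            continue  # nothing listening on this port
    return found

if __name__ == "__main__":
    print(probe_instances())
```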

Comparison with Ollama

| Feature          | LM Studio            | Ollama                     |
|------------------|----------------------|----------------------------|
| UI               | Graphical            | Command-line               |
| Model Management | Built-in browser     | CLI commands               |
| Ease of Use      | Very easy            | Moderate                   |
| Model Format     | GGUF                 | GGUF                       |
| API              | OpenAI-compatible    | Ollama & OpenAI-compatible |
| Platform         | Windows, Mac, Linux  | Windows, Mac, Linux        |

Best Practices

  1. Keep LM Studio Updated: Regular updates improve performance and compatibility
  2. Monitor Memory: Watch RAM usage and choose models accordingly
  3. One Model at a Time: Load only one model at a time to save resources
  4. Persistent Server: Keep the API server running while using Page Assist
  5. Model Selection: Test different models to find the best balance of quality and speed
