LM Studio is a desktop application for running large language models locally on your computer through an easy-to-use graphical interface. It's a good fit for users who want local AI with visual model management.

Overview

  • Type: Local provider
  • Cost: Free
  • API Key Required: No
  • Installation Required: Yes
  • Official Website: https://lmstudio.ai/

Prerequisites

1. Download LM Studio

Download and install LM Studio from lmstudio.ai for your operating system (Windows, macOS, or Linux).
2. Download a model

Open LM Studio and use the built-in model browser to download a model. Popular options include:
  • Llama 3.2 - Fast and capable
  • Mistral 7B - Excellent general purpose
  • Phi-3 - Small and efficient
  • Gemma 2 - Google’s open model
3. Load the model

In LM Studio, load the downloaded model by selecting it from your collection.
4. Start the server

Click the “Start Server” button in LM Studio to enable the local API server (default port: 1234).
LM Studio provides a user-friendly interface for downloading, managing, and running models without command-line knowledge.
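Once the server is started, you can confirm it is reachable from outside LM Studio. A minimal Python sketch, assuming the default port 1234 and the OpenAI-compatible `/v1/models` endpoint that LM Studio's local server exposes:

```python
import urllib.request

def server_running(base_url: str = "http://localhost:1234", timeout: float = 2.0) -> bool:
    """Return True if an OpenAI-compatible server answers at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, etc.
        return False
```

If this returns False, see the Troubleshooting section.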

Setup in AI Providers

1. Select LM Studio provider

In the AI Providers settings, click Create AI provider and select LM Studio as the provider type.
2. Configure provider URL

Set the Provider URL to http://localhost:1234/v1 (change the port if you configured LM Studio to use a different one).
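Only the port segment of the URL changes when you reconfigure LM Studio. A trivial helper, hypothetical and for illustration only:

```python
def provider_url(port: int = 1234, host: str = "localhost") -> str:
    """Build the OpenAI-compatible base URL for a local LM Studio server."""
    return f"http://{host}:{port}/v1"
```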
3. Select model

Click the refresh button to fetch the currently loaded model. The model name will appear automatically.
4. Test the provider

Click Test to verify the connection is working.
Make sure LM Studio’s server is running before using it with AI Providers. The server must be started manually each time you open LM Studio.
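The refresh and test steps amount to fetching `/v1/models` from the server. Assuming LM Studio returns the OpenAI-style list shape, the loaded model IDs can be pulled out like this (a sketch, not the plugin's actual code):

```python
def model_ids(models_response: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style /v1/models response."""
    return [m["id"] for m in models_response.get("data", [])]

# Illustrative response shape (the model ID is hypothetical):
sample = {"object": "list", "data": [{"id": "llama-3.2-3b-instruct"}]}
```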
Recommended Models

| Model Family | Size | RAM Required | Best For |
| --- | --- | --- | --- |
| Llama 3.2 | 3B | ~4GB | Fast, balanced performance |
| Phi-3 Mini | 3.8B | ~4GB | Efficient, great quality |
| Gemma 2 | 2B-9B | 4-12GB | Google quality, various sizes |
| Mistral | 7B | ~8GB | High quality general purpose |
| Llama 3.1 | 8B | ~10GB | Excellent instruction following |
| Qwen 2.5 | 7B | ~8GB | Strong multilingual support |
LM Studio shows you model sizes and estimated RAM requirements when browsing models, making it easy to choose one that fits your hardware.
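The table's RAM figures can double as a quick hardware check. A small sketch with the names and figures copied from the table; the 2 GB headroom for the OS and other apps is an assumption, not an LM Studio rule:

```python
# Approximate RAM needs (GB) from the table above; Gemma 2 uses its smallest variant.
MODEL_RAM_GB = {
    "Llama 3.2 (3B)": 4,
    "Phi-3 Mini (3.8B)": 4,
    "Gemma 2 (2B-9B)": 4,
    "Mistral (7B)": 8,
    "Llama 3.1 (8B)": 10,
    "Qwen 2.5 (7B)": 8,
}

def models_that_fit(available_ram_gb: float, headroom_gb: float = 2) -> list[str]:
    """Return models whose estimated RAM need leaves some headroom, sorted by name."""
    return sorted(
        name for name, need in MODEL_RAM_GB.items()
        if need + headroom_gb <= available_ram_gb
    )
```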

Key Features

Graphical Model Management

  • Browse and download models from Hugging Face
  • Visual interface for model selection
  • Easy model updates and management
  • Model performance metrics

Quantization Options

LM Studio offers models in different quantization levels:
  • Q4 - Smaller file size, faster, slightly lower quality
  • Q5 - Balanced option
  • Q8 - Higher quality, larger file size
  • FP16 - Full precision, best quality, largest size
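As a rough rule of thumb, a quantized model's file size is about parameters × bits-per-weight ÷ 8; real GGUF files run somewhat larger because of metadata and mixed-precision layers. A quick sketch of the arithmetic:

```python
def approx_model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough file-size estimate: parameters * bits / 8 bits-per-byte.

    Ignores metadata overhead, so treat the result as a lower bound.
    """
    return params_billion * bits_per_weight / 8

# A 7B model: Q4 ~= 3.5 GB, FP16 ~= 14 GB
```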

Hardware Acceleration

  • Automatic GPU detection and usage
  • CPU-only mode for systems without compatible GPUs
  • Metal support for Apple Silicon Macs
  • CUDA support for NVIDIA GPUs

Chat Interface

Test models directly in LM Studio before using them with Obsidian:
  • Interactive chat interface
  • Prompt templates
  • Parameter tuning
  • Performance monitoring

Troubleshooting

Server Not Running

If you can’t connect to LM Studio:
  1. Open LM Studio application
  2. Load a model from your collection
  3. Click “Start Server” in the Local Server tab
  4. Verify the port number matches your AI Providers settings

Model Not Loading

If the model fails to load:
  1. Check you have enough RAM available
  2. Try a smaller model or lower quantization
  3. Close other applications to free up memory
  4. Check LM Studio logs for error messages

Slow Performance

Performance depends on your hardware:
  • With GPU: Much faster, can handle larger models
  • CPU only: Slower, stick to smaller models (3B-7B)
  • RAM: More RAM allows larger models and longer contexts
To improve performance:
  1. Use a smaller model
  2. Enable GPU acceleration if available
  3. Choose a lower quantization (Q4 instead of Q8)
  4. Reduce context length settings

Connection Refused

If AI Providers can’t connect:
  1. Verify LM Studio server is running (green indicator)
  2. Check the port number in both LM Studio and AI Providers
  3. Ensure no firewall is blocking localhost connections
  4. Try visiting http://localhost:1234 in your browser
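The browser check in step 4 can also be done programmatically; a plain TCP probe distinguishes "nothing is listening on that port" from an HTTP-level problem (a sketch, default port assumed):

```python
import socket

def port_open(host: str = "localhost", port: int = 1234, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```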

Advanced Configuration

Custom Port

To use a different port:
  1. In LM Studio, go to Settings → Server
  2. Change the port number
  3. Update the Provider URL in AI Providers accordingly

Model Parameters

Adjust these in LM Studio's interface:
  • Context Length - How much text the model can process at once
  • Temperature - Controls randomness (0.0-2.0)
  • Top P - Nucleus sampling parameter
  • Repeat Penalty - Reduces repetition
  • GPU Layers - How many layers to offload to the GPU
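Temperature and Top P set in LM Studio act as server-side defaults; an OpenAI-compatible client can also pass them per request. A hypothetical payload builder, with field names following the OpenAI chat-completions schema (GPU layers and context length are load-time settings and cannot be sent this way):

```python
def chat_payload(model: str, prompt: str, temperature: float = 0.7,
                 top_p: float = 0.9, max_tokens: int = 256) -> dict:
    """Build a chat-completions request body for an OpenAI-compatible server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }
```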

Prompt Templates

LM Studio supports various prompt templates:
  • ChatML
  • Llama
  • Alpaca
  • Vicuna
  • Custom formats
The correct template is usually auto-detected based on the model.

Best Practices

  1. Match model to hardware: Choose model size based on your RAM
  2. Use GPU if available: Dramatically faster than CPU-only
  3. Start small: Test with smaller models first
  4. Keep server running: More convenient than starting/stopping
  5. Update models: Check for updated versions periodically
  6. Monitor performance: Use LM Studio’s built-in metrics

Comparison with Ollama

LM Studio advantages:
  • Graphical interface
  • Easier for beginners
  • Visual model browsing
  • Built-in chat interface
Ollama advantages:
  • Command-line efficiency
  • Better for automation
  • Lower overhead
  • More scripting options
Both LM Studio and Ollama are excellent choices for local AI. Choose based on your preference for graphical (LM Studio) vs command-line (Ollama) interfaces.

Privacy and Offline Use

LM Studio offers complete privacy:
  • All processing happens locally
  • No data sent to external servers
  • Works completely offline
  • No API keys or accounts needed
Perfect for:
  • Sensitive documents
  • Offline environments
  • Privacy-conscious users
  • Air-gapped systems
