NoteWise routes all LLM calls through LiteLLM, which means any model string that LiteLLM supports will work — not just the providers listed here. The table below covers the eight natively supported providers with first-class API key handling.
The default model is gemini/gemini-2.5-flash. Google Gemini has a free tier, so it’s the easiest way to get started without a credit card. Grab a key at aistudio.google.com.

Supported providers

Provider       Env var            Example model string
Google Gemini  GEMINI_API_KEY     gemini/gemini-2.5-flash
OpenAI         OPENAI_API_KEY     gpt-4o
Anthropic      ANTHROPIC_API_KEY  claude-3-5-sonnet-20241022
Groq           GROQ_API_KEY       groq/llama3-70b-8192
xAI            XAI_API_KEY        xai/grok-2
Mistral        MISTRAL_API_KEY    mistral/mistral-large-latest
Cohere         COHERE_API_KEY     command-r-plus
DeepSeek       DEEPSEEK_API_KEY   deepseek/deepseek-chat

Setting your API key
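Set the env var for your chosen provider before running NoteWise, either in your shell or in config.env. The key value below is a placeholder, and the config.env location may vary by install, so treat the file path as an assumption:

```shell
# Option 1: export in your shell (add to ~/.bashrc or ~/.zshrc to persist)
export GEMINI_API_KEY="your-key-here"

# Option 2: append to NoteWise's config.env (location may vary by install)
echo 'GEMINI_API_KEY=your-key-here' >> config.env
```

Swap in OPENAI_API_KEY, ANTHROPIC_API_KEY, etc. for other providers, per the table above.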

Overriding the model per run

Use --model (or -m) to change the model for a single invocation without touching your config:
notewise process "URL" --model claude-3-5-sonnet-20241022
notewise process "URL" --model gpt-4o
notewise process "URL" --model groq/llama3-70b-8192
The corresponding API key must already be set in the environment or config.env; --model only changes which model is used, not which key is read.

LiteLLM model string format

LiteLLM uses a provider/model-name convention for most providers. The prefix tells LiteLLM which API endpoint and authentication header to use:
gemini/gemini-2.5-flash      ← Google Gemini
groq/llama3-70b-8192         ← Groq
xai/grok-2                   ← xAI
mistral/mistral-large-latest ← Mistral
deepseek/deepseek-chat       ← DeepSeek
Some providers (OpenAI, Anthropic, Cohere) use unprefixed model names:
gpt-4o
claude-3-5-sonnet-20241022
command-r-plus
NoteWise detects the provider from the model string prefix (or the model name itself for OpenAI/Anthropic/Cohere) and determines which API key env var to validate. If you use a model string that doesn’t match any known pattern — for example, a locally-hosted Ollama model — NoteWise will still pass it straight to LiteLLM; no API key check is performed.

Provider details

Google Gemini

Default provider. Gemini 2.5 Flash offers a generous free tier and is fast enough for real-time note generation on most videos.
Setting        Value
Env var        GEMINI_API_KEY
Default model  gemini/gemini-2.5-flash
Key source     aistudio.google.com/app/apikey
Other available Gemini models (from the setup wizard):
gemini/gemini-2.5-flash
gemini/gemini-2.5-flash-lite
gemini/gemini-2.5-pro

OpenAI

Setting        Value
Env var        OPENAI_API_KEY
Example model  gpt-4o
Key source     platform.openai.com/api-keys
OpenAI reasoning models (o1, o3, o4 series) are also supported. Note that reasoning models do not accept a temperature parameter; NoteWise passes the configured temperature for all models, so you may see a warning from LiteLLM if you use a reasoning model with a non-default temperature.

Available models include:
gpt-4o-mini
gpt-4o
o3-mini

Anthropic

Setting        Value
Env var        ANTHROPIC_API_KEY
Example model  claude-3-5-sonnet-20241022
Key source     console.anthropic.com/settings/keys
Claude models accept max_tokens and temperature as normal. If MAX_TOKENS is not set in config, NoteWise lets LiteLLM use the model’s default.

Available models include:
claude-haiku-4-5-20251001
claude-sonnet-4-5-20250929
claude-opus-4-20250514

Groq

Setting        Value
Env var        GROQ_API_KEY
Example model  groq/llama3-70b-8192
Key source     console.groq.com/keys
Groq’s inference API is very fast and well-suited to long transcripts. Model strings must include the groq/ prefix.

Available models include:
groq/llama-3.1-8b-instant
groq/llama-3.3-70b-versatile
groq/meta-llama/llama-4-scout-17b-16e-instruct

xAI

Setting        Value
Env var        XAI_API_KEY
Example model  xai/grok-2
Key source     console.x.ai
Model strings must include the xai/ prefix.

Available models include:
xai/grok-3
xai/grok-3-mini-latest
xai/grok-4-0709

Mistral

Setting        Value
Env var        MISTRAL_API_KEY
Example model  mistral/mistral-large-latest
Key source     console.mistral.ai/api-keys
Model strings must include the mistral/ prefix.

Available models include:
mistral/mistral-small-latest
mistral/mistral-medium-latest
mistral/mistral-large-latest

Cohere

Setting        Value
Env var        COHERE_API_KEY
Example model  command-r-plus
Key source     dashboard.cohere.com/api-keys
Cohere models use unprefixed names (no cohere/ prefix required).

Available models include:
command-a-03-2025
command-r-plus-08-2024
command-r-08-2024

DeepSeek

Setting        Value
Env var        DEEPSEEK_API_KEY
Example model  deepseek/deepseek-chat
Key source     platform.deepseek.com/api_keys
Model strings must include the deepseek/ prefix.

Available models include:
deepseek/deepseek-chat
deepseek/deepseek-v3
deepseek/deepseek-reasoner

Generation parameters

Two parameters affect LLM output quality and length, and can be set globally in config or overridden per run:
Parameter    Config key   CLI flag            Default        Range
Temperature  TEMPERATURE  --temperature / -t  0.7            0.0–1.0
Max tokens   MAX_TOKENS   --max-tokens / -k   model default  provider-dependent
TEMPERATURE controls how creative or deterministic the output is. Lower values (closer to 0.0) produce more consistent, factual notes; higher values (closer to 1.0) produce more varied phrasing.

MAX_TOKENS caps the length of each LLM response. If not set, NoteWise lets LiteLLM use the model’s built-in default. For long chapters, a low MAX_TOKENS may cause notes to be truncated, so leave this unset unless you have a specific reason to cap output length.
Some providers (such as OpenAI’s o-series reasoning models) do not support the temperature parameter. Sending it will cause an API error. If you use those models, set TEMPERATURE=1.0 in your config as a workaround, or check the provider’s documentation for the correct value.

Using any LiteLLM-compatible model

Because NoteWise passes model strings directly to LiteLLM, you can use any model that LiteLLM supports beyond the eight providers above — including locally-hosted models via Ollama:
notewise process "URL" --model ollama/llama3
For models outside the native provider list, set the required API key or base URL in the environment according to LiteLLM’s documentation. NoteWise will not validate the key or warn if it is missing for unknown providers.
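For a local Ollama server, that typically means setting a base URL rather than an API key. The OLLAMA_API_BASE variable and default port below follow LiteLLM's Ollama conventions; verify them against the LiteLLM documentation for your version:

```shell
# Point LiteLLM at a local Ollama server (Ollama's default port is 11434).
# OLLAMA_API_BASE follows LiteLLM's convention; check LiteLLM's Ollama docs.
export OLLAMA_API_BASE="http://localhost:11434"

# Then run NoteWise as usual:
#   notewise process "URL" --model ollama/llama3
```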
