nrvna-ai is configured entirely through environment variables. Set these before starting the daemon to customize behavior.

## Core Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `NRVNA_WORKERS` | `4` | Number of worker threads for parallel inference |
| `NRVNA_LOG_LEVEL` | `info` | Log verbosity level (see Logging) |
| `NRVNA_MODELS_DIR` | `./models/` | Directory path to search for model files |
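As an illustration, a setup that keeps models in a shared directory and raises the worker count might look like the following (the `/srv/models` path is hypothetical; substitute your own):

```shell
# Point the daemon at a shared model directory and use more workers
export NRVNA_MODELS_DIR=/srv/models
export NRVNA_WORKERS=8

nrvnad model.gguf workspace
```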

## Model Parameters

See Model Parameters for detailed explanations.

| Variable | Default | Description |
| --- | --- | --- |
| `NRVNA_GPU_LAYERS` | `99` (Mac) / `0` (other) | Number of model layers to offload to the GPU |
| `NRVNA_PREDICT` | `2048` | Maximum number of tokens to generate |
| `NRVNA_MAX_CTX` | `8192` | Context window size (maximum tokens the model can process) |
| `NRVNA_TEMP` | `0.8` | Sampling temperature (0.0 to 2.0) |
| `NRVNA_TOP_K` | `40` | Top-K sampling parameter |
| `NRVNA_TOP_P` | `0.9` | Top-P (nucleus) sampling parameter |
| `NRVNA_MIN_P` | `0.05` | Minimum probability threshold |
| `NRVNA_REPEAT_PENALTY` | `1.1` | Penalty for repeating tokens |
| `NRVNA_REPEAT_LAST_N` | `64` | Number of previous tokens considered for the repeat penalty |
| `NRVNA_SEED` | `0` | Random seed for reproducibility (`0` = random) |
| `NRVNA_BATCH` | `2048` | Batch size for token processing |
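Because `NRVNA_SEED=0` means a random seed, reproducible runs need a fixed non-zero seed. One sketch of a deterministic setup, combining a pinned seed with low-temperature sampling (the specific values here are illustrative, not recommendations):

```shell
# Reproducible generation: pin the seed and reduce sampling randomness
export NRVNA_SEED=42    # any non-zero value fixes the RNG
export NRVNA_TEMP=0.0   # 0.0 is the low end of the documented 0.0-2.0 range

nrvnad model.gguf workspace
```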

## llama.cpp Logging

| Variable | Default | Description |
| --- | --- | --- |
| `LLAMA_LOG_LEVEL` | - | llama.cpp log level: `error`, `warn`, `info`, `debug` |

## Example Configuration

```bash
# High-performance setup with GPU acceleration
export NRVNA_WORKERS=8
export NRVNA_GPU_LAYERS=99
export NRVNA_MAX_CTX=16384
export NRVNA_PREDICT=4096
export NRVNA_LOG_LEVEL=info

# Start daemon
nrvnad model.gguf workspace
```

```bash
# Conservative setup for CPU-only inference
export NRVNA_WORKERS=2
export NRVNA_GPU_LAYERS=0
export NRVNA_MAX_CTX=4096
export NRVNA_PREDICT=1024
export NRVNA_TEMP=0.7

nrvnad model.gguf workspace
```

```bash
# Debugging configuration
export NRVNA_LOG_LEVEL=debug
export LLAMA_LOG_LEVEL=info

nrvnad model.gguf workspace
```
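Since every setting is a plain environment variable, standard shell tools can show exactly what the daemon will inherit before you start it:

```shell
# List every NRVNA_* variable currently set in this shell
env | grep '^NRVNA_'
```

Running this before `nrvnad` is a quick way to catch a variable that was set in one terminal but not exported in the one launching the daemon.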
