Decensor your first language model with Heretic in minutes
This guide will walk you through decensoring your first language model using Heretic. The entire process is automatic and requires just a single command.
To decensor a model, simply run Heretic with the model name from Hugging Face:
heretic Qwen/Qwen3-4B-Instruct-2507
You can use any model identifier from Hugging Face, or a local path to a model directory.
Heretic will automatically download the model if it’s not already cached locally.
2. System Benchmarking
Heretic first detects your hardware and automatically determines the optimal batch size:
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.2.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic

Detected 1 CUDA device(s) (24.00 GB total VRAM):
* GPU 0: NVIDIA GeForce RTX 3090 (24.00 GB)

Loading model Qwen/Qwen3-4B-Instruct-2507...
* Memory usage: 8.2 GB

Determining optimal batch size...
* Trying batch size 1... Ok (142 tokens/s)
* Trying batch size 2... Ok (267 tokens/s)
* Trying batch size 4... Ok (489 tokens/s)
* Trying batch size 8... Ok (612 tokens/s)
* Trying batch size 16... Failed (CUDA out of memory)
* Chosen batch size: 8
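In this example, batch size 8 gives the highest throughput that still fits in the GPU's 24 GB of VRAM, so Heretic uses it for the rest of the run. You do not need to tune the batch size by hand.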
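The search pattern visible in the sample log (1, 2, 4, 8, then a failure at 16) can be sketched as a simple doubling loop. This is a minimal illustration rather than Heretic's actual code: `choose_batch_size`, the `measure_throughput` callback, and the `max_batch_size` limit are all hypothetical names introduced here.

```python
from typing import Callable, Optional


def choose_batch_size(
    measure_throughput: Callable[[int], Optional[float]],
    max_batch_size: int = 128,
) -> int:
    """Double the batch size until a trial fails, then return the fastest one.

    measure_throughput is a hypothetical callback: it returns tokens per
    second for a batch size, or None if the trial failed (e.g. out of memory).
    """
    best_size, best_speed = 1, 0.0
    batch_size = 1
    while batch_size <= max_batch_size:
        speed = measure_throughput(batch_size)
        if speed is None:
            break  # assume larger batches will not fit either
        if speed > best_speed:
            best_size, best_speed = batch_size, speed
        batch_size *= 2
    return best_size


# Throughput figures copied from the sample log; batch size 16 is absent
# from the dict, so .get() returns None and stands in for the OOM failure.
sample_log = {1: 142.0, 2: 267.0, 4: 489.0, 8: 612.0}
chosen = choose_batch_size(sample_log.get)
print(chosen)
```

Tracking the best throughput, rather than simply the largest size that fits, means a larger but slower batch size would not be chosen.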
3. Optimization Process
Heretic now runs parameter optimization trials (default: 200 trials) to find the best abliteration parameters:
Loading good prompts from mlabonne/harmless_alpaca...
* 400 prompts loaded
Loading bad prompts from mlabonne/harmful_behaviors...
* 400 prompts loaded
Calculating per-layer refusal directions...
* Obtaining residuals for good prompts...
* Obtaining residuals for bad prompts...
Running trial 1 of 200...
* Parameters:
  * direction_scope = global
  * direction_index = 15.3
  * attn_out.max_weight = 1.12
  * attn_out.max_weight_position = 22.4
  * mlp_down.max_weight = 0.94
  ...
* Resetting model...
* Abliterating...
* Evaluating...
* Score: 0.2341 (Refusals: 12/100, KL divergence: 0.2341)
Elapsed time: 2m 15s
Estimated remaining time: 7h 28m
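To give a rough idea of what "calculating refusal directions" and "abliterating" can mean, here is a toy sketch of difference-of-means directional ablation in plain Python. It is an assumption about the general technique, not a description of Heretic's code: the real tool works on model weights with per-layer parameters such as `attn_out.max_weight`, while the function names and the three-dimensional "residuals" below are invented for illustration.

```python
import math
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def refusal_direction(good: List[Vector], bad: List[Vector]) -> Vector:
    """Unit vector pointing from the mean 'good' residual to the mean 'bad' one."""
    diff = [b - g for g, b in zip(mean_vector(good), mean_vector(bad))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def ablate(vector: Vector, direction: Vector, weight: float = 1.0) -> Vector:
    """Subtract `weight` times the component of `vector` along `direction`."""
    dot = sum(v * d for v, d in zip(vector, direction))
    return [v - weight * dot * d for v, d in zip(vector, direction)]


# Toy residuals with made-up numbers (not real activations): the two groups
# differ only along the third axis.
good_residuals = [[1.0, 0.0, 0.0], [1.0, 2.0, 0.0]]
bad_residuals = [[1.0, 0.0, 4.0], [1.0, 2.0, 4.0]]

direction = refusal_direction(good_residuals, bad_residuals)
cleaned = ablate([0.5, 0.5, 3.0], direction)
print(direction, cleaned)
```

With `weight=1.0` the component along the direction is removed entirely; a smaller weight removes only part of it, which is one way to read the weight parameters that each trial varies.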
You can interrupt the optimization at any time with Ctrl+C. Heretic saves progress and you can continue later.
4. Select Best Result
After optimization completes, Heretic presents you with Pareto optimal results:
Optimization finished!

The following trials resulted in Pareto optimal combinations of refusals
and KL divergence. After selecting a trial, you will be able to save the
model, upload it to Hugging Face, or chat with it to test how well it works.

Which trial do you want to use?
> [Trial 87] Refusals: 2/100, KL divergence: 0.1847
  [Trial 142] Refusals: 3/100, KL divergence: 0.0923
  [Trial 178] Refusals: 4/100, KL divergence: 0.0451
  Run additional trials
  Exit program
Select a trial that balances refusal suppression with capability preservation. Lower KL divergence means less damage to the original model.
KL divergence values above 1.0 usually indicate significant damage to the model’s capabilities.
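A trial is Pareto optimal when no other trial is at least as good on both metrics and strictly better on one. The sketch below filters a list of trials down to that front and then picks the trial with the fewest refusals under a KL divergence budget. The `Trial` type, both function names, and the budget of 0.1 are illustrative assumptions; the numbers are taken from the sample output in this guide.

```python
from typing import List, NamedTuple, Optional


class Trial(NamedTuple):
    number: int
    refusals: int
    kl_divergence: float


def pareto_front(trials: List[Trial]) -> List[Trial]:
    """Keep trials that no other trial dominates (lower is better on both)."""
    front = []
    for t in trials:
        dominated = any(
            o.refusals <= t.refusals
            and o.kl_divergence <= t.kl_divergence
            and (o.refusals < t.refusals or o.kl_divergence < t.kl_divergence)
            for o in trials
        )
        if not dominated:
            front.append(t)
    return sorted(front, key=lambda t: t.refusals)


def pick_trial(front: List[Trial], max_kl: float) -> Optional[Trial]:
    """Fewest refusals among trials whose KL divergence stays within max_kl."""
    candidates = [t for t in front if t.kl_divergence <= max_kl]
    return min(candidates, key=lambda t: t.refusals, default=None)


trials = [
    Trial(87, 2, 0.1847),   # from the sample trial menu
    Trial(142, 3, 0.0923),  # from the sample trial menu
    Trial(178, 4, 0.0451),  # from the sample trial menu
    Trial(1, 12, 0.2341),   # first trial from the sample log; dominated
]
front = pareto_front(trials)
choice = pick_trial(front, max_kl=0.1)  # 0.1 is an arbitrary example budget
print([t.number for t in front], choice)
```

With this budget, Trial 87 is excluded despite having the fewest refusals, because its KL divergence exceeds 0.1. Raising or lowering `max_kl` is one way to make the trade-off between refusal suppression and capability preservation explicit.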
5. Export or Test Model
After selecting a trial, choose what to do with the decensored model:
What do you want to do with the decensored model?
> Save the model to a local folder
  Upload the model to Hugging Face
  Chat with the model
  Return to the trial selection menu
Options:
Save locally: Export the model to a directory for later use
Upload to HF: Publish your decensored model on Hugging Face
# Provide your HF token when prompted
Hugging Face access token: [enter token]
Logged in as John Doe ([email protected])
Name of repository: username/qwen3-4b-heretic
Should the repository be public or private?
> Public
  Private
Uploading merged model...
Model uploaded to username/qwen3-4b-heretic.
Heretic automatically adds appropriate tags (heretic, uncensored, abliterated) and prepends performance metrics to the model card.
Chat with the model: Talk to the decensored model interactively to test how well it works
Press Ctrl+C at any time to return to the menu.
> User: Tell me about artificial intelligence
Assistant: Artificial intelligence (AI) refers to computer systems that can perform
tasks that typically require human intelligence, such as visual perception, speech
recognition, decision-making, and language translation...
> User: [Press Ctrl+C to exit]
The chat feature uses the same system prompt configured in Heretic (default: “You are a helpful assistant.”).
Heretic automatically saves optimization progress to the checkpoints/ directory. If a run is interrupted, Heretic will detect the checkpoint and ask if you want to continue:
You have already processed this model, but the run was interrupted.You can continue the previous run from where it stopped.How would you like to proceed?> Continue the previous run Ignore the previous run and start from scratch Exit program
Select “Continue the previous run” to resume optimization from where it stopped.
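The general idea behind resumable runs can be sketched as a loop that persists results after every trial and, on startup, skips whatever has already been recorded. This is only an illustration of the pattern under stated assumptions: the JSON file, the `run_trials` function, and the `stop_after` switch that simulates an interruption are hypothetical and say nothing about the format Heretic actually uses inside checkpoints/.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def run_trials(
    checkpoint: Path,
    total_trials: int,
    run_trial: Callable[[int], float],
    stop_after: int = 0,
) -> List[float]:
    """Run trials, saving after each one; resume if the checkpoint exists.

    stop_after simulates an interruption after that many new trials
    (0 means run to completion).
    """
    scores: List[float] = []
    if checkpoint.exists():
        scores = json.loads(checkpoint.read_text())  # restore earlier progress
    completed_now = 0
    for index in range(len(scores), total_trials):
        scores.append(run_trial(index))
        checkpoint.write_text(json.dumps(scores))  # persist after every trial
        completed_now += 1
        if stop_after and completed_now >= stop_after:
            break
    return scores


def fake_trial(index: int) -> float:
    """Placeholder score; a real trial would evaluate a model."""
    return 1.0 / (index + 1)


with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "progress.json"
    first = run_trials(path, 5, fake_trial, stop_after=2)  # "interrupted" run
    resumed = run_trials(path, 5, fake_trial)              # continues at trial 3
    print(len(first), len(resumed))
```

Saving after each trial means an interruption costs at most the trial that was in flight.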