Heretic

Fully automatic censorship removal for language models using directional ablation and parameter optimization. Remove safety alignment from transformer models without expensive post-training.

Get Started

Remove censorship from any language model in minutes.

Installation

Install Heretic and set up your environment

Quickstart

Decensor your first model in under 5 minutes

How It Works

Learn about directional ablation and optimization

CLI Reference

Complete command-line interface documentation

Key Features

Fully Automatic

No manual tuning required. Heretic automatically finds optimal abliteration parameters using TPE-based optimization.

Quality Preserved

Keeps KL divergence from the original model low, preserving model intelligence while removing refusals.

Quantization Support

Run on consumer hardware with bitsandbytes 4-bit quantization support.

Built-in Evaluation

Evaluate models with refusal counting and KL divergence metrics out of the box.

Research Tools

Advanced residual vector analysis and PaCMAP projections for interpretability research.

Hugging Face Integration

Seamlessly upload and share your models on Hugging Face Hub.
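
The two evaluation metrics named under Built-in Evaluation can be illustrated with a minimal, standard-library sketch. This is not Heretic's implementation: the function names, the direction of the KL divergence (original model relative to modified model), and the refusal marker phrases are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two distributions over the same vocabulary.

    Assumption: p is the original model's next-token distribution and
    q is the modified model's. The source does not state the direction.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def count_refusals(responses, markers=("i can't", "i cannot", "i'm sorry")):
    """Count responses containing any marker phrase (hypothetical marker list)."""
    return sum(any(m in r.lower() for m in markers) for r in responses)

# Toy next-token logits for the same prompt from two models.
original = softmax([2.0, 1.0, 0.1])
modified = softmax([1.8, 1.1, 0.3])

print(kl_divergence(original, original))  # identical distributions give 0.0
print(kl_divergence(original, modified) > 0)
print(count_refusals(["I'm sorry, but I can't help.", "Sure, here you go."]))
```

A lower KL divergence means the modified model's output distribution stays closer to the original's, which is why it serves as a proxy for preserved capability.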

Proven Results

Heretic produces decensored models that rival manually created abliterations while preserving more of the original model’s capabilities.

Example: Gemma-3-12B-IT
  • Refusals: 3/100 (same as manual abliterations)
  • KL Divergence: 0.16 (vs. 1.04 for manual methods)
  • Result: Same refusal suppression with about 85% lower KL divergence from the original model, a proxy for less damage to its capabilities
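
The 85% figure follows from the two KL divergence values listed above. A short check of the arithmetic (variable names are illustrative):

```python
# KL divergence values quoted for Gemma-3-12B-IT
manual_kl = 1.04   # manual abliterations
heretic_kl = 0.16  # Heretic

# Relative reduction in KL divergence
reduction = (manual_kl - heretic_kl) / manual_kl
print(f"{reduction:.1%}")  # 84.6%
```

Rounded to the nearest whole percent, this gives the 85% quoted in the result.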

Quick Example

# Install Heretic
pip install -U heretic-llm

# Decensor a model (fully automatic)
heretic Qwen/Qwen3-4B-Instruct-2507

# With quantization for lower VRAM usage
heretic --quantization bnb_4bit meta-llama/Llama-3.1-8B-Instruct

What You Can Do

Remove Safety Alignment

Strip censorship from instruction-tuned models without retraining

Optimize Parameters

Automatically find the best ablation weights for your model

Evaluate Models

Measure refusal rates and KL divergence from baseline

Analyze Internals

Visualize residual vectors and refusal directions

Community

GitHub Repository

View source code, report issues, and contribute

Hugging Face Models

Browse 1,000+ community models created with Heretic
