
Quick Start Guide

Learn how to compute perceptual hashes and compare images using imghash.

Choose an Algorithm

If you’re unsure which hash to pick, start with PDQ. It’s Facebook’s production-grade algorithm designed for robust content moderation and duplicate detection.
PDQ (a Perceptual hash built on a Discrete cosine transform with a Quality metric) is ideal for:
  • Content moderation and abuse detection
  • Near-duplicate image detection
  • Robustness to JPEG compression, rescaling, and minor edits
  • Large-scale image deduplication

Basic Example: Compare Two Images

Here’s a complete working example using the PDQ algorithm:
package main

import (
    "fmt"
    "github.com/ajdnik/imghash/v2"
)

func main() {
    // Create a PDQ hasher
    pdq, err := imghash.NewPDQ()
    if err != nil {
        panic(err)
    }

    // Hash two image files
    h1, err := imghash.HashFile(pdq, "image1.png")
    if err != nil {
        panic(err)
    }

    h2, err := imghash.HashFile(pdq, "image2.png")
    if err != nil {
        panic(err)
    }

    // Compare the hashes
    dist, err := pdq.Compare(h1, h2)
    if err != nil {
        panic(err)
    }

    fmt.Printf("Distance: %v\n", dist)
    
    // Interpret the result
    if dist < 32 {
        fmt.Println("Images are very similar or near-duplicates")
    } else if dist < 64 {
        fmt.Println("Images are somewhat similar")
    } else {
        fmt.Println("Images are different")
    }
}

Step-by-Step Breakdown

1. Create a hasher

Instantiate the hash algorithm you want to use:
pdq, err := imghash.NewPDQ()
if err != nil {
    panic(err)
}
PDQ uses sensible defaults (256-bit binary hash, bilinear interpolation). You can customize behavior with options:
pdq, err := imghash.NewPDQ(
    imghash.WithPDQInterpolation(imghash.Bicubic),
)

2. Hash your images

Use the convenience function HashFile() to hash image files:
h1, err := imghash.HashFile(pdq, "image1.png")
if err != nil {
    panic(err)
}
The HashFile() function:
  • Opens the image file
  • Decodes it (supports JPEG, PNG, GIF)
  • Computes the perceptual hash
  • Returns a Hash interface
For PDQ, this returns a Binary hash of 256 bits (32 bytes).

3. Compare the hashes

Use the algorithm’s Compare() method to measure similarity:
dist, err := pdq.Compare(h1, h2)
if err != nil {
    panic(err)
}
fmt.Printf("Distance: %v\n", dist)
For PDQ (and most binary hashes), this computes Hamming distance - the number of differing bits.
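  • Distance = 0 - Identical hashes (the images are identical or indistinguishable to the algorithm)
  • Distance < 32 - Very similar or near-duplicates (for 256-bit hashes)
  • Distance around 128 or above - Unrelated images (two random 256-bit hashes differ in about half their bits)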
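To make the Hamming distance concrete, here is a minimal standard-library sketch that counts the differing bits between two equal-length byte slices. It is not imghash's implementation: hammingDistance is a hypothetical helper, and in real code pdq.Compare does this work for you.

```go
package main

import (
    "errors"
    "fmt"
    "math/bits"
)

// hammingDistance counts the bits that differ between two equal-length hashes.
// Hypothetical helper for illustration only.
func hammingDistance(a, b []byte) (int, error) {
    if len(a) != len(b) {
        return 0, errors.New("hash lengths differ")
    }
    dist := 0
    for i := range a {
        // XOR leaves a 1 bit wherever the two bytes disagree.
        dist += bits.OnesCount8(a[i] ^ b[i])
    }
    return dist, nil
}

func main() {
    a := make([]byte, 32) // 256-bit hash, all zeros
    b := make([]byte, 32)
    b[0] = 0x0F  // differs from a in 4 bits
    b[31] = 0x01 // differs from a in 1 more bit

    dist, err := hammingDistance(a, b)
    if err != nil {
        panic(err)
    }
    fmt.Println("distance:", dist) // distance: 5
}
```

A distance of 5 out of 256 bits would fall well inside the near-duplicate range listed above.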

Alternative Hash Sources

Hash from io.Reader

For streaming or HTTP sources:
import (
    "net/http"
    "github.com/ajdnik/imghash/v2"
)

resp, err := http.Get("https://example.com/image.jpg")
if err != nil {
    panic(err)
}
defer resp.Body.Close()

h, err := imghash.HashReader(pdq, resp.Body)
if err != nil {
    panic(err)
}

Hash from image.Image

If you already have an image.Image object:
import (
    "image"
    _ "image/jpeg" // register the JPEG decoder used by image.Decode
    "os"
    "github.com/ajdnik/imghash/v2"
)

f, err := os.Open("photo.jpg")
if err != nil {
    panic(err)
}
defer f.Close()

img, _, err := image.Decode(f)
if err != nil {
    panic(err)
}

// Hash the image directly
pdq, _ := imghash.NewPDQ()
h, err := pdq.Calculate(img)
if err != nil {
    panic(err)
}

Try Other Algorithms

package main

import (
    "fmt"
    "github.com/ajdnik/imghash/v2"
)

func main() {
    // Simple 64-bit average hash
    avg, err := imghash.NewAverage()
    if err != nil {
        panic(err)
    }

    h1, err := imghash.HashFile(avg, "image1.png")
    if err != nil {
        panic(err)
    }

    h2, err := imghash.HashFile(avg, "image2.png")
    if err != nil {
        panic(err)
    }

    dist, err := avg.Compare(h1, h2)
    if err != nil {
        panic(err)
    }

    fmt.Printf("Hamming distance: %v\n", dist)
}

Working with Hash Types

Binary Hashes

Binary hashes store bits efficiently:
pdq, _ := imghash.NewPDQ()
h, _ := imghash.HashFile(pdq, "image.png")

// Type assert to Binary
if binaryHash, ok := h.(imghash.Binary); ok {
    fmt.Printf("Hash bytes: %v\n", binaryHash)
    fmt.Printf("Length: %d bytes\n", binaryHash.Len())
    fmt.Printf("String: %s\n", binaryHash.String())
}

Inspecting Hashes

All hash types implement the Hash interface:
type Hash interface {
    String() string          // String representation
    Len() int                // Number of elements
    ValueAt(idx int) float64 // Value at index
}
Example:
pdq, _ := imghash.NewPDQ()
h, _ := imghash.HashFile(pdq, "image.png")

fmt.Printf("Hash: %s\n", h.String())
fmt.Printf("Length: %d\n", h.Len())
fmt.Printf("First value: %v\n", h.ValueAt(0))
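Because the interface is small, code written against it works with any hash type. The sketch below declares a local copy of the interface so it runs without the library, and byteHash is a hypothetical type that treats each byte as one element; how imghash's own types count elements is not assumed here.

```go
package main

import (
    "encoding/hex"
    "fmt"
)

// Hash mirrors the interface shown above so the sketch runs on its own.
type Hash interface {
    String() string
    Len() int
    ValueAt(idx int) float64
}

// byteHash is a hypothetical hash type with one element per byte.
type byteHash []byte

func (h byteHash) String() string          { return hex.EncodeToString(h) }
func (h byteHash) Len() int                { return len(h) }
func (h byteHash) ValueAt(idx int) float64 { return float64(h[idx]) }

// describe accepts any Hash, regardless of the concrete type behind it.
func describe(h Hash) string {
    return fmt.Sprintf("%s (%d elements, first=%v)", h.String(), h.Len(), h.ValueAt(0))
}

func main() {
    var h Hash = byteHash{0xDE, 0xAD, 0xBE, 0xEF}
    fmt.Println(describe(h)) // deadbeef (4 elements, first=222)
}
```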

Custom Distance Metrics

Override the default distance metric:
import (
    "github.com/ajdnik/imghash/v2"
    "github.com/ajdnik/imghash/v2/similarity"
)

// Use L1 distance instead of Hamming for comparison
dist, err := imghash.Compare(h1, h2, similarity.L1)
if err != nil {
    panic(err)
}
Available distance functions:
  • similarity.Hamming - Bit differences (Binary only)
  • similarity.L1 - Manhattan distance
  • similarity.L2 - Euclidean distance
  • similarity.Cosine - Cosine similarity
  • similarity.ChiSquare - Chi-square distance
  • similarity.Jaccard - Jaccard index
  • similarity.PCC - Pearson correlation coefficient
  • similarity.WeightedHamming - Weighted bit differences (Binary only)

Error Handling

Common errors to handle:
pdq, err := imghash.NewPDQ()
if err != nil {
    // Constructor errors (invalid options)
    panic(err)
}

h, err := imghash.HashFile(pdq, "image.png")
if err != nil {
    // File errors: file not found, unsupported format, corrupt image
    panic(err)
}

dist, err := pdq.Compare(h1, h2)
if err != nil {
    // Comparison errors: incompatible hash types, length mismatch
    panic(err)
}
Key error types:
  • imghash.ErrIncompatibleHash - Comparing incompatible hash types
  • imghash.ErrHashLengthMismatch - Hash lengths don’t match
  • imghash.ErrInvalidSize - Invalid dimension in constructor
  • imghash.ErrInvalidInterpolation - Invalid interpolation method

Performance Tips

Reuse hashers: Create the hasher once and reuse it for multiple images. Hashers are safe to use concurrently.
// Good - create once, use many times
pdq, err := imghash.NewPDQ()
if err != nil {
    panic(err)
}
for _, imagePath := range images {
    h, err := imghash.HashFile(pdq, imagePath)
    if err != nil {
        continue // skip files that cannot be hashed
    }
    _ = h // process hash
}

// Bad - creates a new hasher every iteration
for _, imagePath := range images {
    pdq, _ := imghash.NewPDQ() // Wasteful!
    h, _ := imghash.HashFile(pdq, imagePath)
    _ = h // process hash
}
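If hashers are safe for concurrent use as stated above, one hasher can be shared by a pool of goroutines. The sketch below shows that pattern with a hypothetical hasher interface and a fakeHasher that hashes the path string, so it runs without image files or the library; swap in a real hasher and imghash.HashFile for actual use.

```go
package main

import (
    "fmt"
    "hash/fnv"
    "sync"
)

// hasher stands in for a reusable image hasher (hypothetical interface).
type hasher interface {
    Hash(path string) (uint64, error)
}

// fakeHasher hashes the path string itself; real code would hash pixels.
type fakeHasher struct{}

func (fakeHasher) Hash(path string) (uint64, error) {
    h := fnv.New64a()
    h.Write([]byte(path))
    return h.Sum64(), nil
}

// hashAll shares one hasher across a fixed number of worker goroutines.
func hashAll(h hasher, paths []string, workers int) map[string]uint64 {
    if workers < 1 {
        workers = 1
    }
    jobs := make(chan string)
    results := make(map[string]uint64, len(paths))
    var mu sync.Mutex // guards results
    var wg sync.WaitGroup

    for w := 0; w < workers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for p := range jobs {
                v, err := h.Hash(p)
                if err != nil {
                    continue // skip files that fail to hash
                }
                mu.Lock()
                results[p] = v
                mu.Unlock()
            }
        }()
    }
    for _, p := range paths {
        jobs <- p
    }
    close(jobs)
    wg.Wait()
    return results
}

func main() {
    paths := []string{"a.png", "b.png", "c.png"}
    got := hashAll(fakeHasher{}, paths, 2)
    fmt.Println(len(got)) // 3
}
```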

Complete Example: Find Similar Images

Here’s a practical example that finds similar images in a directory:
package main

import (
    "fmt"
    "os"
    "path/filepath"
    "github.com/ajdnik/imghash/v2"
)

func main() {
    // Create PDQ hasher
    pdq, err := imghash.NewPDQ()
    if err != nil {
        panic(err)
    }

    // Hash all images in directory
    hashes := make(map[string]imghash.Hash)
    err = filepath.Walk("./images", func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err
        }
        if info.IsDir() {
            return nil
        }
        
        // Try to hash the file
        h, err := imghash.HashFile(pdq, path)
        if err != nil {
            // Skip files that cannot be opened or decoded (for example, non-images)
            return nil
        }
        
        hashes[path] = h
        return nil
    })
    if err != nil {
        panic(err)
    }

    // Find similar pairs
    const threshold = 32 // Distance threshold for "similar"
    
    paths := make([]string, 0, len(hashes))
    for path := range hashes {
        paths = append(paths, path)
    }

    for i := 0; i < len(paths); i++ {
        for j := i + 1; j < len(paths); j++ {
            dist, err := pdq.Compare(hashes[paths[i]], hashes[paths[j]])
            if err != nil {
                continue
            }
            
            if dist < threshold {
                fmt.Printf("Similar images (distance=%v):\n", dist)
                fmt.Printf("  %s\n", paths[i])
                fmt.Printf("  %s\n", paths[j])
            }
        }
    }
}
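The loop above reports each similar pair separately. If you would rather get groups of near-duplicates, one option is to merge pairs with a union-find structure. The sketch below uses 64-bit integer hashes and hypothetical names (find, groupSimilar) so it runs with the standard library alone; the same idea applies to distances returned by pdq.Compare.

```go
package main

import (
    "fmt"
    "math/bits"
)

// find returns the representative of x's group, shortening the path as it goes.
func find(parent []int, x int) int {
    for parent[x] != x {
        parent[x] = parent[parent[x]]
        x = parent[x]
    }
    return x
}

// groupSimilar clusters hashes whose Hamming distance is below threshold.
// It returns, for each input index, the index of its group representative.
func groupSimilar(hashes []uint64, threshold int) []int {
    parent := make([]int, len(hashes))
    for i := range parent {
        parent[i] = i
    }
    for i := 0; i < len(hashes); i++ {
        for j := i + 1; j < len(hashes); j++ {
            if bits.OnesCount64(hashes[i]^hashes[j]) < threshold {
                parent[find(parent, j)] = find(parent, i) // merge the two groups
            }
        }
    }
    groups := make([]int, len(hashes))
    for i := range groups {
        groups[i] = find(parent, i)
    }
    return groups
}

func main() {
    hashes := []uint64{
        0x0000000000000000,
        0x0000000000000003, // 2 bits from the first
        0xFFFFFFFFFFFFFFFF, // 64 bits from the first
        0xFFFFFFFFFFFFFFFE, // 1 bit from the third
    }
    fmt.Println(groupSimilar(hashes, 8)) // [0 0 2 2]
}
```

This grouping is transitive, so a chain of near-duplicates can place two images in the same group even when their direct distance exceeds the threshold. The pairwise scan is still quadratic in the number of images.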

Next Steps

Now that you can compute and compare hashes:
  • Learn about algorithm-specific options and customization
  • Explore different algorithms for your use case
  • Understand distance metrics and thresholds
  • Build a production duplicate detection system
  • Integrate with databases for large-scale search
