Hunyuan3D 2.0
Hunyuan3D 2.0 generates high-quality 3D models from single images or multiple views, outputting voxel representations that can be converted to meshes.
Architecture
Model Configuration:
- Image model: hunyuan3d2
- Latent format: Hunyuan3Dv2
- Memory usage factor: 3.5
- Sampling: Flow matching (shift 1.0, multiplier 1.0)
Components:
- Diffusion Model: DiT-based 3D generator
- VAE: 3D voxel compression/decompression
- CLIP Vision Encoder: Image conditioning
Latent Space:
- Channels: 64
- Resolution: Configurable (default 3072 voxels)
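To make the sampling entry more concrete, here is a minimal sketch of timestep shifting as it is commonly written for flow-matching samplers. The function name `shift_sigma` is hypothetical, and whether the Hunyuan3D sampler uses exactly this form is an assumption. The sketch only illustrates that a shift of 1.0 leaves the noise schedule unchanged.

```python
def shift_sigma(sigma: float, shift: float = 1.0) -> float:
    """Remap a noise level in [0, 1] with a common flow-matching shift formula.

    Assumed form (not confirmed for Hunyuan3D):
        shift * sigma / (1 + (shift - 1) * sigma)
    """
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# A toy schedule of five evenly spaced noise levels.
schedule = [i / 4 for i in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0

# shift = 1.0 is the identity mapping; larger shifts push values toward 1.0.
unshifted = [shift_sigma(s, 1.0) for s in schedule]
shifted = [shift_sigma(s, 3.0) for s in schedule]
```

With shift 1.0, `unshifted` equals `schedule`. A larger value such as 3.0 spends more of the schedule at high noise levels while keeping the endpoints at 0 and 1.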
Usage
Creating Empty Latent:
Voxel to Mesh Conversion
Basic Conversion:
Surface net algorithm:
- Calculates surface intersections per voxel
- Generates smoother, more organic shapes
- Aligns faces with density gradients
- Shows progress bar (can be slow for high resolution)
Basic algorithm:
- Creates cube faces for each solid voxel
- Removes internal faces
- Much faster but blockier output
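The cube-face approach (one face per exposed side of each solid voxel, with internal faces skipped) can be sketched in plain Python. This is an illustrative reimplementation, not the node's actual code. The function name `voxels_to_quads`, the `> threshold` test for "solid", and the [depth][height][width] nested-list input are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[int, int, int]

# For each axis direction: the neighbor offset (dx, dy, dz) and the four cube
# corners of the face pointing that way, ordered counter-clockwise from outside.
_FACES = [
    ((1, 0, 0), [(1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1)]),
    ((-1, 0, 0), [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0)]),
    ((0, 1, 0), [(0, 1, 0), (0, 1, 1), (1, 1, 1), (1, 1, 0)]),
    ((0, -1, 0), [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)]),
    ((0, 0, 1), [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]),
    ((0, 0, -1), [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)]),
]


def voxels_to_quads(
    voxels: Sequence[Sequence[Sequence[float]]], threshold: float = 0.6
) -> Tuple[List[Vec3], List[Tuple[int, ...]]]:
    """Return (vertices, quad faces) for the exposed sides of solid voxels."""
    depth, height, width = len(voxels), len(voxels[0]), len(voxels[0][0])

    def solid(z: int, y: int, x: int) -> bool:
        # Anything outside the grid counts as empty space.
        inside = 0 <= z < depth and 0 <= y < height and 0 <= x < width
        return inside and voxels[z][y][x] > threshold

    index: Dict[Vec3, int] = {}
    vertices: List[Vec3] = []
    faces: List[Tuple[int, ...]] = []
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if not solid(z, y, x):
                    continue
                for (dx, dy, dz), corners in _FACES:
                    if solid(z + dz, y + dy, x + dx):
                        continue  # internal face shared with a solid neighbor
                    quad = []
                    for cx, cy, cz in corners:
                        v = (x + cx, y + cy, z + cz)
                        if v not in index:  # share vertices between faces
                            index[v] = len(vertices)
                            vertices.append(v)
                        quad.append(index[v])
                    faces.append(tuple(quad))
    return vertices, faces
```

A single solid voxel yields 8 vertices and 6 quads. Two adjacent solid voxels yield 10 quads, because the two faces between them are never emitted.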
Mesh Export
Saving 3D Models:
Supported formats:
- GLB (binary glTF) - default and recommended
- GLTF (JSON glTF)
- OBJ
- FBX
- STL
- USDZ
Metadata:
- Workflow JSON stored in asset.extras
- Includes prompt and generation settings
- Disable with the --disable-metadata flag
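As a rough illustration of where this metadata lives, the sketch below builds the `asset` object of a glTF 2.0 document and attaches workflow data under `extras`. The helper name `build_asset_block` and the `workflow` key inside `extras` are hypothetical; the exact keys the exporter writes are not specified here.

```python
import json
from typing import Any, Dict, Optional


def build_asset_block(
    workflow: Optional[Dict[str, Any]], disable_metadata: bool = False
) -> Dict[str, Any]:
    """Return a glTF 2.0 `asset` object, optionally carrying workflow metadata."""
    asset: Dict[str, Any] = {"version": "2.0"}  # `version` is required by glTF
    if workflow is not None and not disable_metadata:
        # `extras` is the glTF slot for application-specific data.
        asset["extras"] = {"workflow": workflow}  # key name is an assumption
    return asset


# Placeholder workflow data, for illustration only.
example_workflow = {"nodes": [], "prompt": "a ceramic teapot"}

with_meta = json.dumps({"asset": build_asset_block(example_workflow)})
without_meta = json.dumps(
    {"asset": build_asset_block(example_workflow, disable_metadata=True)}
)
```

Because `extras` is ignored by viewers that do not understand it, the file stays a valid glTF either way.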
Hunyuan3D 2.1
Architecture Updates
Model Configuration:
- Image model: hunyuan3d2_1
- Latent format: Hunyuan3Dv2_1
- Same memory and performance characteristics as 2.0
Improvements:
- Enhanced quality
- Better geometry
- Improved texture consistency
Compatibility:
- Same nodes and workflow as 2.0
- Drop-in replacement
Hunyuan3D 2.0 Mini
Architecture
Model Configuration:
- Image model: hunyuan3d2
- Depth: 8 (reduced from standard)
- Latent format: Hunyuan3Dv2mini
- Faster inference, lower quality
Use Cases:
- Rapid prototyping
- Batch generation
- Preview generation
- Resource-constrained environments
Data Types
VOXEL Type
Structure:
- Shape: [depth, height, width]
- Values: Density/occupancy (-1.0 to 1.0)
- Normalized to [-1, 1] range
MESH Type
Structure:
- Origin at center
- Normalized to [-1, 1] range
- Right-handed coordinate system
- Faces use vertex indices (counter-clockwise winding)
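The centering and [-1, 1] normalization listed for meshes can be sketched as follows. This is one plausible way to do it (bounding-box midpoint plus a uniform scale), offered as an assumption rather than the exact method used by the nodes. `normalize_vertices` is a hypothetical helper and expects a non-empty vertex list.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def normalize_vertices(vertices: List[Vertex]) -> List[Vertex]:
    """Center vertices on the bounding-box midpoint and scale into [-1, 1]."""
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    # A uniform scale by the largest half-extent keeps proportions intact.
    # Fall back to 1.0 for a degenerate mesh where every vertex coincides.
    half_extent = max((hi - lo) / 2.0 for lo, hi in zip(mins, maxs)) or 1.0
    return [
        tuple((v[i] - center[i]) / half_extent for i in range(3))
        for v in vertices
    ]


normalized = normalize_vertices([(0.0, 0.0, 0.0), (2.0, 4.0, 2.0)])
```

Only the longest axis reaches exactly -1 and 1. Shorter axes stay proportionally smaller, so the shape is not stretched.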
File3D Type
Structure:
- Can save File3D directly (bypasses mesh conversion)
- Preserves original format or converts to GLB
Workflow Examples
Single Image to 3D
Multi-View 3D Reconstruction
Model Files
Default Locations
File Sizes
Hunyuan3D 2.0:
- Diffusion Model: ~8-12GB
- VAE: ~2-3GB
- CLIP Vision: ~1-2GB
Hunyuan3D 2.0 Mini:
- Diffusion Model: ~3-5GB (smaller depth)
- Same VAE and CLIP Vision
Performance Optimization
Memory Management
VRAM Requirements:
- Generation: 8-16GB (depending on resolution)
- VAE Decode: 4-8GB (depends on octree_resolution)
- Mesh Conversion: 2-8GB (depends on voxel resolution and algorithm)
Resolution Settings
Latent Resolution:
- Low quality (fast): 1536-2048 voxels
- Medium quality: 2048-3072 voxels
- High quality: 3072-4096 voxels
- Ultra quality (slow): 4096-8192 voxels
Octree Resolution (octree_resolution, VAE decode):
- Preview: 128
- Standard: 256
- High detail: 384-512
- Must be power of 2
Chunk Size (num_chunks, VAE decode):
- Lower values: slower, but use less memory
- Higher values: faster, but use more VRAM
- Default 8000 works well for most GPUs
- Range: 1000-500000
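The trade-off between these two settings can be estimated with a little arithmetic. The sketch below assumes the decoder evaluates a cubic grid of octree_resolution³ query points and processes them num_chunks points at a time. Both the grid size and the helper names are assumptions for illustration, and the real decoder may sample a slightly different grid.

```python
from typing import Iterator, Tuple


def chunk_ranges(total_points: int, num_chunks: int = 8000) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) index ranges of at most `num_chunks` points each."""
    for start in range(0, total_points, num_chunks):
        yield start, min(start + num_chunks, total_points)


def batch_count(octree_resolution: int, num_chunks: int = 8000) -> int:
    """Number of decode batches, assuming a grid of octree_resolution**3 points."""
    total_points = octree_resolution ** 3
    return -(-total_points // num_chunks)  # ceiling division


# Halving num_chunks roughly doubles the number of batches (slower),
# while each batch holds fewer points in memory at once.
batches_default = batch_count(256, 8000)
batches_small = batch_count(256, 4000)
```

Under these assumptions, dropping octree_resolution from 256 to 128 cuts the number of query points by a factor of eight, which is why it is usually the more effective memory lever.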
Speed Optimization
Generation:
- Use fewer sampling steps (20-30 is usually sufficient)
- CFG scale of 1.0 is often best for 3D
- Use the Mini model for rapid iteration
Mesh Conversion:
- Start with the basic algorithm for previews
- Use surface net for final export
- Lower threshold = more geometry = slower
- Higher threshold = less geometry = faster
Threshold Values:
- 0.5: Maximum detail (slow, large files)
- 0.6: Good balance (default)
- 0.7-0.8: Simplified geometry (fast, smaller files)
Advanced Techniques
Multi-Resolution Workflow
- Preview (resolution=1536, basic algorithm, threshold=0.7)
- Refine (resolution=3072, surface net, threshold=0.6)
- Final (resolution=4096, surface net, threshold=0.5)
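One way to keep these three stages consistent across runs is to store them as presets. The sketch below is a plain-Python illustration; `StagePreset`, `STAGES`, and `next_stage` are hypothetical names and are not part of any node API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StagePreset:
    name: str
    resolution: int   # latent resolution
    algorithm: str    # "basic" or "surface net"
    threshold: float  # voxel-to-mesh threshold


# Values copied from the three stages listed above.
STAGES: List[StagePreset] = [
    StagePreset("preview", 1536, "basic", 0.7),
    StagePreset("refine", 3072, "surface net", 0.6),
    StagePreset("final", 4096, "surface net", 0.5),
]


def next_stage(current: str) -> StagePreset:
    """Return the preset that follows `current`, or the last one if already final."""
    names = [stage.name for stage in STAGES]
    idx = names.index(current)
    return STAGES[min(idx + 1, len(STAGES) - 1)]
```

Moving through the stages raises the resolution and lowers the threshold together, so each step costs more than the last. It makes sense to advance only once the preview shape looks right.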
Batch Processing
View-Specific Conditioning
Turntable Setup:
View selection:
- Use front view only for symmetric objects
- Add left/right for asymmetric objects
- Add back view for complete coverage
View angles:
- Front: 0°
- Left: 90°
- Back: 180°
- Right: 270°
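When rendering or photographing reference views, the angles above can be turned into camera positions on a circle around the object. The sketch assumes the front view looks down the +Z axis and that angles rotate about the vertical Y axis. That convention, the default radius, and the names `VIEW_ANGLES` and `camera_position` are assumptions, so the left/right signs may need swapping for a given tool.

```python
import math
from typing import Dict, Tuple

# Angles in degrees, as listed above.
VIEW_ANGLES: Dict[str, float] = {
    "front": 0.0,
    "left": 90.0,
    "back": 180.0,
    "right": 270.0,
}


def camera_position(view: str, radius: float = 2.0) -> Tuple[float, float, float]:
    """Place a camera on a horizontal circle around the origin, facing inward.

    Assumes +Z is the front direction and rotation is about the Y axis.
    """
    theta = math.radians(VIEW_ANGLES[view])
    # Round to suppress floating-point noise such as 1.2e-16.
    x = round(radius * math.sin(theta), 6)
    z = round(radius * math.cos(theta), 6)
    return (x, 0.0, z)


positions = {view: camera_position(view) for view in VIEW_ANGLES}
```

Keeping the radius and camera height identical across all views helps the views stay consistent with one another.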
Export and Integration
3D Software Compatibility
Blender:
- Import GLB directly (File → Import → glTF 2.0)
- Preserves geometry and structure
- No materials/textures (geometry only)
Unity:
- Drag GLB into Assets folder
- Auto-converts to prefab
Unreal Engine:
- Import via Datasmith or FBX export from Blender
- GLB requires plugin
Other 3D software:
- Export as FBX for better compatibility
- Or import GLB via plugins
Post-Processing
Recommended Workflow:
- Import mesh into Blender
- Apply Remesh modifier (if needed)
- Add materials and textures
- UV unwrap
- Bake textures from reference image
- Export to final format
File Size Optimization
Reduce Polygon Count:
- Increase threshold value (0.7-0.8)
- Use Blender’s Decimate modifier
Compression:
- GLB is already a compact binary container
- Draco compression for web use (via gltf-pipeline)
Troubleshooting
Common Issues
Empty/No Geometry:
- Threshold too high (reduce to 0.5-0.6)
- Invalid input image (needs clear subject)
- Insufficient sampling steps
Blocky or Low-Detail Mesh:
- Use surface net algorithm instead of basic
- Increase latent resolution
- Increase octree_resolution
Out of Memory:
- Reduce resolution (3072 → 2048)
- Reduce octree_resolution (256 → 128)
- Reduce num_chunks (8000 → 4000)
- Enable the --lowvram flag
Slow Generation:
- Use mini model for previews
- Reduce resolution
- Use basic algorithm for quick tests
Inverted Normals:
- Blender: Select all → Mesh → Normals → Recalculate Outside
- Or flip winding order in code
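Flipping the winding order in code amounts to reversing the vertex order of each face. A minimal sketch, assuming faces are stored as tuples of vertex indices (`flip_winding` is a hypothetical helper name):

```python
from typing import List, Tuple

Face = Tuple[int, ...]


def flip_winding(faces: List[Face]) -> List[Face]:
    """Reverse the vertex order of every face, which inverts its normal.

    The first index is kept in place so each face still starts at the same
    vertex; only the direction of travel around the face changes.
    """
    return [face[:1] + face[:0:-1] for face in faces]


flipped = flip_winding([(0, 1, 2), (0, 1, 2, 3)])
```

Applying the function twice restores the original order, so it is safe to toggle while checking the result in a viewer.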