Pixal3D

Generates pixel-aligned high-fidelity 3D models with PBR textures from a single image

Features

Open Source, 3D

System Requirements

32GB RAM recommended. 42GB+ storage recommended.
Windows 10/11 64-bit: NVIDIA GPU with 16GB+ VRAM required.
Note: Update the NVIDIA GPU driver to a recent version before running.

Introduction

Notice & Disclaimer

  1. Code & Models: The core code is released under the MIT license, but certain bundled third-party weights are licensed for non-commercial use only (CC BY-NC 4.0).
  2. Fees: Your payment covers local environment setup and technical support only. It does not include a commercial license for any model.
  3. Commercial Risk: This tool is for research and educational use only. Commercial use is at your own risk and requires licenses from the original authors (e.g., Meta).

1. Project Basic Information

Pixal3D is an open-source project jointly developed by Tencent ARC Lab, Tsinghua University, and Victoria University of Wellington. It focuses on pixel-aligned, high-fidelity 3D generation from a single 2D image and has been accepted to SIGGRAPH 2026, a leading international computer graphics conference.

2. Core Functions & Product Features

  1. One-Click Image-to-3D Conversion: Upload a single 2D image to generate a 3D mesh in the widely supported GLB format, with fine geometric detail and standard PBR textures.
  2. Pixel-Level Alignment: Rather than loosely fusing image features through attention, as traditional methods do, it uses back-projection to establish a one-to-one correspondence between 2D pixels and 3D space. This preserves as much of the original image detail as possible, with results approaching professional 3D reconstruction quality.
  3. Low-VRAM Compatibility: A built-in low-VRAM mode loads models on demand to reduce peak VRAM usage, which lets it run on consumer-grade GPUs. The generation resolution is configurable.
  4. Flexible Usage Modes: Both command-line inference and an interactive Gradio web demo are provided. The complete training code and data preprocessing toolkit are also released, so developers can retrain or customize the models.
  5. Two Versions: The main branch uses the Trellis.2 backbone and performs better; the paper branch is based on Direct3D-S2 and is intended for reproducing the experimental results in the paper.
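To make the pixel-to-3D correspondence in item 2 more concrete, here is a minimal sketch of back-projection under a plain pinhole camera model. It is an illustration only, not Pixal3D's code: the intrinsics (fx, fy, cx, cy), the function names, the nearest-neighbour lookup, and the single-channel feature map are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) at a given depth to a camera-space 3D point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def project(point: Vec3, fx: float, fy: float,
            cx: float, cy: float) -> Tuple[float, float]:
    """Project a camera-space 3D point (z > 0) onto the image plane."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def lift_feature(point: Vec3, feature_map: List[List[float]],
                 fx: float, fy: float, cx: float, cy: float) -> Optional[float]:
    """Return the image feature at the pixel a 3D point projects to.

    Hypothetical helper: uses nearest-neighbour sampling and returns None
    when the point is behind the camera or lands outside the image.
    """
    if point[2] <= 0:
        return None
    u, v = project(point, fx, fy, cx, cy)
    col, row = int(round(u)), int(round(v))
    if 0 <= row < len(feature_map) and 0 <= col < len(feature_map[0]):
        return feature_map[row][col]
    return None


# Toy 3x3 feature map and made-up intrinsics.
features = [[0.0, 0.1, 0.2],
            [1.0, 1.1, 1.2],
            [2.0, 2.1, 2.2]]
fx = fy = 2.0
cx = cy = 1.0

point = back_project(2, 0, 4.0, fx, fy, cx, cy)   # pixel at column 2, row 0
feature = lift_feature(point, features, fx, fy, cx, cy)
```

Because projecting the lifted point returns the pixel it came from, each 3D location reads its feature from exactly one image location. That is the one-to-one property the item describes, in contrast to attention, which mixes features from many pixels.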

3. Application Scenarios

Typical applications include game asset production, film and animation modeling, metaverse digital assets, AR/VR virtual objects, e-commerce 3D product display, and cultural and creative digital modeling. In each case the aim is to lower the barrier to producing 3D assets.

4. Underlying Core Technology

It uses Trellis.2 and Direct3D-S2 as backbone networks. A three-stage cascaded generation framework (sparse structure → shape refinement → texture generation) combines a 3D latent diffusion model, pixel back-projection feature lifting, and sparse and dense VAE decoding to produce high-precision 3D output at progressively higher resolutions.
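The control flow of such a cascade can be sketched as follows. This is a structural illustration under stated assumptions, not the project's implementation: the Stage class, placeholder_loader, run_cascade, and the resolution values 64/256/512 are hypothetical, and each "model" is a stand-in that only records that its stage ran. Building each stage only when it is needed loosely mirrors the on-demand loading idea behind the low-VRAM mode.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]
Model = Callable[[State], State]


@dataclass
class Stage:
    name: str
    resolution: int              # placeholder value, not taken from the project
    load: Callable[[], Model]    # builds the stage's model only when called


def placeholder_loader(output_key: str) -> Callable[[], Model]:
    """Return a loader whose stand-in model just adds one key to the state."""
    def load() -> Model:
        def model(state: State) -> State:
            new_state = dict(state)
            new_state[output_key] = f"{output_key} placeholder"
            return new_state
        return model
    return load


def run_cascade(image: str, stages: List[Stage]) -> State:
    """Run the stages in order, feeding each stage the previous stage's output."""
    state: State = {"image": image}
    for stage in stages:
        model = stage.load()     # instantiate this stage on demand
        state = model(state)
        state["resolution"] = stage.resolution
        del model                # drop the reference before the next stage loads
    return state


STAGES = [
    Stage("sparse structure", 64, placeholder_loader("structure")),
    Stage("shape refinement", 256, placeholder_loader("shape")),
    Stage("texture generation", 512, placeholder_loader("texture")),
]

result = run_cascade("input.png", STAGES)
```

Each stage sees everything produced before it, so texture generation can condition on the refined shape, and only one stage's model is referenced at a time.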