github.com/PKU-YuanGroup/Helios @main sqlite

971 symbols 3,542 edges 122 files 208 documented · 21%
README

Helios: Real Real-Time Long Video Generation Model

⭐ 14B Real-Time Long Video Generation Model can be Cheaper, Faster but Keep Stronger than 1.3B ones ⭐
[![arXiv](https://img.shields.io/badge/arXiv-2603.04379-b31b1b.svg?logo=arxiv)](https://arxiv.org/abs/2603.04379) [![hf_paper](https://img.shields.io/badge/🤗-Paper%20In%20HF-red.svg)](https://huggingface.co/papers/2603.04379) [![Project Page](https://img.shields.io/badge/Project-Website-2ea44f)](https://pku-yuangroup.github.io/Helios-Page) [![hf_space](https://img.shields.io/badge/🤗-Gradio-00b4d8.svg)](https://huggingface.co/spaces/BestWishYsh/Helios-14B-RealTime) [![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-blue)](https://huggingface.co/collections/BestWishYsh/helios) [![ModelScope](https://img.shields.io/badge/🤖-ModelScope-purple)](https://modelscope.cn/collections/BestWishYSH/Helios) [![GitHub](https://img.shields.io/badge/GitHub-black?logo=github)](https://github.com/PKU-YuanGroup/Helios) [![GitCode](https://img.shields.io/badge/GitCodes-blue?logo=gitcode)](https://gitcode.com/weixin_47617277/Helios) [![Ascend](https://img.shields.io/badge/Inference-Ascend--NPU-red)](https://www.hiascend.com/) [![Diffusers](https://img.shields.io/badge/Inference-Diffusers-blueviolet)](https://github.com/huggingface/diffusers/pull/13208) [![SGLang Diffusion](https://img.shields.io/badge/Backend-SGLang--Diffusion-yellow)](https://github.com/sgl-project/sglang/pull/19782) [![vLLM-Omni](https://img.shields.io/badge/Backend-vLLM--Omni-orange)](https://github.com/vllm-project/vllm-omni/pull/1604)

This repository is the official implementation of Helios, a breakthrough video generation model that achieves minute-scale, high-quality video synthesis at 19.5 FPS on a single H100 GPU (about 10 FPS on a single Ascend NPU), without relying on conventional long-video anti-drifting strategies or standard video acceleration techniques.

✨ Highlights

  1. Without commonly used anti-drifting strategies (e.g., self-forcing, error-banks, keyframe sampling, or inverted sampling), Helios generates minute-scale videos with high quality and strong coherence.

  2. Without standard acceleration techniques (e.g., KV-cache, causal masking, sparse/linear attention, TinyVAE, progressive noise schedules, hidden-state caching, or quantization), Helios achieves 19.5 FPS in end-to-end inference on a single H100 GPU.

  3. We introduce optimizations that improve both training and inference throughput while reducing memory consumption, enabling image-diffusion-scale batch sizes during training while fitting up to four 14B models within 80 GB of GPU memory.
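
To put the 19.5 FPS figure in perspective, the sketch below estimates wall-clock generation time from a frame count and a throughput. The function names `generation_seconds` and `realtime_factor` are hypothetical, and the calculation assumes throughput stays constant for the whole video, which is a simplification rather than a measured property of Helios.

```python
def generation_seconds(num_frames: int, throughput_fps: float) -> float:
    """Wall-clock seconds needed to generate num_frames at a steady throughput."""
    if throughput_fps <= 0:
        raise ValueError("throughput_fps must be positive")
    return num_frames / throughput_fps


def realtime_factor(throughput_fps: float, playback_fps: float) -> float:
    """Ratio of generation speed to playback speed.

    A value of 1.0 or more means frames are produced at least as fast
    as they are played back at playback_fps.
    """
    if playback_fps <= 0:
        raise ValueError("playback_fps must be positive")
    return throughput_fps / playback_fps


# Example: time to generate 1452 frames at the reported 19.5 FPS.
print(f"{generation_seconds(1452, 19.5):.1f} s")

# Compare the reported throughput with two common playback rates.
for playback in (16, 24):
    print(playback, "FPS playback ->", realtime_factor(19.5, playback))
```

The same helpers can be reused with the ~10 FPS Ascend NPU figure or the 20.89 FPS user report to compare hardware.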

🎬 Video Demos

Demo Video of Helios or you can click here to get the video. Some best prompts are here.

📣 Latest News!!

  • [2026.03.26] 🔥 Add summary of FAQ, Tips, and Tutorials: https://github.com/PKU-YuanGroup/Helios/issues/47.
  • [2026.03.24] 👋 A community-made, unofficial YouTube tutorial for Helios is available here. It covers installation on a consumer-grade PC and supports 4K video generation.
  • [2026.03.20] 🚀 Helios now supports Ahead-of-Time Compilation (AOTI) on Spaces, with special thanks to the HuggingFace Team! Please refer to this Space for a usage example.
  • [2026.03.20] 🔧 Based on issue #38, we've identified several ways to further improve Helios's performance, such as fixing the i2v train-inference inconsistency and fully enabling Easy Anti-Drifting. Please refer to commits and correct.yaml for details.
  • [2026.03.12] ⚡️ Please note that real-time generation performance depends not only on the GPU, but also on the CPU, memory, CUDA driver version, etc. As tested by a user on better hardware with a single H100, Helios can reach up to 20.89 FPS!
  • [2026.03.08] 🚀 Helios now fully supports Group Offloading and Context Parallelism! These features significantly reduce VRAM usage (to only ~6 GB) and enable inference across multiple GPUs with Ulysses Attention, Ring Attention, Unified Attention, and Ulysses Anything Attention.
  • [2026.03.06] 👋 Cache-DiT now supports Helios, offering Full Cache Acceleration and Parallelism support! Special thanks to the Cache-DiT Team for their amazing work.
  • [2026.03.06] 🔧 We fixed the Parallel Inference logic for Helios, and provide an example here.
  • [2026.03.06] 🚀 We have officially released the Gradio Demo. You are welcome to try it.
  • [2026.03.05] 🔥 We are excited to announce the release of the Helios technical report on arXiv. We welcome discussions and feedback!
  • [2026.03.04] 👋 Day-0 support for Ascend-NPU, with sincere gratitude to the Ascend Team for their support.
  • [2026.03.04] 👋 Day-0 support for Diffusers, with special thanks to the HuggingFace Team for their support.
  • [2026.03.04] 👋 Day-0 support for SGLang-Diffusion, with huge thanks to the SGLang Team for their support.
  • [2026.03.04] 👋 Day-0 support for vLLM-Omni, with heartfelt gratitude to the vLLM Team for their support.
  • [2026.03.04] 🔥 We've released the training/inference code and weights of Helios-Base, Helios-Mid and Helios-Distilled.

🔥 Friendly Links

If your work has improved Helios and you would like more people to see it, please inform us.

  • Ascend-NPU: Developed by Huawei, this hardware is designed for efficient AI model training and inference, boosting performance in tasks like computer vision, natural language processing, and autonomous driving.
  • Diffusers: A popular library designed for working with diffusion models and other generative models in deep learning. It supports easy integration and manipulation of a wide range of generative models.
  • SGLang-Diffusion: An inference framework for accelerated image and video generation using diffusion models. It provides an end-to-end unified pipeline with optimized kernels and an efficient scheduler loop.
  • vLLM-Omni: A fully disaggregated serving system for any-to-any models. vLLM-Omni breaks complex architectures into a stage-based graph, using a decoupled backend to maximize resource efficiency and throughput.
  • Cache-DiT: A PyTorch-native and flexible inference engine with hybrid cache acceleration and parallelism for DiTs. It is built on top of the Diffusers library and now supports nearly all DiTs from Diffusers.

⚙️ Requirements and Installation

Video Tutorial

If you prefer a step-by-step walkthrough, check out this community-made YouTube Tutorial. It covers local installation, 4K video generation, and how to run Helios on a consumer-grade PC, along with other practical usage tips.

Prepare Environment

# 0. Clone the repo
git clone --depth=1 https://github.com/PKU-YuanGroup/Helios.git
cd Helios

# 1. Create conda environment
conda create -n helios python=3.11.2
conda activate helios

# 2. Install PyTorch (run only the one command that matches your CUDA version)
# CUDA 12.6
pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu126
# CUDA 12.8
pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu128
# CUDA 13.0
pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu130

# 3. Install dependencies
bash install.sh

Model Download

| Models | Download Link | Supports | Notes |
|---|---|---|---|
| Helios-Base | 🤗 Huggingface 🤖 ModelScope | T2V ✅ I2V ✅ V2V ✅ Interactive ✅ | Best Quality, with v-prediction, standard CFG and custom HeliosScheduler. |
| Helios-Mid | 🤗 Huggingface 🤖 ModelScope | T2V ✅ I2V ✅ V2V ✅ Interactive ✅ | Intermediate Ckpt, with v-prediction, CFG-Zero* and custom HeliosScheduler. |
| Helios-Distilled | 🤗 Huggingface 🤖 ModelScope | T2V ✅ I2V ✅ V2V ✅ Interactive ✅ | Best Efficiency, with x0-prediction and custom HeliosDMDScheduler. |

💡Note:

  • All three models share the same architecture, but Helios-Mid and Helios-Distilled use a more aggressive multi-scale sampling pipeline to achieve better efficiency.
  • Helios-Mid is an intermediate checkpoint produced while distilling Helios-Base into Helios-Distilled, and may not meet expected quality.
  • Because training is based on Text-to-Video, Image-to-Video and Video-to-Video may be slightly inferior to Text-to-Video. If the first few chunks are static, you may enable is_skip_first_chunk or increase the values of image_noise_sigma_min, image_noise_sigma_max, video_noise_sigma_min, and video_noise_sigma_max.

Download models using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download BestWishYSH/Helios-Base --local-dir BestWishYSH/Helios-Base
huggingface-cli download BestWishYSH/Helios-Mid --local-dir BestWishYSH/Helios-Mid
huggingface-cli download BestWishYSH/Helios-Distilled --local-dir BestWishYSH/Helios-Distilled

Download models using modelscope-cli:

pip install modelscope
modelscope download BestWishYSH/Helios-Base --local_dir BestWishYSH/Helios-Base
modelscope download BestWishYSH/Helios-Mid --local_dir BestWishYSH/Helios-Mid
modelscope download BestWishYSH/Helios-Distilled --local_dir BestWishYSH/Helios-Distilled
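
For scripted setups, the Hugging Face downloads above can also be sketched in Python. `snapshot_download` is the `huggingface_hub` function for fetching a whole repository; the helper names `local_dir_for` and `download_checkpoints` are hypothetical, and the directory layout simply mirrors the `--local-dir` convention used in the commands above.

```python
from pathlib import Path

# Repository IDs exactly as they appear in the CLI commands above.
HELIOS_REPOS = (
    "BestWishYSH/Helios-Base",
    "BestWishYSH/Helios-Mid",
    "BestWishYSH/Helios-Distilled",
)


def local_dir_for(repo_id: str, root: str = ".") -> Path:
    """Mirror the CLI convention: the local directory equals the repo ID."""
    return Path(root) / repo_id


def download_checkpoints(repo_ids=HELIOS_REPOS, root: str = ".") -> list[Path]:
    """Download each checkpoint into <root>/<repo_id> and return the paths."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    paths = []
    for repo_id in repo_ids:
        target = local_dir_for(repo_id, root)
        snapshot_download(repo_id=repo_id, local_dir=str(target))
        paths.append(target)
    return paths
```

Calling `download_checkpoints(["BestWishYSH/Helios-Distilled"])` would fetch only the distilled checkpoint, which may be enough if efficiency is the priority.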

🚀 Inference

Helios uses an autoregressive approach that generates 33 frames per chunk. For optimal performance, num_frames should be set to a multiple of 33. If a non-multiple value is provided, it will be automatically rounded up to the nearest multiple of 33.

Example frame counts for different video lengths:

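| num_frames | Adjusted Frames | Duration at 24 FPS | Duration at 16 FPS |
|---|---|---|---|
| 1449 | 1452 (33×44) | ~60s (1min) | ~90s (1min 30s) |
| 720 | 726 (33×22) | ~30s | ~45s |
| 240 | 264 (33×8) | ~11s | ~16s |
| 129 | 132 (33×4) | ~5.5s | ~8s |
| 81 | 99 (33×3) | ~4s | ~6s |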
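
The rounding rule and the durations in the table can be reproduced with a few lines of Python. This is a minimal sketch of the arithmetic described above, not the repository's own code; the names `CHUNK_FRAMES`, `adjust_num_frames`, and `duration_seconds` are hypothetical.

```python
CHUNK_FRAMES = 33  # frames generated per autoregressive chunk, per the text above


def adjust_num_frames(num_frames: int, chunk: int = CHUNK_FRAMES) -> int:
    """Round num_frames up to the nearest multiple of chunk."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Integer ceiling division avoids floating-point rounding issues.
    num_chunks = (num_frames + chunk - 1) // chunk
    return num_chunks * chunk


def duration_seconds(num_frames: int, fps: float) -> float:
    """Playback duration of num_frames at the given frame rate."""
    return num_frames / fps


for requested in (1449, 720, 240, 129, 81):
    adjusted = adjust_num_frames(requested)
    print(
        requested,
        adjusted,
        f"{duration_seconds(adjusted, 24):.1f}s @ 24 FPS",
        f"{duration_seconds(adjusted, 16):.1f}s @ 16 FPS",
    )
```

Passing a value that is already a multiple of 33 returns it unchanged, so choosing such a value up front keeps the requested and generated lengths identical.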

Run the model

We provide inference scripts for all models covering text-to-video, image-to-video, and video-to-video in this directory.

cd scripts/inference

# For Helios-Base
bash helios-base_t2v.sh
bash helios-base_i2v.sh
bash helios-base_v2v.sh

# For Helios-Mid
bash helios-mid_t2v.sh
bash helios-mid_i2v.sh
bash helios-mid_v2v.sh

# For Helios-Distilled
bash helios-distilled_t2v.sh
bash helios-distilled_i2v.sh
bash helios-distilled_v2v.sh

# For Interactive (path is relative to the repository root)
# ⚠️ This feature is still under development; results may not always meet expectations
cd scripts/inference/experiment_interactive
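
The script names above follow a `helios-<model>_<task>.sh` pattern, so they can be enumerated or launched programmatically. The sketch below is an illustration under that assumption; `script_name`, `all_scripts`, and `run_script` are hypothetical helpers, and `run_script` assumes it is called from the repository root with `bash` available.

```python
from itertools import product

MODELS = ("base", "mid", "distilled")
TASKS = ("t2v", "i2v", "v2v")


def script_name(model: str, task: str) -> str:
    """Build the inference script file name for a model/task pair."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return f"helios-{model}_{task}.sh"


def all_scripts() -> list[str]:
    """List the nine model/task scripts shown above."""
    return [script_name(m, t) for m, t in product(MODELS, TASKS)]


def run_script(model: str, task: str, script_dir: str = "scripts/inference"):
    """Run one script from inside script_dir, raising if it exits non-zero."""
    import subprocess

    return subprocess.run(
        ["bash", script_name(model, task)], cwd=script_dir, check=True
    )


print(all_scripts())
```

For example, `run_script("distilled", "t2v")` would be equivalent to running `bash helios-distilled_t2v.sh` inside `scripts/inference`.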

Sanity Check

Before trying your own inputs, we highly recommend running the sanity check to catch any hardware or software problems.

Task Helios-Base Helios-Mid Helios-Distilled

Core symbols most depended-on inside this repo

  • to · called by 447 · helios/utils/create_ema_zero3.py
  • update · called by 228 · eval/utils/third_party/amt/utils/utils.py
  • from_pretrained · called by 49 · helios/utils/create_ema_zero3.py
  • load_state_dict · called by 45 · helios/utils/create_ema_zero3.py
  • resize · called by 42 · eval/utils/third_party/amt/networks/blocks/ifrnet.py
  • read · called by 37 · eval/utils/third_party/amt/utils/utils.py
  • encode · called by 29 · eval/utils/third_party/ViCLIP/simple_tokenizer.py
  • img2tensor · called by 26 · eval/utils/third_party/amt/utils/utils.py

Shape

Method 467
Function 380
Class 124

Languages

Python 100%

Modules by API surface

helios/modules/transformer_helios.py · 52 symbols
helios/diffusers_version/transformer_helios_diffusers.py · 30 symbols
eval/utils/third_party/amt/losses/loss.py · 30 symbols
helios/utils/utils_helios_post.py · 28 symbols
eval/utils/third_party/amt/utils/utils.py · 28 symbols
helios/diffusers_version/scheduling_helios_diffusers.py · 27 symbols
helios/utils/utils_base.py · 24 symbols
helios/pipelines/pipeline_helios_ode.py · 24 symbols
helios/pipelines/pipeline_helios.py · 24 symbols
helios/diffusers_version/pipeline_helios_diffusers.py · 24 symbols
helios/dataset/dataloader_mp4_dist.py · 24 symbols
eval/utils/third_party/ViCLIP/viclip_vision.py · 22 symbols

Dependencies from manifests, versioned

Brotli 1.1.0 · 1×
Cython 3.1.2 · 1×
Deprecated 1.2.18 · 1×
Flask 2.3.3 · 1×
GPUtil 1.4.0 · 1×
GitPython 3.1.44 · 1×
ImageIO 2.37.2 · 1×
Jinja2 3.1.3 · 1×
Markdown 3.8.2 · 1×
MarkupSafe 2.1.5 · 1×
PyGObject 3.42.2 · 1×
PyJWT 2.10.1 · 1×

For agents

$ claude mcp add Helios \
  -- python -m otcore.mcp_server <graph>
