AI Model Support on Livepeer

Which models can run on Livepeer, and which are best suited?

This page lists model families commonly used via ComfyUI, with a compatibility rating for each against Livepeer’s real-time, GPU-worker constraints. A listing does not imply that a model is officially supported or pre-loaded on the network; the rating reflects only whether the model’s execution shape fits Livepeer well.

Legend

✓ Likely runnable - fits real-time / GPU-worker constraints
⚠ Conditional - depends on latency, VRAM, orchestration, or batching
✗ Not suitable - design mismatch: stateful, CPU-bound, or non-deterministic
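
The legend can be read as a simple precedence rule: any design mismatch outranks any conditional concern. The sketch below encodes that reading. The `ModelTraits` fields and the `rate` function are hypothetical names chosen to mirror the legend's wording; they are not part of any Livepeer or ComfyUI API, and real ratings involve judgment that a boolean checklist cannot capture.

```python
from dataclasses import dataclass
from enum import Enum


class Fit(Enum):
    LIKELY = "✓"
    CONDITIONAL = "⚠"
    NOT_SUITABLE = "✗"


@dataclass(frozen=True)
class ModelTraits:
    # Hypothetical descriptors, named after the legend's wording.
    # Design mismatches (the ✗ row):
    stateful: bool = False
    cpu_bound: bool = False
    non_deterministic: bool = False
    # Conditional concerns (the ⚠ row):
    latency_sensitive: bool = False
    vram_heavy: bool = False
    needs_orchestration: bool = False
    needs_batching: bool = False


def rate(traits: ModelTraits) -> Fit:
    """Map traits to a legend symbol; a design mismatch always wins."""
    if traits.stateful or traits.cpu_bound or traits.non_deterministic:
        return Fit.NOT_SUITABLE
    if (
        traits.latency_sensitive
        or traits.vram_heavy
        or traits.needs_orchestration
        or traits.needs_batching
    ):
        return Fit.CONDITIONAL
    return Fit.LIKELY


if __name__ == "__main__":
    print(rate(ModelTraits()).value)
    print(rate(ModelTraits(vram_heavy=True)).value)
    print(rate(ModelTraits(stateful=True, vram_heavy=True)).value)
```

Checking the mismatch traits first means a model that is both stateful and VRAM-heavy is rated ✗ rather than ⚠, which matches the ordering of severity in the legend.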

1. Diffusion Models (Image / Video)

Stable Diffusion family

Model	Fit	Notes

Why blocked (DeepFloyd): VRAM pressure, multi-stage graphs, inference latency.
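
VRAM pressure from a multi-stage graph can be roughed out with arithmetic: weight memory is approximately parameter count times bytes per parameter, summed over every stage that must stay resident on the GPU. The sketch below is a lower bound under that assumption only. It ignores activations, framework overhead, and any offloading, and the function names and the parameter counts in the usage lines are hypothetical placeholders, not measurements of any real model.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str = "fp16") -> float:
    """Lower-bound memory for model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


def pipeline_weight_memory_gib(stage_params: list[int], dtype: str = "fp16") -> float:
    """Weights for all stages, assuming every stage stays resident at once."""
    return sum(weight_memory_gib(n, dtype) for n in stage_params)


if __name__ == "__main__":
    # Hypothetical three-stage pipeline; the counts are illustrative only.
    stages = [4_000_000_000, 1_000_000_000, 500_000_000]
    print(f"{pipeline_weight_memory_gib(stages, 'fp16'):.2f} GiB of weights")
```

If the total approaches the worker's available VRAM before activations are counted, the model is unlikely to run without swapping stages in and out, which adds latency of its own.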

Video diffusion models

Model	Fit	Notes

Why blocked (batch video): temporal state, batch-only execution, non-real-time.

2. Control & Conditioning Models

ControlNet

Model	Fit	Notes

T2I / I2I Adapters

Model	Fit	Notes

3. Encoders, VAEs, and Latents

Model	Fit	Notes

4. Vision Models (Non-Diffusion)

Detection / Segmentation

Model	Fit	Notes

Depth / Geometry

Model	Fit	Notes

5. Face, Pose & Human Models

Model	Fit	Notes

6. Audio & Music Models

Model	Fit	Notes

Why blocked: long context windows, non-frame-based execution.

For real-time audio workloads (live ASR, live translation, streaming transcription), see Workload Fit → ASR pipeline examples. These use Whisper or similar and are excellent fits.

7. Multimodal & VLMs

Model	Fit	Notes

8. LLMs (Text-Centric)

Model	Fit	Notes

Why blocked: token streaming, memory residency, orchestration mismatch.

9. 3D / NeRF / World Models

Model	Fit	Notes

10. Utility / Pre/Post Models

Model	Fit	Notes

Core takeaway

ComfyUI can orchestrate almost any PyTorch model. But:

Livepeer favours stateless, frame-based, deterministic inference
Long-running, stateful, or batch-only models are fundamentally incompatible
Real-time video imposes hard physics limits, not software ones

This matrix is intentionally conservative. If your model doesn’t appear here, apply the Workload Fit decision tree to evaluate it.
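
The point about hard limits in real-time video can be made concrete with a frame-budget calculation: at a given frame rate, each frame has 1000 / fps milliseconds available for everything that happens to it. The sketch below is a simplification that assumes one worker processing one frame at a time with no pipelining; `frame_budget_ms`, `fits_budget`, and `overhead_ms` are hypothetical names, and the timings in the usage lines are illustrative rather than measured.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fits_budget(inference_ms: float, fps: float, overhead_ms: float = 0.0) -> bool:
    """True if inference plus overhead (decode, encode, transfer) fits one frame interval."""
    return inference_ms + overhead_ms <= frame_budget_ms(fps)


if __name__ == "__main__":
    for fps in (24, 30, 60):
        print(f"{fps} fps -> {frame_budget_ms(fps):.1f} ms per frame")
    # Illustrative numbers: 25 ms of inference fits at 30 fps on its own,
    # but not once 10 ms of overhead is added.
    print(fits_budget(25.0, 30))
    print(fits_budget(25.0, 30, overhead_ms=10.0))
```

No software change enlarges the budget at a fixed frame rate; the only options are a faster model, a faster GPU, or a lower frame rate.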

See also

Workload Fit

Decision tree for evaluating whether your use case belongs on Livepeer.

BYOC

Bring your own container - run custom models via ComfyUI or a custom server.
