This is not model hosting in the Hugging Face sense. You are hosting an inference service, not a model artefact. The distinction matters - see How Livepeer routes by capability, not model below.
What BYOC is (and isn’t)
BYOC (Bring Your Own Container) lets you run your own AI inference server inside a Docker container on a Livepeer orchestrator, and the network treats it as a callable AI capability. Livepeer does not restrict you to a fixed model catalogue or pre-approved models. Technically, any Hugging Face model can be containerised and run via BYOC. But Livepeer is optimised for low-latency, GPU-bound, real-time inference - especially for video and vision workloads. Models that violate these assumptions will be inefficient, poorly routed, or uneconomic.

| Fit | Model / workload types |
|---|---|
| Good | Real-time diffusion (SD / SDXL image-to-image), depth, segmentation, pose, style transfer |
| Poor | High-latency, batch-oriented, or CPU-bound workloads that break the real-time assumption |
How Livepeer routes by capability, not model
Livepeer intentionally avoids model marketplaces, model-branded APIs, and centralised catalogues. Instead, it routes by capability descriptors such as:

- image-to-image
- video-to-video
- depth
- segmentation
- style-transfer

Routing by capability rather than by model name means:
- Models can be swapped or updated without breaking downstream apps
- No vendor lock-in at the model layer
- Performance-based competition between orchestrators
- Apps never need direct knowledge of which model runs their job
Implementation patterns
Pattern A - Real-time diffusion
Best for style transfer, image-to-image, live video effects.

- Hugging Face SD / SDXL weights
- StreamDiffusion or ComfyUI-style pipelines
- Frame-in → frame-out processing
- Persistent GPU residency
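The defining shape of Pattern A is load-once, process-many. A sketch with the diffusion pipeline stubbed out - in practice the stub would be a StreamDiffusion or diffusers SD/SDXL img2img pipeline kept resident on the GPU:

```python
class StyleTransferWorker:
    def __init__(self):
        # Load weights exactly once, at container start (warm start).
        self.pipeline = self._load_pipeline()

    def _load_pipeline(self):
        # Stand-in for e.g. StableDiffusionImg2ImgPipeline.from_pretrained(...).to("cuda").
        # Here: a trivial byte-reversal "model" so the sketch runs anywhere.
        return lambda frame: bytes(reversed(frame))

    def process_frame(self, frame: bytes) -> bytes:
        # Frame in, frame out - no per-request model loading.
        return self.pipeline(frame)

worker = StyleTransferWorker()
```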
Pattern B - Vision utility node
Best for sub-tasks inside larger video pipelines.

- Depth, segmentation, or pose models
- Extremely fast per-frame inference
- Used as conditioning steps feeding into diffusion
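A Pattern B node emits a conditioning map, not a finished frame, and must do so fast enough to sit inside a per-frame budget. A stub sketch (a real node would run an actual depth or segmentation checkpoint):

```python
def depth_map(frame: list[list[int]]) -> list[list[float]]:
    """Stub depth estimator: normalise 8-bit pixel intensities to [0, 1].

    Stands in for a real per-frame depth model; the output is meant to be
    consumed as conditioning by a downstream generation stage.
    """
    return [[px / 255 for px in row] for row in frame]
```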
Pattern C - Hybrid pipeline
Best for differentiated orchestrator offerings.

- Vision model output feeds conditioning into diffusion
- Vision → condition → generation chain
- Strong competitive differentiation in the marketplace
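The vision → condition → generation chain is just function composition. A sketch with both stages stubbed (the vision stage stands in for depth/segmentation, the generation stage for ControlNet-style conditioned diffusion):

```python
def estimate_condition(frame: list[int]) -> dict:
    # Vision stage: derive conditioning data from the frame.
    # Stub: use the mean intensity as the "condition".
    return {"condition": sum(frame) / len(frame)}

def generate(frame: list[int], condition: dict) -> list[int]:
    # Generation stage, conditioned on the vision output.
    # Stub: brighten each pixel by the condition value, clamped to 8 bits.
    return [min(255, int(px + condition["condition"])) for px in frame]

def hybrid_pipeline(frame: list[int]) -> list[int]:
    # Vision -> condition -> generation, as one callable capability.
    return generate(frame, estimate_condition(frame))
```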
Hard constraints
Ignoring these will degrade routing priority and reduce job assignment:

- Low per-frame latency - workloads are real-time, not batch
- GPU residency - models must fit in GPU memory and stay loaded between requests
- Warm starts - cold model loads stall live pipelines

Setup
Build your inference server
You are packaging a server, not just a model - typically a lightweight Python web server wrapping your Hugging Face weights.

Your server is responsible for:
- A /infer (or equivalent) endpoint
- Input validation
- GPU memory management
- Optional batching
- Warm start behaviour
Containerise the server
Build a Docker image that:
- Boots quickly
- Loads models deterministically
- Exposes a stable internal endpoint
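A sketch of such a Dockerfile. The base image tag, file names (server.py, requirements.txt, weights/), and port are assumptions, not a prescribed layout:

```dockerfile
# Illustrative only - adapt the base image and paths to your stack.
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Bake weights into the image so startup is deterministic - no network pulls at boot.
COPY weights/ /app/weights/
COPY server.py .

EXPOSE 8000
CMD ["python3", "server.py"]
```

Copying weights into the image trades a larger image for the fast, deterministic boots the network rewards.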
Configure your node
Edit config.yaml. For a custom inference server, set the endpoint the orchestrator will proxy:
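A sketch of what that configuration might look like. The key names below are illustrative, not the authoritative schema - consult your orchestrator's configuration reference for the exact fields:

```yaml
# Illustrative only - field names depend on your orchestrator's config schema.
external_capabilities:
  - name: style-transfer
    url: http://127.0.0.1:8000/infer   # your container's internal endpoint
    price_per_unit: 3000
```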
Start the gateway node
Register on-chain (optional)
Register your node on Arbitrum so gateways can discover you and route work automatically.

Contract and ABI references: Contract Addresses
Pricing and discovery
- Set pricing per request, frame, or second
- Pricing is advertised off-chain
- Settlement occurs via Livepeer tickets
- Gateways discover and route to you automatically
- Applications never interact with Hugging Face or your orchestrator directly
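The three pricing bases are simple conversions of one another. Illustrative arithmetic only - the unit values are made up, and prices are shown as integers in the smallest unit to avoid rounding:

```python
def per_second_price(per_frame_price: int, fps: int) -> int:
    """Price for one second of video at a given frame rate."""
    return per_frame_price * fps

def per_request_price(per_frame_price: int, frames_in_request: int) -> int:
    """Price for a single request containing a known number of frames."""
    return per_frame_price * frames_in_request
```

Whichever basis you advertise, gateways compare the effective per-unit cost when choosing between orchestrators.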
See also
ComfyStream
ComfyUI-based pipelines for real-time video AI - node graphs, plugins, and gateway binding.
Workload Fit
Decide whether your model or use case belongs on Livepeer before you build.
Model Support
Full model family compatibility matrix for ComfyUI on Livepeer.
Hosting Models on Orchestrators
The operator-side view: how GPU node operators host and advertise models.