# Add a Hugging Face Model to Livepeer

> Configure an existing Livepeer pipeline to serve a Hugging Face model. Declare the model, pre-download weights, restart the orchestrator, and verify end-to-end through your own self-hosted gateway.

export const TableCell = ({children, align = "left", header = false, style = {}, className = "", ...rest}) => {
  const Component = header ? "th" : "td";
  return <Component className={className} style={{
    padding: "0.75rem 1rem",
    textAlign: align,
    border: header ? "none" : "1px solid var(--lp-color-border-default)",
    ...style
  }} {...rest}>
      {children}
    </Component>;
};

export const TableRow = ({children, header = false, hover = false, style = {}, className = "", ...rest}) => {
  const rowId = `table-row-${Math.random().toString(36).substr(2, 9)}`;
  return <>
      {hover && <style>{`
          #${rowId}:hover {
            background-color: var(--lp-color-bg-card);
          }
        `}</style>}
      <tr id={rowId} className={className} style={{
    ...header && ({
      backgroundColor: "var(--lp-color-accent-strong)",
      color: "var(--lp-color-on-accent)",
      fontWeight: "bold"
    }),
    ...style
  }} {...rest}>
        {children}
      </tr>
    </>;
};

export const StyledTable = ({children, variant = "default", style = {}, className = "", ...rest}) => {
  const wrapperVariants = {
    default: {
      border: "1px solid var(--lp-color-border-default)",
      backgroundColor: "var(--lp-color-bg-card)",
      overflow: "hidden"
    },
    bordered: {
      border: "2px solid var(--lp-color-accent)",
      backgroundColor: "var(--lp-color-bg-page)",
      overflow: "hidden"
    },
    minimal: {
      border: "none",
      backgroundColor: "transparent",
      overflow: "visible"
    }
  };
  return <div data-docs-styled-table-shell className={className} style={{
    width: "100%",
    padding: 0,
    margin: 0,
    ...wrapperVariants[variant],
    ...style
  }} {...rest}>
      <table data-docs-styled-table style={{
    width: "100%",
    borderCollapse: "collapse",
    borderSpacing: 0,
    margin: 0,
    backgroundColor: "transparent"
  }}>
        {children}
      </table>
    </div>;
};

export const CustomDivider = ({color = "var(--lp-color-border-default)", middleText = "", spacing = "default", style = {}, className = "", ...rest}) => {
  const spacingPresets = {
    default: {
      margin: "24px 0"
    },
    overlap: {
      margin: "-1rem 0 -1rem 0"
    },
    tight: {
      margin: "0 0 -1rem 0"
    },
    section: {
      margin: "0 0 -2rem 0"
    },
    sectionOverlap: {
      margin: "-1rem 0 -2rem 0"
    },
    deepOverlap: {
      margin: "-1rem 0 -1.5rem 0"
    }
  };
  const spacingStyle = spacingPresets[spacing] || spacingPresets.default;
  return <div role="separator" aria-orientation="horizontal" className={className} style={{
    display: "flex",
    alignItems: "center",
    ...spacingStyle,
    fontSize: style?.fontSize || "16px",
    height: "fit-content",
    ...style
  }} {...rest}>
      <span style={{
    marginRight: "var(--lp-spacing-px-8)",
    opacity: 0.2
  }}>
        <Icon icon="/snippets/assets/logos/Livepeer-Logo-Symbol-Theme.svg" />
      </span>
      <div style={{
    flex: 1,
    height: "1px",
    background: "var(--lp-color-border-default)",
    opacity: 0.4
  }}></div>
      {middleText && <>
          <Icon icon="circle" size={2} />
          <span style={{
    margin: "0 8px",
    fontWeight: "bold",
    color: color,
    opacity: 0.7
  }}>
            {middleText}
          </span>
          <Icon icon="circle" size={2} />
        </>}
      <div style={{
    flex: 1,
    height: "1px",
    background: "var(--lp-color-border-default)",
    opacity: 0.4
  }}></div>
      <span style={{
    marginLeft: "var(--lp-spacing-px-8)",
    opacity: 0.2
  }}>
        <span style={{
    display: "inline-block",
    transform: "scaleX(-1)"
  }}>
          <Icon icon="/snippets/assets/logos/Livepeer-Logo-Symbol-Theme.svg" />
        </span>
      </span>
    </div>;
};

<Tip>
  Your Hugging Face model already fits one of the ten built-in Livepeer pipelines. You declare it, pre-download
  the weights, restart the Orchestrator with the AI flags, and verify through your own self-hosted Gateway. No
  Studio. No Daydream. No code written.
</Tip>

***

By the end of this tutorial, a Hugging Face model is running on your Livepeer Orchestrator, advertised to the
network, and callable through a Gateway you operate. The example model is
`SG161222/RealVisXL_V4.0_Lightning`, served through the `text-to-image` pipeline.

**What you will verify:**

* `aiModels.json` parses cleanly at Orchestrator startup
* The runner container loads the model into VRAM
* The model is advertised on `tools.livepeer.cloud/ai/network-capabilities`
* A request through your self-hosted Gateway returns a successful inference result

<CustomDivider />

## Scope and intent

This is the simplest path: your model conforms to one of the ten pipeline shapes the Livepeer AI worker
supports out of the box. The runner does the model loading, inference, and response formatting. You only
declare the model and the price.

This is the right tutorial if your model is, for example, an SDXL fine-tune, a BLIP variant, or a Whisper
variant. It is not the right tutorial if:

* your model needs custom Python code (preprocessing, postprocessing, novel architecture, or non-standard
  input or output shape). See the custom pipeline path.
* your model ships as an arbitrary container with its own protocol. See the BYOC path.
* your model is an LLM you want to run via Ollama instead of the standard `livepeer/ai-runner` image. The
  same overall flow applies but the runner image and `aiModels.json` entry differ. See the LLM variant note
  at the end.

<CustomDivider />

## Built-in pipelines

The Livepeer AI worker ships with a fixed set of pipeline implementations under
[`livepeer/ai-worker/runner/src/runner/pipelines/`](https://github.com/livepeer/ai-worker/tree/main/runner/src/runner/pipelines).
Each file defines the input schema, the output schema, and the model-loading conventions for one class of
inference task.

<StyledTable variant="bordered">
  <thead>
    <TableRow header>
      <TableCell header>Pipeline</TableCell>
      <TableCell header>Input</TableCell>
      <TableCell header>Output</TableCell>
      <TableCell header>Typical model class</TableCell>
    </TableRow>
  </thead>

  <tbody>
    <TableRow>
      <TableCell>`text-to-image`</TableCell>
      <TableCell>Text prompt + sampling params</TableCell>
      <TableCell>Image</TableCell>
      <TableCell>SDXL, SD 1.5, Lightning variants</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`image-to-image`</TableCell>
      <TableCell>Image + prompt + params</TableCell>
      <TableCell>Image</TableCell>
      <TableCell>SDXL img2img, ControlNet wrappers</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`image-to-video`</TableCell>
      <TableCell>Image + params</TableCell>
      <TableCell>Short video</TableCell>
      <TableCell>Stable Video Diffusion class</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`image-to-text`</TableCell>
      <TableCell>Image</TableCell>
      <TableCell>Caption text</TableCell>
      <TableCell>BLIP, captioning VLMs</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`audio-to-text`</TableCell>
      <TableCell>Audio bytes</TableCell>
      <TableCell>Transcript text</TableCell>
      <TableCell>Whisper variants</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`text-to-speech`</TableCell>
      <TableCell>Text + voice params</TableCell>
      <TableCell>Audio bytes</TableCell>
      <TableCell>TTS models (text in, audio out)</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`upscale`</TableCell>
      <TableCell>Image</TableCell>
      <TableCell>Higher-resolution image</TableCell>
      <TableCell>Diffusion upscalers</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`segment-anything-2`</TableCell>
      <TableCell>Image + prompt mask</TableCell>
      <TableCell>Segmentation mask</TableCell>
      <TableCell>SAM2 variants</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`llm`</TableCell>
      <TableCell>Chat messages</TableCell>
      <TableCell>Completion</TableCell>
      <TableCell>Ollama-supported LLMs</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`live-video-to-video`</TableCell>
      <TableCell>WebRTC stream</TableCell>
      <TableCell>WebRTC stream</TableCell>
      <TableCell>Real-time pipelines via ComfyStream</TableCell>
    </TableRow>
  </tbody>
</StyledTable>

If your model fits the input and output shape of one of these, follow this tutorial. If not, the model needs
either a custom pipeline or a BYOC container.
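
To make the fit check concrete, the sketch below encodes the table as a lookup and rejects any name that is not one of the ten built-in pipelines. The dictionary and the `describe_pipeline` helper are illustrative only and are not part of `go-livepeer` or the runner.

```python
# Pipeline names and I/O shapes copied from the table above.
BUILT_IN_PIPELINES = {
    "text-to-image": ("text prompt + sampling params", "image"),
    "image-to-image": ("image + prompt + params", "image"),
    "image-to-video": ("image + params", "short video"),
    "image-to-text": ("image", "caption text"),
    "audio-to-text": ("audio bytes", "transcript text"),
    "text-to-speech": ("text + voice params", "audio bytes"),
    "upscale": ("image", "higher-resolution image"),
    "segment-anything-2": ("image + prompt mask", "segmentation mask"),
    "llm": ("chat messages", "completion"),
    "live-video-to-video": ("WebRTC stream", "WebRTC stream"),
}


def describe_pipeline(name: str) -> str:
    """Return a one-line I/O summary, or raise if the name is not built in."""
    if name not in BUILT_IN_PIPELINES:
        raise ValueError(f"{name!r} is not a built-in pipeline; consider a custom pipeline or BYOC")
    inp, out = BUILT_IN_PIPELINES[name]
    return f"{name}: {inp} -> {out}"


if __name__ == "__main__":
    print(describe_pipeline("text-to-image"))
```

Names are matched exactly, so an underscore form such as `text_to_image` is rejected.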

<CustomDivider />

## Prerequisites

Every requirement below is mandatory. Stop here if any is not in place.

<StyledTable variant="bordered">
  <thead>
    <TableRow header>
      <TableCell header>Requirement</TableCell>
      <TableCell header>Notes</TableCell>
    </TableRow>
  </thead>

  <tbody>
    <TableRow>
      <TableCell>Active Orchestrator on Arbitrum One</TableCell>
      <TableCell>Registered on-chain with a reachable `serviceAddr`, in the Active Set. Verify on `explorer.livepeer.org`.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>NVIDIA GPU with 24 GB VRAM minimum</TableCell>
      <TableCell>RealVisXL is an SDXL fine-tune. SDXL inference at fp16 needs roughly 12 GB for the UNet; 24 GB is the sensible floor with VAE, scheduler state, and warm-load headroom.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>Docker with NVIDIA Container Toolkit</TableCell>
      <TableCell>The AI worker runs each pipeline in a container with GPU passthrough. Verify: `docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi`.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`go-livepeer` build with AI worker mode</TableCell>
      <TableCell>Built from `master` or a release containing `-aiWorker`, `-aiModels`, and `-aiModelsDir` flags.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>Disk for model weights</TableCell>
      <TableCell>Fast disk, at least 50 GB free.</TableCell>
    </TableRow>
  </tbody>
</StyledTable>

<CustomDivider />

## Step 1: Choose the model directory

Pick a host path for model weights. The AI worker mounts this path into the runner container at `/models`.

```bash icon="terminal" title="export-model-dir.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
export LP_AI_MODELS_DIR=/data/livepeer-ai-models
mkdir -p "$LP_AI_MODELS_DIR"
```

This is the path you pass to `go-livepeer` via `-aiModelsDir`. The runner reads weights from `/models` inside
the container, which maps to this directory on the host.
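
Before downloading anything, it can help to confirm the directory exists and has room for the weights. The sketch below is one way to do that, using the 50 GB figure from the prerequisites as the default threshold. The helper names are hypothetical, and the check measures GiB, which is slightly stricter than decimal GB.

```python
import os
import shutil
from pathlib import Path

GIB = 1024 ** 3


def free_gib(path: str) -> float:
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def check_models_dir(path: str, min_free_gib: float = 50.0) -> bool:
    """True if `path` is an existing directory with at least `min_free_gib` free."""
    return Path(path).is_dir() and free_gib(path) >= min_free_gib


if __name__ == "__main__":
    models_dir = os.environ.get("LP_AI_MODELS_DIR", "/data/livepeer-ai-models")
    status = "ok" if check_models_dir(models_dir) else "missing or low on space"
    print(models_dir, status)
```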

<CustomDivider />

## Step 2: Declare the model in aiModels.json

Create an `aiModels.json` file. The Orchestrator parses this file at startup and advertises every pipeline it
lists.

```json icon="code" title="aiModels.json" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
[
  {
    "pipeline": "text-to-image",
    "model_id": "SG161222/RealVisXL_V4.0_Lightning",
    "price_per_unit": 4768371,
    "pixels_per_unit": 1,
    "currency": "wei",
    "warm": true
  }
]
```
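
A structural check before restarting the Orchestrator can catch typos early. The sketch below validates only the six fields used in this tutorial and assumes the types shown in the example above. It is not the parser `go-livepeer` uses, which may accept additional fields or other value forms.

```python
import json

# Field names and types as they appear in the example entry above (an assumption,
# not the authoritative go-livepeer schema).
REQUIRED_FIELDS = {
    "pipeline": str,
    "model_id": str,
    "price_per_unit": int,
    "pixels_per_unit": int,
    "currency": str,
    "warm": bool,
}


def validate_ai_models(text: str) -> list[str]:
    """Return a list of problems found in an aiModels.json document (empty if none)."""
    try:
        entries = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(entries, list):
        return ["top level must be a JSON array of model entries"]
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: must be an object")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            value = entry.get(field)
            # bool is a subclass of int in Python, so reject it for integer fields.
            wrong_type = not isinstance(value, expected)
            bool_as_int = expected is int and isinstance(value, bool)
            if wrong_type or bool_as_int:
                problems.append(f"entry {i}: {field!r} should be {expected.__name__}")
    return problems
```

Call it with the file contents, for example `validate_ai_models(open("aiModels.json").read())`. An empty list means the file has the shape this tutorial expects.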

Each field, grounded in the schema parsed by `go-livepeer`:

<StyledTable variant="bordered">
  <thead>
    <TableRow header>
      <TableCell header>Field</TableCell>
      <TableCell header>Definition</TableCell>
    </TableRow>
  </thead>

  <tbody>
    <TableRow>
      <TableCell>`pipeline`</TableCell>
      <TableCell>One of the canonical pipeline names (hyphenated form). Source: the pipeline-to-image maps in [`livepeer/go-livepeer/ai/worker/docker.go`](https://github.com/livepeer/go-livepeer/blob/master/ai/worker/docker.go).</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`model_id`</TableCell>
      <TableCell>The Hugging Face repository identifier as it appears in the URL `huggingface.co/<org>/<repo>`. Used by the runner as both the download target and the inference-routing key.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`price_per_unit` / `pixels_per_unit`</TableCell>
      <TableCell>Together set the rate. For pixel-priced pipelines, the rate is `price_per_unit / pixels_per_unit` wei per pixel. The wei figure is illustrative; set yours by comparing live rates on `tools.livepeer.cloud/ai/network-capabilities`.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`currency`</TableCell>
      <TableCell>`"wei"`. Settlement uses Arbitrum-native ETH denominated in wei.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`warm`</TableCell>
      <TableCell>`true` keeps the model in VRAM continuously, eliminating cold-start latency. `false` lazy-loads on first request, adding tens of seconds to the first job for SDXL-class models. Orchestrators competing on latency advertise warm models.</TableCell>
    </TableRow>
  </tbody>
</StyledTable>
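
As a worked example of the rate formula, the sketch below computes the cost of one output image. It assumes a job is billed on output pixels (width times height), which is an assumption of this example and not something stated above. Under that assumption the example rate works out to 4,999,999,389,696 wei, just under 0.000005 ETH, for one 1024 by 1024 image.

```python
from fractions import Fraction

WEI_PER_ETH = 10 ** 18


def wei_per_pixel(price_per_unit: int, pixels_per_unit: int) -> Fraction:
    """Exact rate in wei per pixel: price_per_unit / pixels_per_unit."""
    return Fraction(price_per_unit, pixels_per_unit)


def image_cost_wei(price_per_unit: int, pixels_per_unit: int, width: int, height: int) -> Fraction:
    """Cost of one width x height output at the configured rate."""
    return wei_per_pixel(price_per_unit, pixels_per_unit) * width * height


if __name__ == "__main__":
    cost = image_cost_wei(4768371, 1, 1024, 1024)
    print(f"{cost} wei = {float(cost) / WEI_PER_ETH:.9f} ETH per 1024x1024 image")
```

`Fraction` keeps the result exact when `pixels_per_unit` is larger than 1 and the per-pixel rate is a fraction of a wei.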

<CustomDivider />

## Step 3: Pre-download the model weights

The model needs to land on disk before the runner starts. Otherwise warm load fails and lazy load stalls the
first request.

The canonical script in `livepeer/ai-worker` is
[`runner/dl_checkpoints.sh`](https://github.com/livepeer/ai-worker/blob/main/runner/dl_checkpoints.sh). It
reads pipeline names from environment variables, calls `huggingface_hub.snapshot_download` for each model, and
places weights at `$MODEL_DIR/<model_id>/`.

```bash icon="terminal" title="download-weights.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
git clone https://github.com/livepeer/ai-worker.git
cd ai-worker

docker run --rm \
  -v "$LP_AI_MODELS_DIR:/models" \
  -v "$(pwd)/runner:/runner" \
  -e MODEL_DIR=/models \
  -e PIPELINE=text-to-image \
  -e MODEL_ID=SG161222/RealVisXL_V4.0_Lightning \
  livepeer/ai-runner:latest \
  bash /runner/dl_checkpoints.sh
```

The command:

1. Mounts your host model directory at `/models` inside the container
2. Mounts the `runner/` directory so the script and helpers are available
3. Sets `MODEL_DIR=/models` so the script knows where to write
4. Sets `PIPELINE` and `MODEL_ID` so the script knows what to fetch
5. Runs the script, which uses `huggingface_hub` (already installed in the runner image) to pull the weights

Verify the download:

```bash icon="terminal" title="verify-weights.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
ls -la "$LP_AI_MODELS_DIR/SG161222/RealVisXL_V4.0_Lightning/"
```

Expect SDXL's standard layout: `model_index.json`, `unet/`, `vae/`, `text_encoder/`, `text_encoder_2/`,
`tokenizer/`, `tokenizer_2/`, `scheduler/`. If the directory is empty or partial, re-run the command.
`huggingface_hub` resumes partial downloads.
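
The layout check can also be scripted. The sketch below reports which of the entries named above are absent from a model directory. It assumes the weights sit directly under that directory, as described in this step, and the helper name is hypothetical.

```python
from pathlib import Path

# Entries named in the paragraph above for an SDXL-style repository.
EXPECTED_SDXL_ENTRIES = [
    "model_index.json", "unet", "vae", "text_encoder", "text_encoder_2",
    "tokenizer", "tokenizer_2", "scheduler",
]


def missing_entries(model_dir: str, expected=EXPECTED_SDXL_ENTRIES) -> list[str]:
    """Return the expected files or directories that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).exists()]
```

An empty list means every expected entry is present. It does not prove that each file downloaded completely.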

<CustomDivider />

## Step 4: Start the Orchestrator with the new model

Stop your existing Orchestrator and restart it with the AI flags. The `go-livepeer` repository builds a binary
named `livepeer`, and `-network arbitrum-one-mainnet` selects Arbitrum One:

```bash icon="terminal" title="start-orchestrator.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
livepeer \
  -network arbitrum-one-mainnet \
  -orchestrator \
  -transcoder \
  -nvidia all \
  -aiWorker \
  -aiModels /path/to/aiModels.json \
  -aiModelsDir "$LP_AI_MODELS_DIR" \
  -ethUrl <your-arbitrum-rpc> \
  -serviceAddr <your-public-host>:<port> \
  -pricePerUnit 0
```

The relevant flags, defined in
[`livepeer/go-livepeer/cmd/livepeer/livepeer.go`](https://github.com/livepeer/go-livepeer/blob/master/cmd/livepeer/livepeer.go):

<StyledTable variant="bordered">
  <thead>
    <TableRow header>
      <TableCell header>Flag</TableCell>
      <TableCell header>Purpose</TableCell>
    </TableRow>
  </thead>

  <tbody>
    <TableRow>
      <TableCell>`-aiWorker`</TableCell>
      <TableCell>Declares this node serves AI inference jobs. Without this flag, even a perfectly configured `aiModels.json` is ignored.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`-aiModels`</TableCell>
      <TableCell>Path to your `aiModels.json` file.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`-aiModelsDir`</TableCell>
      <TableCell>The host directory you populated in Step 3. Mounted into runner containers at `/models`.</TableCell>
    </TableRow>

    <TableRow>
      <TableCell>`-nvidia all`</TableCell>
      <TableCell>GPU exposure for both transcoding and AI workers. Use a GPU index (for example `-nvidia 0`) to pin AI to a specific card.</TableCell>
    </TableRow>
  </tbody>
</StyledTable>

At startup, `go-livepeer`:

1. Parses `aiModels.json`
2. For each entry with `warm: true`, looks up the runner image from the pipeline-to-image map in `livepeer/go-livepeer/ai/worker/docker.go`, pulls it if absent, and starts a container
3. Mounts `$LP_AI_MODELS_DIR` into the container at `/models`
4. Waits for the runner's `/health` endpoint to report ready
5. Begins advertising the pipeline plus model plus price as a capability

Watch the logs. A successful warm load looks like a runner-container start, a model-load log line, and a
"capability advertised" or equivalent message. Source for the runner's health and readiness contract:
[`livepeer/ai-worker/runner/src/runner/main.py`](https://github.com/livepeer/ai-worker/blob/main/runner/src/runner/main.py)
(FastAPI app definition).
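
If you prefer to script the wait, the sketch below polls a health URL until it answers. It treats any HTTP 200 as ready, which is an assumption of this example. The runner's actual response body and port are not specified here, so pass the full URL for your setup.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # runner not listening yet, or still loading the model
        time.sleep(interval_s)
    return False
```

The function returns `False` on timeout instead of raising, so a calling script can decide whether to inspect `docker logs` next.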

<CustomDivider />

## Step 5: Verify on the network capabilities tool

Open [`tools.livepeer.cloud/ai/network-capabilities`](https://tools.livepeer.cloud/ai/network-capabilities) in
a browser. This dashboard reads live capability advertisements from active Orchestrators on the network.

Find your Orchestrator address. You should see:

* the `text-to-image` pipeline listed under your Orchestrator
* `SG161222/RealVisXL_V4.0_Lightning` listed under that pipeline
* a warm indicator, if the dashboard surfaces it

If your Orchestrator is not in the list, the model is not visible to the network. The three usual causes:

<AccordionGroup>
  <Accordion title="Orchestrator not in the active set" icon="circle-exclamation">
    Confirm on [`explorer.livepeer.org`](https://explorer.livepeer.org) that your address shows as active.
    Capability advertisement requires on-chain registration with sufficient stake.
  </Accordion>

  <Accordion title="Runner container failed to start" icon="docker">
    Check `docker ps -a` for an exited container, then `docker logs <container-id>` for the failure reason.
    The most common is CUDA out-of-memory at warm load.
  </Accordion>

  <Accordion title="AI worker not enabled or aiModels.json did not parse" icon="code">
    Either `go-livepeer` was started without `-aiWorker`, or `aiModels.json` did not parse. Check the
    Orchestrator startup logs for parse errors.
  </Accordion>
</AccordionGroup>

Resolve any of these before continuing.

<CustomDivider />

## Step 6: Send a test inference request

Two paths verify the model end-to-end without touching Studio or Daydream. Use both in order: localhost first,
Gateway second.

### Step 6a: Hit the runner directly on localhost

The runner is a FastAPI service. Source:
[`livepeer/ai-worker/runner/src/runner/main.py`](https://github.com/livepeer/ai-worker/blob/main/runner/src/runner/main.py).
The Orchestrator runs it on a port internal to the host (printed in startup logs as the AI worker port).

```bash icon="terminal" title="runner-direct.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
curl -X POST http://localhost:<runner-port>/text-to-image \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "SG161222/RealVisXL_V4.0_Lightning",
    "prompt": "a quiet harbour at dawn, photo realistic",
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 4,
    "guidance_scale": 2.0
  }' \
  --output result.json
```

The four-step inference and low guidance scale follow the SDXL Lightning recommendations on the model card at
[`huggingface.co/SG161222/RealVisXL_V4.0_Lightning`](https://huggingface.co/SG161222/RealVisXL_V4.0_Lightning).

A successful response is a JSON object with an `images` array. Each image is base64-encoded or referenced by
URL depending on runner version. Print the first 200 characters of the first image entry to see which form you received:

```bash icon="terminal" title="inspect-output.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
jq -r '.images[0].url // .images[0]' result.json | head -c 200
```
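
To turn the response into image bytes, a small script can handle both forms. The sketch below is a minimal example under two assumptions: a base64 payload may arrive either bare or as a `data:` URL, and an `http(s)` URL must be fetched separately. The function names are hypothetical.

```python
import base64


def first_image_payload(response: dict) -> str:
    """Return the first image's `url` field, or the image itself if it is a plain string."""
    image = response["images"][0]
    return image["url"] if isinstance(image, dict) else image


def decode_if_base64(payload: str):
    """Return image bytes for base64 or data-URL payloads, or None for a remote URL."""
    if payload.startswith(("http://", "https://")):
        return None  # remote reference: download it separately
    if payload.startswith("data:"):
        payload = payload.split(",", 1)[1]  # drop the "data:<mime>;base64," prefix
    return base64.b64decode(payload)
```

Load `result.json` with `json.load`, pass the dictionary through both functions, and write the returned bytes to a file if the result is not `None`.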

This step confirms the model is loaded and inference works. It does not confirm that the model is reachable
through the Livepeer Network. That is Step 6b.

### Step 6b: Self-hosted Gateway test

The same `livepeer` binary built from `go-livepeer` runs as a Gateway when started with `-gateway`. Start it as a
separate process or on a separate machine:

```bash icon="terminal" title="start-gateway.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
livepeer \
  -network arbitrum-one-mainnet \
  -gateway \
  -httpAddr 0.0.0.0:8935 \
  -orchAddr <your-orch-host>:<port> \
  -ethUrl <your-arbitrum-rpc>
```

The `-orchAddr` flag pins discovery to your own Orchestrator, removing the variability of network-wide
selection. This is what makes the test deterministic: the Gateway can only route to your node.

Then send the inference request to the Gateway:

```bash icon="terminal" title="gateway-request.sh" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
curl -X POST http://localhost:8935/text-to-image \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "SG161222/RealVisXL_V4.0_Lightning",
    "prompt": "a quiet harbour at dawn, photo realistic",
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 4,
    "guidance_scale": 2.0
  }' \
  --output gateway-result.json
```

The Gateway handles discovery, capability matching, ticket-based payment, and routing to the Orchestrator.
The response carries the inference output; payment is handled between the Gateway and the Orchestrator through probabilistic micropayment tickets, not in the response body.
A successful response means your model is reachable across the protocol layer through your own infrastructure.
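
The same request can be built from a script, which makes it easy to point one payload at either the runner or the Gateway. The sketch below mirrors the curl examples using only the standard library. `build_request` and `send` are hypothetical helper names, and the 120-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Request body copied from the curl examples above.
PAYLOAD = {
    "model_id": "SG161222/RealVisXL_V4.0_Lightning",
    "prompt": "a quiet harbour at dawn, photo realistic",
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 4,
    "guidance_scale": 2.0,
}


def build_request(base_url: str, pipeline: str = "text-to-image",
                  payload: dict = PAYLOAD) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to <base_url>/<pipeline>."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{pipeline}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 120.0) -> dict:
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read())
```

For example, `send(build_request("http://localhost:8935"))` targets the Gateway started above.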

<Note>
  The Livepeer Cloud Community Gateway is a free public Gateway maintained by the Cloud SPE (Titan Node).
  Sending a request to it tests routing from outside your own infrastructure. The downside is non-determinism:
  it selects an Orchestrator from the Active Set and may not select yours. Use it only as a cross-check after
  Step 6b succeeds, never as the primary verification.
</Note>

<CustomDivider />

## Step 7: Confirm the loop is closed

The tutorial is complete when all four are observable:

1. `aiModels.json` declares the model and `go-livepeer` parsed it cleanly at startup (Orchestrator logs)
2. The runner container is running and the model is loaded into VRAM (`docker ps`, `nvidia-smi`)
3. The Orchestrator advertises the model on `tools.livepeer.cloud/ai/network-capabilities`
4. A request through your self-hosted Gateway returns a successful inference result

If any one of these is missing, the model is not yet on the network. Resolve before relying on the path for
paid traffic.

<CustomDivider />

## Operational notes

<AccordionGroup>
  <Accordion title="Pricing" icon="coins">
    Setting your price per pixel above what Gateways are willing to pay means your Orchestrator receives few
    or no jobs, because Gateway selection in `go-livepeer` takes price into account. Compare against the rates
    visible on the network capabilities dashboard before going live.
  </Accordion>

  <Accordion title="Warm and cold trade-off" icon="temperature-half">
    `warm: true` holds the model in VRAM continuously. SDXL-class models occupy roughly 12 GB; on a 24 GB card
    you can warm one SDXL model plus, perhaps, a smaller pipeline such as `image-to-text` (about 4 GB minimum
    for `Salesforce/blip-image-captioning-large`), but not two SDXL variants. Cold models (`warm: false`) are
    loaded into VRAM on first request; price them lower because the cold-start latency makes them less
    attractive to Gateways.
  </Accordion>

  <Accordion title="Same flow, different model" icon="arrows-rotate">
    Replace the `model_id` in `aiModels.json` and the `MODEL_ID` in the download command with your chosen
    model. The pipeline name stays the same as long as the model fits the same I/O shape. For example,
    swapping `SG161222/RealVisXL_V4.0_Lightning` for `ByteDance/SDXL-Lightning` (also a `text-to-image` model)
    requires no other changes.
  </Accordion>
</AccordionGroup>
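
The warm-model budget in the notes above can be checked with simple arithmetic. The sketch below uses the rough footprints quoted there (12 GB for an SDXL-class model, 4 GB for the BLIP captioner). The 2 GB headroom default is an assumption of this example, not a measured figure.

```python
def fits_in_vram(card_gb: float, model_footprints_gb: list[float],
                 headroom_gb: float = 2.0) -> bool:
    """True if the warm models plus headroom fit on the card."""
    return sum(model_footprints_gb) + headroom_gb <= card_gb


if __name__ == "__main__":
    print("SDXL + BLIP on 24 GB:", fits_in_vram(24, [12, 4]))
    print("two SDXL on 24 GB:", fits_in_vram(24, [12, 12]))
```

Measure actual usage with `nvidia-smi` after a warm load and replace the estimates with your own numbers.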

<CustomDivider />

## LLM variant via Ollama

LLM models follow the same overall flow but use a different runner image. The Cloud SPE maintains
[`tztcloud/livepeer-ollama-runner`](https://hub.docker.com/r/tztcloud/livepeer-ollama-runner), which wraps
Ollama for OpenAI-compatible completions.

The `aiModels.json` entry for an LLM:

```json icon="code" title="aiModels-llm.json" theme={"theme":{"light":"github-light","dark":"dark-plus"}}
{
  "pipeline": "llm",
  "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "price_per_unit": 1,
  "pixels_per_unit": 1000000,
  "currency": "wei",
  "warm": true
}
```

The `model_id` here is the Hugging Face repository name and serves as a label; the weights themselves are pulled
by Ollama tag (`ollama pull llama3.1:8b`) inside the Ollama runner container. The mapping
between HF identifier and Ollama tag for each LLM is the only piece that does not generalise from the
standard runner. Reference: the Ollama tag library at [`ollama.com/library`](https://ollama.com/library).
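
One way to keep that mapping explicit is a small lookup table that you maintain by hand. The sketch below contains only the pairing named above; any further entries are yours to add after checking the Ollama library. The helper name is hypothetical.

```python
# Only the pairing named in the text is listed; extend it for your own models.
HF_TO_OLLAMA_TAG = {
    "meta-llama/Meta-Llama-3.1-8B-Instruct": "llama3.1:8b",
}


def ollama_pull_command(hf_model_id: str) -> list[str]:
    """Return the `ollama pull` argv for a known Hugging Face model id."""
    tag = HF_TO_OLLAMA_TAG.get(hf_model_id)
    if tag is None:
        raise KeyError(f"no Ollama tag recorded for {hf_model_id!r}")
    return ["ollama", "pull", tag]
```

The function fails loudly on an unknown identifier, which avoids pulling a model that does not match the advertised `model_id`.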

Otherwise the pattern is identical: declare in `aiModels.json`, ensure the runner image is available, restart
`go-livepeer`, verify on the capabilities tool, test through your self-hosted Gateway with an
OpenAI-compatible chat completion request.

<CustomDivider />

## Troubleshooting

<AccordionGroup>
  <Accordion title="Runner container exits immediately" icon="bug">
    Run `docker logs <container-id>`. Three common causes: model files missing or partial (re-run Step 3);
    CUDA out-of-memory at load (insufficient VRAM, downgrade to `warm: false` or pick a smaller variant);
    image pull failed (check Docker Hub connectivity).
  </Accordion>

  <Accordion title="Orchestrator absent from capabilities tool but runner loaded" icon="signal">
    Check [`explorer.livepeer.org`](https://explorer.livepeer.org) that your Orchestrator is in the active
    set. Capability advertisement requires on-chain registration with sufficient stake.
  </Accordion>

  <Accordion title="Localhost works but gateway fails" icon="globe">
    Confirm `serviceAddr` is reachable from outside your network. Open the relevant port at the firewall,
    confirm DNS, and confirm the Orchestrator is binding to a public interface instead of `localhost`.
  </Accordion>

  <Accordion title="Inference returns low-quality output" icon="image">
    Check that you are using the SDXL Lightning recommended sampling (4 steps, low guidance). Different SDXL
    fine-tunes have different recommended schedulers and step counts. Consult the model card.
  </Accordion>
</AccordionGroup>

<CustomDivider />

## Sources

Every claim in this tutorial is grounded in one of the following readable references:

<AccordionGroup>
  <Accordion title="Livepeer source" icon="github">
    * [`github.com/livepeer/ai-worker`](https://github.com/livepeer/ai-worker) – runner architecture, pipeline implementations, `dl_checkpoints.sh`
    * [`livepeer/ai-worker/runner/src/runner/pipelines`](https://github.com/livepeer/ai-worker/tree/main/runner/src/runner/pipelines) – supported pipeline list and their I/O shapes
    * [`livepeer/ai-worker/runner/dl_checkpoints.sh`](https://github.com/livepeer/ai-worker/blob/main/runner/dl_checkpoints.sh) – model download script, environment variables, HF integration
    * [`livepeer/ai-worker/runner/src/runner/main.py`](https://github.com/livepeer/ai-worker/blob/main/runner/src/runner/main.py) – FastAPI app, `/health` endpoint, port binding
    * [`github.com/livepeer/go-livepeer`](https://github.com/livepeer/go-livepeer) – Orchestrator, Gateway, AI worker mode
    * [`livepeer/go-livepeer/cmd/livepeer/livepeer.go`](https://github.com/livepeer/go-livepeer/blob/master/cmd/livepeer/livepeer.go) – flag definitions for `-aiWorker`, `-aiModels`, `-aiModelsDir`, `-gateway`, `-orchAddr`, `-serviceAddr`
    * [`livepeer/go-livepeer/ai/worker/docker.go`](https://github.com/livepeer/go-livepeer/blob/master/ai/worker/docker.go) – pipeline-to-image map keyed on canonical pipeline name strings
  </Accordion>

  <Accordion title="External source" icon="link">
    * [`huggingface_hub`](https://github.com/huggingface/huggingface_hub) – `snapshot_download` semantics
    * [`huggingface.co/SG161222/RealVisXL_V4.0_Lightning`](https://huggingface.co/SG161222/RealVisXL_V4.0_Lightning) – model card, recommended sampling
    * [`hub.docker.com/r/livepeer/ai-runner`](https://hub.docker.com/r/livepeer/ai-runner) – runner image, tags
    * [`hub.docker.com/r/tztcloud/livepeer-ollama-runner`](https://hub.docker.com/r/tztcloud/livepeer-ollama-runner) – Ollama-based LLM runner
    * [`tools.livepeer.cloud/ai/network-capabilities`](https://tools.livepeer.cloud/ai/network-capabilities) – live capability dashboard
    * [`explorer.livepeer.org`](https://explorer.livepeer.org) – Orchestrator active-set status
  </Accordion>
</AccordionGroup>

<CustomDivider />

Your model is now running on the Livepeer Network, advertised to Gateways, and callable through your self-hosted Gateway. For custom architectures that do not fit a native pipeline, see the [advanced paths](/v2/developers/build/tutorials/huggingface-to-livepeer-advanced).

## AI agent prompt

```text theme={"theme":{"light":"github-light","dark":"dark-plus"}}
Complete the "Add a Hugging Face Model to Livepeer" tutorial for a model that fits an existing Livepeer AI pipeline. Use placeholders for MODEL_ID=<huggingface org/repo>, PIPELINE=<canonical pipeline name>, LP_AI_MODELS_DIR=/data/livepeer-ai-models, ORCH_SERVICE_ADDR=<orchestrator service address>, ORCH_ETH_ADDR=<orchestrator ETH address>, GATEWAY_PORT=8935, and ORCH_ADDR=<orchestrator address>. Clone livepeer/ai-worker only for the checkpoint script, use livepeer/ai-runner images, write aiModels.json, pre-download weights, start go-livepeer with -aiWorker -aiModels -aiModelsDir, verify the runner container and tools.livepeer.cloud capability listing, then start a self-hosted go-livepeer -gateway pinned to the orchestrator and send a test inference request. Do not use Studio or Daydream.
```

<CustomDivider />

## Related pages

<CardGroup cols={2}>
  <Card title="Advanced HuggingFace paths" icon="code-branch" href="/v2/developers/build/tutorials/huggingface-to-livepeer-advanced" arrow horizontal>
    Three structurally different paths: existing pipeline, custom pipeline, BYOC.
  </Card>

  <Card title="Full AI Pipeline Tutorial" icon="diagram-project" href="/v2/orchestrators/guides/tutorials/full-ai-pipeline-tutorial" arrow horizontal>
    Local end-to-end pipeline: Gateway routes inference to Orchestrator and the result returns through the full pipeline.
  </Card>

  <Card title="Realtime AI Tutorial" icon="video" href="/v2/orchestrators/guides/tutorials/realtime-ai-tutorial" arrow horizontal>
    Live video-to-video pipeline: continuous WebRTC stream in, transformed stream out.
  </Card>

  <Card title="ComfyStream Quickstart" icon="bolt" href="/v2/developers/build/ai-and-agents/realtime-ai/comfystream/comfystream-quickstart" arrow horizontal>
    Stand up a ComfyStream pipeline for real-time AI workloads.
  </Card>
</CardGroup>
