Providers and Inference Types
Updated: Dec 11, 2025
- Understand how Providers define where AI inference runs (Cloud, Local,
or On-Device).
- Learn how to configure and authenticate Cloud Providers like OpenAI, Llama
API, Hugging Face, and ElevenLabs.
- Connect Local Providers such as Ollama for LAN-based, private inference.
- Set up On-Device Providers with the Unity Inference Engine for offline
on-device execution.
- Learn about the Provider Installation Routine and
RemoteProviderProfileRegistry to import Providers during setup.
AI Building Blocks separate what you run (the Agent) from where it
runs (the Provider). A Provider is a Unity ScriptableObject that
performs inference through a Cloud, Local Server, or On-Device
backend using the Unity Inference Engine.
A Provider encapsulates provider-specific input/output handling and
formatting so the Agent can focus on core logic. Agents should therefore be
Provider-agnostic and work with any Provider that supports the same task
through a task interface, for example IObjectDetectionTask.
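To make this concrete, here is a hedged sketch of how a task interface can decouple an Agent from any particular Provider. Only IObjectDetectionTask is named on this page; the result type, method signature, and agent class below are hypothetical and will differ from the actual SDK.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using UnityEngine;

// Hypothetical task interface: any Provider implementing it can serve the Agent.
public interface IObjectDetectionTask
{
    Task<IReadOnlyList<DetectedObject>> DetectAsync(Texture2D image);
}

// Illustrative result type (not from the SDK).
public struct DetectedObject
{
    public string Label;
    public float Confidence;
    public Rect BoundingBox;
}

// The Agent depends only on the interface, so Cloud, Local, and
// On-Device Providers are interchangeable at configuration time.
public class DetectionAgent : MonoBehaviour
{
    [SerializeField] private ScriptableObject providerAsset; // assign a Provider asset in the Inspector

    public async Task RunAsync(Texture2D frame)
    {
        if (providerAsset is IObjectDetectionTask task)
        {
            var results = await task.DetectAsync(frame);
            foreach (var d in results)
                Debug.Log($"{d.Label} ({d.Confidence:P0}) at {d.BoundingBox}");
        }
    }
}
```

Because the Agent only sees the interface, swapping a Cloud Provider for an On-Device one is a configuration change, not a code change.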
| Inference Type | Description | Typical Use |
|---|---|---|
| Cloud | Sends text, audio, or image payloads to a hosted model over HTTPS and returns results. | Fastest way to prototype using the latest models (LLMs, TTS/STT, DETR). |
| Local | Communicates with a model running on a local machine on the same Wi-Fi network (for example, Ollama). | Low latency, private demos, exhibitions. |
| On-Device | Runs the model directly on the headset via the Unity Inference Engine. | Lowest latency, full privacy, and no network dependency. Significant performance impact. |
Cloud Providers are the easiest way to get started: Just create a Provider
asset, paste your API key, and enter the endpoint/model you want to
use.
Always check provider and model availability
Providers and models may not always be available on the provider's servers. Always check provider and model availability before using them in your experience.

| Provider (Asset) | Common Models / Capabilities | Editor Features |
|---|---|---|
| LlamaApiProvider | Official Meta Llama family (chat and multimodal variants) | Curated model list with automatic vision toggling. |
| OpenAIProvider | gpt-5, gpt-4o (chat/vision), whisper-1 (STT), tts-1, tts-1-hd (TTS) | Model picker, Chat/Vision toggle, STT/TTS configuration foldouts. |
| HuggingFaceProvider | Any Hugging Face-hosted model (for example, facebook/detr-resnet-101, Llama family) | Token validator, endpoint health checker, image inlining options. |
| ReplicateProvider | Community-hosted models (owner/model[:version]) | Endpoint override, base64/data URI support, inline byte cap. |
| ElevenLabsProvider | Text-to-Speech and Speech-to-Text (Scribe) | Fetches voices, models, and metadata directly from your ElevenLabs account. |
Setting Up a Cloud Provider
- Create a Provider Asset: Create → Meta → AI → Provider Assets → <Cloud>/<Your Provider>
- Enter API Key: Use the Get Key… button in the Inspector to open your provider’s developer portal.
- Set Endpoint and Model: Copy these from the provider’s example curl request.
- Click Validate / Check: Confirm authentication and connectivity on your provider’s website or in the asset itself.
- (Optional) Configure Vision Options: Adjust settings like Inline Remote Images, Resolve Redirects, and Max Inline Bytes.
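As an illustration of the Validate / Check step, the sketch below performs a minimal connectivity test from Unity. It assumes an OpenAI-style endpoint with Bearer-token authentication; the endpoint URL, header scheme, and class name are assumptions, and your actual Provider asset performs this check for you.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

// Minimal connectivity check, assuming an OpenAI-style endpoint and
// Bearer-token auth. Endpoint and key come from your Provider asset.
public class ProviderHealthCheck : MonoBehaviour
{
    public string endpoint = "https://api.openai.com/v1/models"; // provider-specific
    public string apiKey;  // never commit keys; load from secure storage

    public IEnumerator Check()
    {
        using var req = UnityWebRequest.Get(endpoint);
        req.SetRequestHeader("Authorization", $"Bearer {apiKey}");
        yield return req.SendWebRequest();

        if (req.result == UnityWebRequest.Result.Success)
            Debug.Log("Provider reachable and key accepted.");
        else
            Debug.LogWarning($"Validation failed: {req.responseCode} {req.error}");
    }
}
```

A 401 response here usually means the key is wrong or expired; a timeout points at the endpoint URL or network.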
| Provider (Asset) | Backend | Description |
|---|---|---|
| OllamaProvider | Ollama daemon (http://localhost:11434) | Discovers installed models via /api/tags and lets you select local tags (for example, llama3, llava:latest, gemma3). |
- Run Ollama on your local machine:

```
ollama pull llama3
ollama serve
```
- In Unity, open your OllamaProvider asset and configure it.
- Click Refresh Models to fetch available tags, then press Play to test
the connection.
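For reference, the Refresh Models step corresponds to querying the Ollama daemon's /api/tags endpoint, which lists installed model tags as JSON. The sketch below issues that query directly; the class name is hypothetical and the default local endpoint is assumed.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

// Queries the Ollama daemon for installed model tags, mirroring what
// the Refresh Models button does. Assumes the default local endpoint.
public class OllamaTagLister : MonoBehaviour
{
    public string baseUrl = "http://localhost:11434";

    public IEnumerator ListTags()
    {
        using var req = UnityWebRequest.Get($"{baseUrl}/api/tags");
        yield return req.SendWebRequest();

        if (req.result == UnityWebRequest.Result.Success)
            // Response is JSON like: {"models":[{"name":"llama3:latest", ...}]}
            Debug.Log(req.downloadHandler.text);
        else
            Debug.LogWarning($"Is 'ollama serve' running? {req.error}");
    }
}
```

If the request fails, confirm that `ollama serve` is running and that your headset and machine are on the same network.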
| Provider (Asset) | Runtime | Highlights |
|---|---|---|
| UnityInferenceEngineProvider | Unity Inference Engine (Sentis) | Supports GPU/CPU backends, optional Split Over Frames for smoother performance, and GPU-based NMS compute shader integration. |
- Create the Provider asset: Create → Meta → AI → Provider Assets → On-Device → Unity Inference Engine
- Assign your model file (.onnx or .sentis)
- Configure the Backend: GPUCompute or CPU
- Adjust Split Over Frames / Layers Per Frame for performance tuning
- (Optional) Add Class Labels via a .txt file
- (Optional) Enable GPU Non-Max Suppression (NMS) using the provided compute shader
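The steps above can be sketched in code as follows. This is a rough illustration using the Sentis-era API (namespace, ModelLoader, WorkerFactory); newer Unity Inference Engine releases have renamed some of these types, and the class here is hypothetical — the UnityInferenceEngineProvider asset wraps this work for you.

```csharp
using Unity.Sentis; // namespace varies by Inference Engine version
using UnityEngine;

// Loads an .onnx/.sentis model and runs it on the chosen backend.
// Sketch only: type names follow the Sentis-era API and may differ
// in newer Unity Inference Engine releases.
public class OnDeviceRunner : MonoBehaviour
{
    [SerializeField] private ModelAsset modelAsset;   // your .onnx or .sentis file
    [SerializeField] private BackendType backend = BackendType.GPUCompute;

    private IWorker worker;

    private void Start()
    {
        var model = ModelLoader.Load(modelAsset);
        worker = WorkerFactory.CreateWorker(backend, model);
    }

    public void Run(Texture2D input)
    {
        using var tensor = TextureConverter.ToTensor(input);
        worker.Execute(tensor);
        // Read back results with worker.PeekOutput(); spreading execution
        // over frames (Split Over Frames) helps avoid frame hitches.
    }

    private void OnDestroy() => worker?.Dispose();
}
```

Running everything in a single frame, as above, is the simplest setup; enable Split Over Frames on the Provider asset when a large model causes visible hitches.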
Provider Selection During Building Block Installation
The RemoteProviderProfileRegistry automatically retrieves configuration
files from Meta’s CDN containing official Provider profiles and defaults
(endpoints, model names, and so on). When adding a new AI Building Block from
Meta Hub → Building Blocks:
- The installer detects all available inference types for that block.
- It loads compatible Providers from the RemoteProviderProfileRegistry.
- You choose your preferred inference type (Cloud, Local, or On-Device).
- The selected Provider asset is saved with your prefab or component.
- You can later edit it directly in the Inspector to update models, endpoints,
or keys.