Deploying Ocular (OSS)

This page walks through the open-source install: load the OSS image, configure the environment, and make your first /classify call. The OSS image ships under Apache 2.0.

Enterprise customers: refer to the documentation packaged with your release. The commercial image adds a license layer, Console ingest, and additional operational surfaces not covered here.


What you need

  • A Linux host with a CUDA-capable NVIDIA GPU.
    • Needs ~6.4 GB VRAM (measured peak under a 200-turn trajectory plus concurrency); a 12 GB card runs it comfortably.
    • The 24 GB A10G-class card is the reference for the quoted latency, not a minimum.
  • ~50 GB free disk: image is ~10 GB compressed, ~34 GB unpacked, plus Docker overlay working set.
  • Docker Engine + Compose v2 (2.20+), and the NVIDIA Container Toolkit wired up as a Docker runtime.
  • The OSS platform tarball (ocular-platform-oss-<version>.tar.zst) and its SHA256 sidecar.
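The requirements above can be checked before you download anything. The following is a minimal preflight sketch, not part of the Ocular bundle: the function names (version_ge, has_free_gb, preflight) are hypothetical, and the 2.20 and 50 GB thresholds are simply the figures from the list above. It assumes GNU sort (for -V) and that Docker stores its data under /var/lib/docker unless you pass another path.

```shell
#!/usr/bin/env bash
# Preflight sketch: hypothetical helpers, not shipped with Ocular.

# Succeeds when version $1 is >= version $2 (uses GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Succeeds when the filesystem holding path $1 has at least $2 GB available.
has_free_gb() {
  local avail_kb
  avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Runs the checks against the current host; call it explicitly.
# $1 is the Docker data directory (assumed default: /var/lib/docker).
preflight() {
  local compose_version
  compose_version=$(docker compose version --short 2>/dev/null) \
    || { echo "docker compose (v2) not found"; return 1; }
  version_ge "${compose_version#v}" 2.20.0 \
    || { echo "Compose ${compose_version} is older than 2.20"; return 1; }
  has_free_gb "${1:-/var/lib/docker}" 50 \
    || echo "warning: less than 50 GB free for Docker's data directory"
  # Lists each GPU with its total VRAM, to compare against the ~6.4 GB peak.
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
}
```

Source the file and run preflight on the target host; nothing executes until the function is called.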

Before you start — host setup

# Docker Engine + Compose v2 (Docker's official repo)
# Follow https://docs.docker.com/engine/install/ for your distro.

# NVIDIA Container Toolkit (the `--gpus all` plumbing), apt-based distros.
# Uses NVIDIA's generic stable/deb repo list rather than the older per-distro list paths.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: GPU visible to Docker
sudo docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
# → should print GPU name + driver/CUDA version

zstd and tar are the only other host utilities needed for the tarball install path.

If you want to run docker without sudo, add your user to the docker group (sudo usermod -aG docker $USER) and re-login. The rest of this guide uses sudo docker throughout.


Install

The OSS platform tarball is a single docker load-able image archive plus a SHA256 sidecar. Verify before loading:

sha256sum -c ocular-platform-oss-<version>.tar.zst.sha256
# → ocular-platform-oss-<version>.tar.zst: OK

Decompress + load:

zstd -d -c ocular-platform-oss-<version>.tar.zst | sudo docker load
# → Loaded image: ocular-oss:<version>

docker load prints the image as ocular-oss:<version>. The part after the colon is the tag that customer-compose.yml references via ${OCULAR_VERSION}. Note it down; you need it in the next step.
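If you script the install, the tag can be captured from the docker load output instead of being copied by hand. This is a small sketch that assumes the "Loaded image: ocular-oss:<version>" line format shown above; tag_from_load_output is a hypothetical helper name.

```shell
# Hypothetical helper: reads `docker load` output on stdin and prints the tag
# (the text after the last colon) of the last "Loaded image:" line.
tag_from_load_output() {
  sed -n 's/^Loaded image: .*:\([^:]*\)$/\1/p' | tail -n1
}

# Example with the output format shown above:
printf 'Loaded image: ocular-oss:abc1234\n' | tag_from_load_output
# prints: abc1234
```

In a real install you would pipe the decompress-and-load command from the previous step into tag_from_load_output and store the result in a shell variable.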


Configure environment

Copy the OSS env template alongside the compose file:

cp customer.env.example .env

Edit .env:

# Must match the tag baked into the image you loaded.
# e.g. if `docker images` shows ocular-oss:abc1234, set OCULAR_VERSION=abc1234.
OCULAR_VERSION=

That's the entire required surface — just OCULAR_VERSION, which selects the image you loaded.
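For unattended installs, the single required variable can be written without opening an editor. The helper below is a sketch under assumptions: set_env_var is a hypothetical name, it relies on GNU sed -i, and it expects values without | or & characters (true of Docker image tags).

```shell
# Hypothetical helper: sets KEY=VALUE in an env file, replacing an existing
# KEY= line or appending one if the key is absent. Uses GNU `sed -i`.
set_env_var() {
  local file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file" 2>/dev/null; then
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Intended use, with the example tag from the template comment:
#   set_env_var .env OCULAR_VERSION abc1234
```

Unlike a bare sed substitution, this also handles a template where the OCULAR_VERSION line is missing altogether.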


Start and warm up

sudo docker compose --env-file .env -f customer-compose.yml up -d

The container takes 25-60 seconds to load model weights and warm up vLLM. Watch:

sudo docker compose -f customer-compose.yml logs -f ocular
# → "Loading heads from /models/heads/heads.pkl"
# → "Loaded 126 heads"
# → "Application startup complete"

Health probe:

curl -fsS http://localhost:8080/health
# → {"status":"ok","mode":"local","version":"abc1234","queue_depth":0}

If /health is 503 for the first ~60 seconds, that's normal warmup.
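In a script, the warmup window is easier to handle with a bounded retry than with a fixed sleep. The sketch below is a generic retry loop (wait_until is a hypothetical name) that runs any command until it succeeds or a timeout passes. The 180-second ceiling in the usage comment mirrors the 3-minute guidance in Troubleshooting.

```shell
# Hypothetical helper: retries a command until it succeeds or a deadline passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command...>
wait_until() {
  local timeout=$1 interval=$2 deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Intended use against the health probe shown above:
#   wait_until 180 5 curl -fsS -o /dev/null http://localhost:8080/health \
#     || echo "Ocular did not become healthy; check the container logs"
```

Because curl -f exits non-zero on a 503, the loop keeps polling through warmup and stops on the first successful response.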

The bundle's customer-compose.yml binds the container to 127.0.0.1:8080 on the host by default, so the curl localhost:8080 check above works without any additional config. The 127.0.0.1: prefix means the service is reachable only from the host itself, not from other machines on the network. That's the safe default; Ocular does not authenticate per-request calls to /classify, so your network layer (reverse proxy, firewall, VPC) is the trust boundary.

To expose the service beyond the host, edit customer-compose.yml and change the ports: binding — e.g. "0.0.0.0:8080:8080" for all interfaces, or a specific interface IP. Front it with a reverse proxy if it will be reachable from untrusted networks.
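After editing the ports: binding, it is worth confirming which address the service is actually published on. The sketch below assumes the compose service is named ocular (as in the logs command above) and that the container port is 8080 (as in the "0.0.0.0:8080:8080" example); is_loopback_binding is a hypothetical helper.

```shell
# Hypothetical helper: succeeds when a published address ("IP:PORT") is
# loopback-only, i.e. reachable from the host itself but not from the network.
is_loopback_binding() {
  case "$1" in
    127.*|"[::1]":*) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use (`docker compose port` prints the host address for a service port):
#   addr=$(sudo docker compose -f customer-compose.yml port ocular 8080)
#   is_loopback_binding "$addr" || echo "note: Ocular is published on $addr"
```

Since /classify is unauthenticated, a non-loopback result is a prompt to check that a reverse proxy or firewall sits in front of that address.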


Verify it works

curl -s -X POST http://localhost:8080/classify \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"I have been feeling really down"}]}' \
  | python3 -m json.tool

You should see a /classify response with salience, nested signals.{user,ai}, heads[], and meta.version matching your OCULAR_VERSION.
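The check described above can be automated as a smoke test. This sketch only verifies that the fields named on this page are present and that meta.version matches the expected tag. It does not assume anything about the value types, which are defined in api-reference. check_classify_response is a hypothetical helper, and it uses python3, as the command above already does.

```shell
# Hypothetical helper: reads a /classify response on stdin and exits non-zero
# if a field named in this guide is missing or meta.version differs from $1.
check_classify_response() {
  python3 -c '
import json, sys

resp = json.load(sys.stdin)
problems = []
for key in ("salience", "signals", "heads", "meta"):
    if key not in resp:
        problems.append("missing " + key)
signals = resp.get("signals") or {}
for side in ("user", "ai"):
    if side not in signals:
        problems.append("missing signals." + side)
version = (resp.get("meta") or {}).get("version")
if version != sys.argv[1]:
    problems.append("meta.version is %r, expected %r" % (version, sys.argv[1]))
print("\n".join(problems) if problems else "ok")
sys.exit(1 if problems else 0)
' "$1"
}

# Intended use: pipe the curl call above into the helper instead of json.tool,
# passing your OCULAR_VERSION value as the argument.
```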

See api-reference for the full request/response contract, and risk-interpretation for what the salience score and per-axis levels mean.


Troubleshooting

nvidia-smi works on host but not in Docker

sudo docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi returns "could not select device driver".

Cause: NVIDIA Container Toolkit isn't configured as a Docker runtime.

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
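To confirm the fix took effect before retrying the container, you can inspect the runtimes Docker has registered. This is a sketch: has_nvidia_runtime is a hypothetical helper, and it assumes the runtime is registered under the name nvidia, which is what nvidia-ctk configures by default.

```shell
# Hypothetical helper: reads the JSON printed by
# `docker info --format '{{json .Runtimes}}'` on stdin and succeeds when a
# runtime keyed "nvidia" is present.
has_nvidia_runtime() {
  grep -q '"nvidia"[[:space:]]*:'
}

# Intended use:
#   sudo docker info --format '{{json .Runtimes}}' | has_nvidia_runtime \
#     || echo "nvidia runtime not registered with Docker"
```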

docker load fails with "no space left on device"

You don't have enough disk for the working set: the ~10 GB tarball, the ~34 GB unpacked image, and Docker's overlay fs, roughly 50 GB in total.

Fix: free at least 50 GB or mount /var/lib/docker on a larger disk.

Ocular /health returns 503 "Model still loading"

Normal during the first 25-60 s after container start. If it's been more than 3 minutes, check logs:

sudo docker compose -f customer-compose.yml logs ocular | tail -50

/classify is slow (>500 ms for single-message input)

  • Check that the GPU is busy during a request burst: nvidia-smi should show 15-90 % utilization. If it stays near 0, requests are being routed to the CPU.
  • The first request after idle takes 1-2 s while kernel dispatch caches warm; steady-state is ~25-200 ms depending on input.
  • If VRAM is near its maximum, the GPU is under memory pressure. Reduce input size or move to a larger GPU.
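To tell a cold first request from a genuinely slow steady state, time a second request after discarding the first. The sketch below uses curl's time_total write-out variable and the sample payload from "Verify it works"; time_classify and exceeds_ms are hypothetical helper names, and 500 ms is the threshold from this section's heading.

```shell
# Hypothetical helper: prints the wall-clock seconds for one /classify call.
# $1 is the base URL (defaults to the local binding used in this guide).
time_classify() {
  curl -s -o /dev/null -w '%{time_total}\n' -X POST "${1:-http://localhost:8080}/classify" \
    -H 'Content-Type: application/json' \
    -d '{"messages":[{"role":"user","content":"I have been feeling really down"}]}'
}

# Succeeds when $1 (seconds, e.g. 0.734) is greater than $2 milliseconds.
exceeds_ms() {
  awk -v s="$1" -v ms="$2" 'BEGIN { exit !(s * 1000 > ms) }'
}

# Intended use: discard the first (possibly cold) request, then time a second.
#   time_classify >/dev/null
#   t=$(time_classify); exceeds_ms "$t" 500 && echo "slow: ${t}s"
```

While the requests run, nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv -l 1 in a second terminal shows utilization and VRAM once per second, which covers the first and third checks above.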

Tarball loaded but docker images shows no ocular-oss:... entry

docker load may have printed a different tag than expected. Re-verify the tarball's SHA256; if the checksum passes but docker images still shows no matching entry, the tarball may be from a different build variant. Check the upstream release.


Operating the deployment

Logs

# All services, last N lines, follow:
sudo docker compose -f customer-compose.yml logs -f --tail=200

# Just Ocular:
sudo docker compose -f customer-compose.yml logs -f ocular

Restart

sudo docker compose -f customer-compose.yml restart ocular

Upgrade

# Verify the new platform tarball against its SHA256 sidecar:
sha256sum -c ocular-platform-oss-<new-version>.tar.zst.sha256

# Load the new image (old image stays until removed):
zstd -d -c ocular-platform-oss-<new-version>.tar.zst | sudo docker load

# Update .env to point at the new image tag:
sed -i 's/^OCULAR_VERSION=.*/OCULAR_VERSION=<new-version>/' .env

# Recreate the container with the new image:
sudo docker compose --env-file .env -f customer-compose.yml up -d

# (Optional) reclaim disk:
sudo docker image rm ocular-oss:<old-version>
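Because the old image stays loaded until you remove it, you can roll back by pointing .env at the previous tag again. The sketch below records that tag before the upgrade overwrites it. current_version is a hypothetical helper, and the rollback steps in the comments are one possible sequence, not a documented Ocular procedure.

```shell
# Hypothetical helper: prints the OCULAR_VERSION value currently set in an
# env file (default: .env in the current directory).
current_version() {
  sed -n 's/^OCULAR_VERSION=//p' "${1:-.env}" | tail -n1
}

# Intended use around the upgrade steps above:
#   old=$(current_version)      # capture before editing .env
#   ...run the upgrade...
#   curl -fsS http://localhost:8080/health || {
#     sed -i "s/^OCULAR_VERSION=.*/OCULAR_VERSION=${old}/" .env
#     sudo docker compose --env-file .env -f customer-compose.yml up -d
#   }
```

Run the optional image removal only after the new version has passed its health check; the rollback depends on the old image still being present. Remember that /health can return 503 during normal warmup, so give the new container time to start before treating a failed probe as a reason to roll back.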

What to do next

  • api-reference — the universal /classify request / response contract
  • self-host-api — operational additions (/health, /manifest, Console push, env vars)
  • risk-interpretation — what the salience score and risk levels mean

Commercial support

The OSS image is the model under Apache 2.0 — free to use, modify, and redistribute. If you want calibration support, indemnification, ongoing model updates, incident response, or the keyed image with Console pairing, contact [email protected].


About Ocular's role

Ocular is a behavioral signal classifier. It surfaces linguistic and behavioral patterns in conversation text and returns signal strengths for those patterns. It does not assess, diagnose, or predict risk for any individual, and is not a substitute for clinical judgment, human review, or emergency services. Outputs are signals for routing and human triage; the operator sets thresholds and response policies appropriate to their platform and population.