Console

Console is a lightweight browser UI that surfaces the scored conversations Ocular processes, lets you configure watchlists, and fires webhooks when watchlisted users or agents cross thresholds. It's optional — if your integration is purely programmatic (your app calls /classify, stores results in your own pipeline), you don't need Console at all.

Capabilities:

  • Browser UI for session inspection — list, filter by verdict, drill into per-turn signals.
  • Watchlist rules + webhook delivery on match (configurable via UI or /api/watchlists).
  • Rule export (GET /api/watchlists?format=export) for evaluation outside Console.

Architecture

  • CPU-only. No GPU needed. Runs alongside Ocular in the same compose stack.
  • SQLite-backed. One file, configurable retention (default 7 days — Console is a convenience, not long-term storage).
  • Receives pushes from Ocular. When your app passes log: true to /classify (along with session_id + user_id), Ocular calls Console's /api/ingest after scoring. The push is intentionally lossy — if Console is down, Ocular still scores; you just don't see the session in the UI. Calls without log: true are never pushed.
  • No authentication at the network layer. Same trust model as Ocular — deploy it behind your own reverse proxy with SSO, a VPN, or a Cloudflare Tunnel + Access app (see deployment.md §7).

Shared-GPU contention with your app

Console has two ingest paths — only one of them contends with your app for GPU:

  • Conduit push (log: true from Ocular). Your app calls /classify, Ocular scores on the GPU, then pushes the already-scored result to Console. No re-scoring on Console's side, so this path doesn't use the GPU at all. Runs freely alongside your app's inline traffic.
  • Direct /api/ingest without a pre-scored body. When the caller posts messages[] but no ocular body (sandbox testing, log backfill against unscored transcripts), Console calls Ocular itself to score. This path is what contends with your app's GPU.

The direct-ingest path runs in trajectory mode (per-turn) and holds the GPU serially; your app's inline /classify calls batch-coalesce separately. They can't interleave — every in-flight trajectory ingest blocks the batch queue from firing until it finishes.

Empirical impact on a 24 GB datacenter-class GPU: two concurrent direct ingests running against the same box roughly halve your app's inline scoring throughput and roughly double its p50 latency while the ingests are in flight. Larger cards see a smaller relative effect; the shape is the same.

Three mitigations for the direct-ingest case, least to most change:

  1. Lower direct-ingest concurrency. If you're driving unscored ingest through a batch job or backfill, pace it at 1 concurrent request instead of multiple. Trades drain time for app-latency stability.
  2. Split Console onto its own box. Ocular on box A serves your app; Ocular on box B serves Console's direct ingest. Console (CPU only) lives on either box or a third. See integration-patterns.md §"Pattern 6 — Fleet (scaled)" for the shape.
  3. Delegate Console's scoring to a remote endpoint. Point Console's OCULAR_URL env var at a remote Ocular endpoint (a separate deployment, or one you host elsewhere) instead of the local http://ocular:8080/classify. Your on-prem GPU is reserved for app traffic; Console's direct-ingest scoring goes elsewhere at remote-call latency.
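
Mitigation 1 amounts to a strictly sequential backfill loop. The following is a minimal Python sketch rather than anything shipped with Console: paced_backfill, post_json, and the injectable post callable are hypothetical names, and the URL assumes Console on localhost at the default port 3950.

```python
import json
import urllib.request
from typing import Callable, Iterable, List

# Assumption: Console reachable on localhost at its default port.
CONSOLE_INGEST_URL = "http://localhost:3950/api/ingest"


def post_json(url: str, body: dict) -> int:
    """POST a JSON body and return the HTTP status (urlopen raises on 4xx/5xx)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def paced_backfill(
    sessions: Iterable[dict],
    post: Callable[[str, dict], int] = post_json,
) -> List[int]:
    """Send unscored sessions one at a time so only one direct ingest is in flight."""
    statuses = []
    for session in sessions:
        # Blocking call: the next request starts only after this one returns.
        statuses.append(post(CONSOLE_INGEST_URL, session))
    return statuses
```

Because each call blocks until Console responds, concurrency never exceeds 1. The cost is a longer drain time for the backfill.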

If Console only receives conduit pushes (no direct-ingest load), none of the above applies — conduit ingests never hit the GPU.


Starting Console

Console is included in the platform tarball and runs with the same compose file as Ocular, under a console profile:

docker compose -f customer-compose.yml --profile console up -d

Skip --profile console to run Ocular without Console.

Console listens on port 3950 by default (change by editing customer-compose.yml). Sanity check:

curl -fsS http://localhost:3950/api/health | jq .
# → {"status":"ok","sessions":0}
curl -fsS 'http://localhost:3950/api/health?deep=true' | jq .
# → {"status":"ok","db":"ok","ocular":"ok","sessions":0}

The deep=true variant also verifies Console can talk to Ocular — it's a good first-deploy sanity check.


The UI

Top-level pages (all at http://<console-host>:3950/):

Path          Purpose
/             Dashboard: recent sessions, current crisis counts, quick filters.
/sessions     All sessions, sortable + filterable by crisis level, user, agent, date.
/users        Users who've produced sessions. Drill into per-user history.
/users/[id]   Single user's session history + crisis trajectory over time.
/agents       AI agents (by agent_id).
/agents/[id]  Single agent's behaviour metrics. Useful for "is this agent drifting?"
/watchlists   Configure watchlist rules + webhooks.
/logs         Audit log of watchlist matches and webhook deliveries.
/performance  Throughput + queue depth + recent latency histograms.
/settings     Retention window, webhook defaults, CSV export.
/diagnostics  Package Console's scored data into an NDJSON snapshot for NOPE to analyse. Full write-up in diagnostics.md.

Keyboard shortcut: ? on any page shows keybindings.


Getting sessions into Console

Two ways.

Conduit push (the primary path)

Your app calls Ocular's /classify with session_id, user_id, and log: true (plus optionally agent_id and messages):

{
  "messages": [{"role": "user", "content": "I don't know how much longer..."}],
  "session_id": "conv-42",
  "user_id": "u-1234",
  "agent_id": "bot-main",
  "log": true
}

Ocular scores the message and fires-and-forgets a POST to Console's /api/ingest with the scored result (plus the messages, if you provided them — Console uses them to render transcripts). The session appears in Console within a few hundred milliseconds.

The push is gated on log: true — calls without it are never pushed, regardless of whether session_id + user_id are present. This is opt-in by design: production traffic at scale is unlikely to want every scored call mirrored into Console.

On agent_id. Optional. Include it if you want the session to appear under /agents/<agent_id> in Console — that's how the agent drill-down, per-agent metrics, and agent-scoped watchlists find the session. Sessions without agent_id are still logged and searchable but don't participate in agent-breakdown views.

If OCULAR_CONSOLE_URL is unset on the Ocular container or Console is down, the push is silently skipped. Ocular still returns the score to your caller — the conduit is a side-effect, not a dependency.
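
From the app side, the conduit push only requires the right fields on the /classify body. The sketch below is illustrative: build_classify_payload and classify are hypothetical helper names, and the URL assumes the in-compose address http://ocular:8080/classify.

```python
import json
import urllib.request
from typing import List, Optional

# Assumption: the app reaches Ocular at its in-compose address.
OCULAR_CLASSIFY_URL = "http://ocular:8080/classify"


def build_classify_payload(
    messages: List[dict],
    session_id: str,
    user_id: str,
    agent_id: Optional[str] = None,
    log: bool = True,
) -> dict:
    """Build a /classify body; the conduit push needs session_id, user_id and log: true."""
    payload = {
        "messages": messages,
        "session_id": session_id,
        "user_id": user_id,
        "log": log,
    }
    if agent_id is not None:
        # Optional: lets the session show up in the per-agent views.
        payload["agent_id"] = agent_id
    return payload


def classify(payload: dict, url: str = OCULAR_CLASSIFY_URL, timeout: float = 10.0) -> dict:
    """POST the payload to Ocular and return the parsed score."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

The caller gets its score from classify() whether or not the push to Console succeeds, which matches the side-effect behaviour described above.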

Direct POST to /api/ingest

You can also ingest externally-scored sessions directly — useful for backfilling from logs:

curl -s -X POST http://localhost:3950/api/ingest \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "external-1",
    "user_id": "u-9",
    "agent_id": "manual",
    "messages": [{"role": "user", "content": "..."}],
    "ocular": { ... full /classify v1 response shape ... }
  }'

Or, to let Console call Ocular itself (useful for sandbox testing — Console will score the messages through its configured OCULAR_URL):

curl -s -X POST http://localhost:3950/api/ingest \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "external-2",
    "user_id": "u-9",
    "messages": [{"role": "user", "content": "..."}]
  }'

Direct ingest is useful for backfilling from logs, or, with a pre-scored ocular body, if you're scoring with a model other than Ocular but want to inspect results in the Console UI.

Concurrent /api/ingest calls are coalesced into one fsync per event-loop tick by an in-process group-commit coordinator, so you don't need to batch yourself to get good throughput. If you exceed the queue's high watermark for a sustained period, Console returns 429 Too Many Requests with a Retry-After header. Honor it (or switch to /api/ingest/batch below) instead of hot-looping the retry.
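
One way a client could honor Retry-After is sketched below. It is an illustration under assumptions: ingest_with_retry and the send callable (which returns a status code plus response headers) are hypothetical, and only the delta-seconds form of Retry-After is parsed; any other value falls back to a default wait.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (HTTP status, response headers)


def retry_after_seconds(headers: Mapping[str, str], default: float) -> float:
    """Read Retry-After as delta-seconds; fall back to `default` otherwise."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default  # HTTP-date form is not parsed in this sketch
    return default


def ingest_with_retry(
    send: Callable[[dict], Response],
    body: dict,
    max_attempts: int = 5,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send one session, waiting out 429 responses; returns the final status."""
    status = 429
    for attempt in range(max_attempts):
        status, headers = send(body)
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Wait as long as the server asked before trying again.
            sleep(retry_after_seconds(headers, default_wait))
    return status
```

Passing sleep as a parameter keeps the loop testable without real delays. The attempt cap stops a persistently overloaded Console from stalling the caller forever.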

Batch ingest (POST /api/ingest/batch)

For high-volume backfill or replay where you can buffer sessions client-side, post up to 500 pre-scored sessions per call. They're written in a single SQLite transaction; per-item failures are isolated via SAVEPOINT and reported in the response array, so one bad payload doesn't fail the batch.

curl -s -X POST http://localhost:3950/api/ingest/batch \
  -H 'Content-Type: application/json' \
  -d '{
    "sessions": [
      {"session_id": "s1", "user_id": "u1", "ocular": { ... }},
      {"session_id": "s2", "user_id": "u2", "ocular": { ... }},
      ...
    ]
  }'
# → {"count": 2, "succeeded": 2, "failed": 0, "results": [{"ok": true, ...}, ...]}

Pre-scored only — every entry must include ocular. See deployment.md § "Console at scale" for tunables and sizing.
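
A backfill larger than 500 sessions has to be split client-side. The sketch below shows one way to do that and to pick out per-item failures. batch_bodies and failed_sessions are hypothetical names, and the code assumes that results[] is positionally aligned with the submitted sessions[], which the text above does not state explicitly.

```python
from typing import Iterator, List, Sequence

MAX_BATCH = 500  # per-call limit stated above


def batch_bodies(sessions: Sequence[dict], size: int = MAX_BATCH) -> Iterator[dict]:
    """Yield /api/ingest/batch request bodies of at most `size` sessions each."""
    if not 1 <= size <= MAX_BATCH:
        raise ValueError(f"size must be between 1 and {MAX_BATCH}")
    # Batch ingest is pre-scored only, so catch missing scores before sending.
    missing = [s.get("session_id") for s in sessions if "ocular" not in s]
    if missing:
        raise ValueError(f"sessions without an ocular body: {missing}")
    for start in range(0, len(sessions), size):
        yield {"sessions": list(sessions[start:start + size])}


def failed_sessions(body: dict, response: dict) -> List[dict]:
    """Return the submitted sessions whose result entry is not ok.

    Assumption: results[i] corresponds to sessions[i].
    """
    return [
        session
        for session, result in zip(body["sessions"], response["results"])
        if not result.get("ok")
    ]
```

Each yielded body can be posted as-is. Anything returned by failed_sessions can be inspected or retried individually without resending the rest of the batch.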


Session detail view

Clicking a session shows:

  • Header. Verdict, subject, imminence, timestamp, user/agent IDs.
  • Risk breakdown. Per-axis scores (risks.suicide, risks.harm_to_others, …, ai_concerns.safeguarding_failure, etc.) with level labels drawn from the 5-value domain (minimal through critical).
  • Signals. The ranked, screening-filtered signals[] list — opaque signal_NNNN identifiers with calibrated scores.
  • Trajectory. If the session was scored with per_turn: true, a per-turn plot showing how each risk axis evolved across the conversation, with the per-turn verdict transitions highlighted. The caller sets per_turn: true on the original /classify call — the conduit push forwards whatever the caller requested; without per_turn, Console has no trajectory to render.
  • Turns. The conversation itself, with each turn's individual signal summary inline (if trajectory data was included).
  • Provider content. If the caller pushed messages alongside the scored result (i.e. used the conduit push with log: true), Console renders transcripts inline from its own DB. Otherwise, if you configured PROVIDER_URL=http://your-app/... in customer.env, Console will fetch the full message content from your app's transcript service for display. Without either, the turn list shows "no content available" — Ocular doesn't store raw message text.

Console's JSON API

Programmatic consumers (verification scripts, dashboards, your own eval pipelines) can read Console's stored sessions without the browser UI.

GET /api/sessions/<session_id>

Full detail for one session.

{
  "session_id": "...",
  "user_id": "...",
  "agent_id": "...",
  "scored_at": 1776327408,
  "ingested_at": 1776327408,
  "message_count": 90,
  "turns": [
    {"turn": 0, "role": "user",      "content": "..."},
    {"turn": 1, "role": "assistant", "content": "..."}
  ],
  "ocular": { /* the verbatim /classify response — see api-reference.md */ }
}

ocular is the full /classify body; drill in via ocular.verdict, ocular.risks.<axis>.score, ocular.fiction, etc. Returns 404 if session_id is unknown.

GET /api/sessions

Lists stored sessions. Response: { total, sessions: [ ... ] } where each session is the same shape as the detail endpoint.

Query param  Default       Values
verdict      (no filter)   clear | watch | danger
min_crisis   0             0..1
sort         crisis_score  promoted column name (e.g. crisis_score, scored_at, suicide_risk, max_user_risk, fiction_strength)
order        desc          asc | desc
limit        50            ≤ 200
offset       0             ≥ 0
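
Since limit is capped at 200, reading every stored session means paging with offset until total is reached. The following is a minimal sketch under assumptions: sessions_url and iter_sessions are hypothetical helpers, fetch is any callable that GETs a URL and returns the parsed JSON, and the lower bound of 1 on limit is this sketch's own check.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def sessions_url(
    base: str,
    limit: int = 50,
    offset: int = 0,
    verdict: Optional[str] = None,
    min_crisis: float = 0,
    sort: str = "crisis_score",
    order: str = "desc",
) -> str:
    """Build a GET /api/sessions URL from the documented query params."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params = {
        "min_crisis": min_crisis,
        "sort": sort,
        "order": order,
        "limit": limit,
        "offset": offset,
    }
    if verdict is not None:
        params["verdict"] = verdict  # omitted entirely means "no filter"
    return f"{base}/api/sessions?{urlencode(params)}"


def iter_sessions(
    fetch: Callable[[str], dict],
    base: str,
    page_size: int = 200,
    **filters,
) -> Iterator[dict]:
    """Yield every session matching `filters`, one page at a time."""
    offset = 0
    while True:
        page = fetch(sessions_url(base, limit=page_size, offset=offset, **filters))
        sessions = page["sessions"]
        yield from sessions
        offset += len(sessions)
        # Stop on an empty page or once the reported total has been consumed.
        if not sessions or offset >= page["total"]:
            return
```

If sessions are being ingested while you page, offset-based paging can skip or repeat rows. Sorting by scored_at makes that easier to reason about than the default crisis_score order.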

DELETE /api/sessions/<session_id>

Removes a session row. Retention is usually driven by RETENTION_DAYS; DELETE is for ad-hoc cleanup.


Watchlists

A watchlist is a rule + a destination. When a matching session comes in, Console records a "match" and, if configured, fires a webhook.

Anatomy of a rule

{
  "id": "high-confidence-suicide",
  "name": "Suicide risk (high+) with corroboration",
  "scope": "session",
  "conditions": {
    "requiresAll": [
      {"code": "risks.suicide.level", "operator": "in", "value": ["high", "critical"]},
      {"code": "risks.suicide.corroboration.level", "operator": "!=", "value": "absent"}
    ],
    "requiresAny": [
      {"code": "verdict", "operator": "==", "value": "danger"}
    ]
  },
  "webhookUrl": "https://alerts.example.com/hook/foo",
  "timeWindow": 86400,
  "enabled": true
}

A rule is a conditions block (with requiresAll[] + requiresAny[] arrays) against the canonical field surface from the Ocular v1 response. A rule fires when every condition in requiresAll matches AND (when requiresAny is non-empty) at least one condition in requiresAny matches. A rule with both arrays empty is rejected at write-time — it would match every session.

Naming convention. Rule metadata keys (webhookUrl, timeWindow, scope, enabled) are camelCase. Condition code paths and the webhook body payload (session_id, crisis_score, matched_scores) are snake_case, following the naming style of the Ocular response surface.

Condition shape

Each condition is {code, operator, value}:

  • code — a namespaced path into the Ocular response. Enumerated, not free-form. Ask GET /api/watchlists?format=export for the full list; a selection:

    Path                               Type           Values
    verdict                            enum           clear | watch | danger
    subject                            enum           self | other | unknown
    risks.<axis>.score                 number         0..1. Axes: suicide, self_harm, harm_to_others, abuse, sexual_violence, exploitation, stalking, self_neglect
    risks.<axis>.level                 enum           minimal | low | moderate | high | critical
    ai_concerns.<axis>.score           number         0..1. Axes: harm_provision, emotional_failure, manipulation, safeguarding_failure
    ai_concerns.<axis>.level           enum           same 5-value domain
    imminence.score / imminence.level  number / enum  same
    fiction / authenticity             number         0..1
    risks.<axis>.corroboration.level   enum           absent | limited | moderate | strong (4 user axes carry corroboration: suicide, self_harm, harm_to_others, abuse)

Path naming note. In the /classify response body, corroboration lives at detail.corroboration.<axis>.strength (key is strength). The watchlist schema exposes the same data as risks.<axis>.corroboration.level — a flattened path with level as the key. Watchlist paths are a Console-side naming vocabulary, not a direct mirror of the Ocular response shape.

  • operator — for numeric paths: >=, >, <=, <. For enum paths: ==, !=, in, not in.

  • value — a number (numeric ops), a string (== / !=), or an array of strings (in / not in). Enum values are validated against the field's domain at rule-write time.
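
The matching semantics (every requiresAll condition, plus at least one requiresAny condition when that array is non-empty) can be sketched for evaluation outside Console. This is not Console's implementation. It assumes the session has already been flattened into a dict keyed by watchlist code paths, and that a field missing from that dict never matches. condition_matches and rule_matches are hypothetical names.

```python
import operator
from typing import Any, Mapping

NUMERIC_OPS = {">=": operator.ge, ">": operator.gt, "<=": operator.le, "<": operator.lt}
ENUM_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "in": lambda actual, allowed: actual in allowed,
    "not in": lambda actual, allowed: actual not in allowed,
}


def condition_matches(cond: Mapping[str, Any], fields: Mapping[str, Any]) -> bool:
    """Evaluate one {code, operator, value} condition against flattened fields."""
    code = cond["code"]
    if code not in fields:
        return False  # assumption: an absent field never matches
    op = cond["operator"]
    compare = NUMERIC_OPS.get(op) or ENUM_OPS.get(op)
    if compare is None:
        raise ValueError(f"unsupported operator: {op!r}")
    return bool(compare(fields[code], cond["value"]))


def rule_matches(rule: Mapping[str, Any], fields: Mapping[str, Any]) -> bool:
    """True when all of requiresAll match and, if present, any of requiresAny."""
    conditions = rule["conditions"]
    all_of = conditions.get("requiresAll", [])
    any_of = conditions.get("requiresAny", [])
    if not all_of and not any_of:
        raise ValueError("a rule with both arrays empty would match every session")
    if not all(condition_matches(c, fields) for c in all_of):
        return False
    return not any_of or any(condition_matches(c, fields) for c in any_of)
```

For the high-confidence-suicide rule shown earlier, a fields dict with risks.suicide.level set to "high", risks.suicide.corroboration.level set to "limited", and verdict set to "danger" would match. Changing verdict to "watch" would not, because the single requiresAny condition fails.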

Scopes and dedup

  • session — evaluate once per (watchlist, session).
  • message — evaluate once per turn in trajectory[]. Only verdict varies per-turn; everything else resolves to the session-level value.
  • user — evaluate once per (watchlist, user) within timeWindow seconds. Suppresses repeat pages for the same person; the first match in the window wins.

What's not supported (by design)

  • Raw head-score conditions. Watchlists subscribe to outcomes, not to model internals. Conditions like code: "USER_SUICIDAL_IDEATION" or code: "signal_0042" are rejected at write-time. If you need per-head rules, request detail: true on /classify and run your own logic in your app — don't push head semantics into the alerting layer.

Edit watchlists in the UI at /watchlists, or drive the /api/watchlists endpoint directly (required when provisioning rules from CI). All mutating verbs require CONSOLE_MODE=full.

Verb    Path                           Purpose
GET     /api/watchlists                List active watchlists.
GET     /api/watchlists?format=export  Portable export: canonical fields[] list + operator/domain catalog + all rules. Use this to version-control rules or feed your own engine.
POST    /api/watchlists                Upsert a single rule. Body: {id, name, scope, conditions, webhookUrl?, enabled?, timeWindow?}.
PUT     /api/watchlists                Bulk import. Body: {watchlists: [...]}. Validates all rows up-front; rejects the whole batch on any error so partial imports don't silently drop rules.
PATCH   /api/watchlists                Toggle enabled. Body: {id, enabled: boolean}.
DELETE  /api/watchlists                Remove a rule. Body: {id}.
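
For CI provisioning, the bulk import is a single PUT. The sketch below builds that request and applies one client-side pre-check (no rule with both condition arrays empty) so an obviously invalid file fails before it reaches Console. build_bulk_import is a hypothetical helper and the base URL assumes the default port. Console's own validation remains authoritative.

```python
import json
import urllib.request
from typing import List


def build_bulk_import(
    rules: List[dict],
    base_url: str = "http://localhost:3950",  # assumption: default Console port
) -> urllib.request.Request:
    """Build the PUT /api/watchlists request for a list of rules."""
    for rule in rules:
        conditions = rule.get("conditions", {})
        if not conditions.get("requiresAll") and not conditions.get("requiresAny"):
            # Such a rule would match every session; fail early instead of sending it.
            raise ValueError(f"rule {rule.get('id')!r} has no conditions")
    return urllib.request.Request(
        f"{base_url}/api/watchlists",
        data=json.dumps({"watchlists": rules}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Sending it is a matter of passing the returned request to urllib.request.urlopen. Because the import is all-or-nothing, a non-2xx response means none of the rules in the file were applied.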

Webhook delivery

When a session matches, Console POSTs to the rule's webhookUrl with:

{
  "event": "watchlist_match",
  "timestamp": "2026-04-19T12:34:56.789Z",
  "watchlist": "High-risk user watch",
  "session_id": "conv-42",
  "user_id": "u-123",
  "agent_id": "bot-main",
  "crisis_score": 0.847,
  "matched_scores": { "...": "per-axis scores at the time of the match" }
}

crisis_score is Console's internal proxy for "how bad is this?" — computed from the max user-side risk score. matched_scores carries the per-axis snapshot (risks.suicide.score, etc.) used by the rule at match time, so the receiving side has enough context to act without calling back to Ocular.

Webhooks are delivered from an in-process outbox with at-least-once semantics: if your endpoint returns non-2xx, Console retries with exponential backoff (up to ~1 hour before giving up). Duplicate matches within ~60 seconds for the same session + watchlist are deduplicated.
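
At-least-once delivery means the receiving side may see the same match more than once, so the handler should be idempotent. The sketch below shows one way to do that. should_process is a hypothetical name, and the dedup key assumes that a retried delivery repeats the original body unchanged, timestamp included, which the text above does not guarantee.

```python
from typing import Mapping, Set, Tuple

DedupKey = Tuple[str, str, str]  # (watchlist, session_id, timestamp)


def should_process(payload: Mapping[str, object], seen: Set[DedupKey]) -> bool:
    """Return True the first time a given watchlist_match is seen, False on repeats."""
    if payload.get("event") != "watchlist_match":
        return False  # ignore anything that is not a match event
    # Assumption: a retry carries the same watchlist, session_id and timestamp.
    key = (
        str(payload["watchlist"]),
        str(payload["session_id"]),
        str(payload["timestamp"]),
    )
    if key in seen:
        return False
    seen.add(key)
    return True
```

In a real receiver the seen set would live in durable storage, and the endpoint would return 2xx for repeats as well so Console stops retrying.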


Exporting watchlists for your own pipeline

Console is great for prototyping, but at scale you probably want to run rules in your own infrastructure. GET /api/watchlists?format=export returns a portable JSON representation:

curl 'http://localhost:3950/api/watchlists?format=export' > watchlists.json

This can be version-controlled and evaluated in your own rules engine. Console remains useful for session inspection even after you move rule evaluation out.


Retention

Console is not a system of record. Two retention axes apply; whichever evicts first wins.

  • Time (RETENTION_DAYS, default 7). Hourly sweep deletes anything older than the window — sessions, audit log, watchlist matches, delivered webhook outbox.
  • Size (RETENTION_MAX_GB, default unlimited). 60s watchdog evicts oldest sessions FIFO when the .db + WAL file exceeds the cap, then truncates the WAL and runs incremental_vacuum so the file actually shrinks on disk. Use this as a hard stop below your volume's quota so Console ages out old data instead of hitting disk full and losing writes.

Both are configurable at runtime through Settings → System or PATCH /api/settings ({retentionDays, retentionMaxGB}); the DB-row override sticks across restarts.

If you want a durable record of every scoring event, read the /classify response in your own app and persist it however you normally persist application data. Don't treat Console as a warehouse.
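
As an illustration of keeping your own record, the sketch below stores each /classify response verbatim in a local SQLite table. SQLite is only a stand-in for whatever store your app already uses, and the table layout plus the open_store and record_event names are hypothetical.

```python
import json
import sqlite3
import time
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; the schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scoring_events (
               session_id TEXT NOT NULL,
               user_id    TEXT NOT NULL,
               scored_at  INTEGER NOT NULL,
               verdict    TEXT,
               response   TEXT NOT NULL
           )"""
    )
    return conn


def record_event(
    conn: sqlite3.Connection,
    session_id: str,
    user_id: str,
    response: dict,
    scored_at: Optional[int] = None,
) -> None:
    """Persist one /classify response verbatim, with verdict promoted for querying."""
    if scored_at is None:
        scored_at = int(time.time())
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO scoring_events VALUES (?, ?, ?, ?, ?)",
            (session_id, user_id, scored_at, response.get("verdict"), json.dumps(response)),
        )
```

Storing the full response as JSON keeps the record independent of Console's retention window. Promoting verdict into its own column is just a convenience for filtering.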


When to skip Console entirely

Run Ocular without Console if:

  • You already have a dashboard or pipeline you want to feed scoring signals into.
  • Your integration is fully programmatic — your app's code calls /classify, stores results, and handles alerting.
  • You want the minimum possible attack surface on the deploy box.

In that case: skip --profile console when running compose. The Console image is still in the tarball but nothing starts. Zero cost.