Console
Console is a lightweight browser UI that surfaces the scored conversations
Ocular processes, lets you configure watchlists, and fires webhooks when
watchlisted users or agents cross thresholds. It's optional — if your
integration is purely programmatic (your app calls /classify, stores
results in your own pipeline), you don't need Console at all.
Capabilities:
- Browser UI for session inspection — list, filter by verdict, drill into per-turn signals.
- Watchlist rules + webhook delivery on match (configurable via UI or
  /api/watchlists).
- Rule export (GET /api/watchlists?format=export) for evaluation outside Console.
Architecture
- CPU-only. No GPU needed. Runs alongside Ocular in the same compose stack.
- SQLite-backed. One file, configurable retention (default 7 days — Console is a convenience, not long-term storage).
- Receives pushes from Ocular. When your app passes log: true to /classify (along with session_id + user_id), Ocular calls Console's /api/ingest after scoring. The push is intentionally lossy: if Console is down, Ocular still scores; you just don't see the session in the UI. Calls without log: true are never pushed.
- No authentication at the network layer. Same trust model as Ocular: deploy it behind your own reverse proxy with SSO, a VPN, or a Cloudflare Tunnel + Access app (see deployment.md §7).
Shared-GPU contention with your app
Console has two ingest paths — only one of them contends with your app for GPU:
- Conduit push (log: true from Ocular). Your app calls /classify, Ocular scores on the GPU, then pushes the already-scored result to Console. No re-scoring happens on Console's side, so this path adds no GPU work beyond the /classify call your app already made. Runs freely alongside your app's inline traffic.
- Direct /api/ingest without a pre-scored body. When the caller posts messages[] but no ocular body (sandbox testing, log backfill against unscored transcripts), Console calls Ocular itself to score. This path is what contends with your app's GPU.
The direct-ingest path runs in trajectory mode (per-turn) and holds the
GPU serially; your app's inline /classify calls batch-coalesce
separately. They can't interleave — every in-flight trajectory ingest
blocks the batch queue from firing until it finishes.
Empirical impact on a 24 GB datacenter-class GPU: two concurrent direct ingests running against the same box roughly halve your app's inline scoring throughput and roughly double its p50 latency while the ingests are in flight. Larger cards see a smaller relative effect; the shape is the same.
Three mitigations for the direct-ingest case, least to most change:
- Lower direct-ingest concurrency. If you're driving unscored ingest through a batch job or backfill, pace it at 1 concurrent request instead of multiple. Trades drain time for app-latency stability.
- Split Console onto its own box. Ocular on box A serves your app; Ocular on box B serves Console's direct ingest. Console (CPU only) lives on either box or a third. See integration-patterns.md §"Pattern 6 — Fleet (scaled)" for the shape.
- Delegate Console's scoring to a remote endpoint. Point Console's OCULAR_URL env var at a remote Ocular endpoint (a separate deployment, or one you host elsewhere) instead of the local http://ocular:8080/classify. Your on-prem GPU is reserved for app traffic; Console's direct-ingest scoring goes elsewhere at remote-call latency.
If Console only receives conduit pushes (no direct-ingest load), none of the above applies — conduit ingests never hit the GPU.
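The first mitigation (pacing unscored ingest at one request at a time) can be sketched as a plain sequential loop. This is a minimal illustration, not part of Console: `backfill_serially`, `post_json`, and the injectable `post` parameter are hypothetical names, and the URL assumes the default port described in this document.

```python
import json
import urllib.request
from typing import Callable, Iterable, List

INGEST_URL = "http://localhost:3950/api/ingest"  # default Console port from this doc


def post_json(url: str, body: dict) -> int:
    """POST a JSON body and return the HTTP status (urlopen raises on 4xx/5xx)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def backfill_serially(transcripts: Iterable[dict],
                      post: Callable[[str, dict], int] = post_json) -> List[int]:
    """Send unscored transcripts one at a time (concurrency 1)."""
    statuses = []
    for body in transcripts:
        # Each call returns before the next one starts, so this client never
        # has more than one direct ingest in flight.
        statuses.append(post(INGEST_URL, body))
    return statuses
```

Passing a different `post` callable makes the loop easy to dry-run without a live Console.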
Starting Console
Console is included in the platform tarball and runs with the same compose
file as Ocular, under a console profile:
docker compose -f customer-compose.yml --profile console up -d
Skip --profile console to run Ocular without Console.
Console listens on port 3950 by default (change by editing
customer-compose.yml). Sanity check:
curl -fsS http://localhost:3950/api/health | jq .
# → {"status":"ok","sessions":0}
curl -fsS 'http://localhost:3950/api/health?deep=true' | jq .
# → {"status":"ok","db":"ok","ocular":"ok","sessions":0}
The deep=true variant also verifies Console can talk to Ocular, which makes it a
good first-deploy sanity check.
The UI
Top-level pages (all at http://<console-host>:3950/):
| Path | Purpose |
|---|---|
| / | Dashboard: recent sessions, current crisis counts, quick filters. |
| /sessions | All sessions, sortable + filterable by crisis level, user, agent, date. |
| /users | Users who've produced sessions. Drill into per-user history. |
| /users/[id] | Single user's session history + crisis trajectory over time. |
| /agents | AI agents (by agent_id). |
| /agents/[id] | Single agent's behaviour metrics. Useful for "is this agent drifting?" |
| /watchlists | Configure watchlist rules + webhooks. |
| /logs | Audit log of watchlist matches and webhook deliveries. |
| /performance | Throughput + queue depth + recent latency histograms. |
| /settings | Retention window, webhook defaults, CSV export. |
| /diagnostics | Package Console's scored data into an NDJSON snapshot for NOPE to analyse. Full write-up in diagnostics.md. |
Keyboard shortcut: ? on any page shows keybindings.
Getting sessions into Console
Two ways.
Conduit push (the primary path)
Your app calls Ocular's /classify with session_id, user_id, and
log: true (plus optionally agent_id and messages):
{
"messages": [{"role": "user", "content": "I don't know how much longer..."}],
"session_id": "conv-42",
"user_id": "u-1234",
"agent_id": "bot-main",
"log": true
}
Ocular scores the message and sends a fire-and-forget POST to Console's
/api/ingest with the scored result (plus the messages, if you provided
them — Console uses them to render transcripts). The session appears in
Console within a few hundred milliseconds.
The push is gated on log: true — calls without it are never pushed,
regardless of whether session_id + user_id are present. This is
opt-in by design: production traffic at scale is unlikely to want
every scored call mirrored into Console.
On agent_id. Optional. Include it if you want the session to
appear under /agents/<agent_id> in Console — that's how the agent
drill-down, per-agent metrics, and agent-scoped watchlists find the
session. Sessions without agent_id are still logged and searchable
but don't participate in agent-breakdown views.
If OCULAR_CONSOLE_URL is unset on the Ocular container or Console is
down, the push is silently skipped. Ocular still returns the score to
your caller — the conduit is a side-effect, not a dependency.
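As an illustration of the opt-in behaviour described above, here is a small sketch of a request builder that only adds log when the caller asks for it. The helper name `build_classify_body` is hypothetical; only the field names come from this document.

```python
from typing import Dict, List, Optional


def build_classify_body(messages: List[Dict[str, str]],
                        session_id: str,
                        user_id: str,
                        agent_id: Optional[str] = None,
                        log: bool = False) -> dict:
    """Build a /classify request body; mirroring into Console is opt-in."""
    body = {"messages": messages, "session_id": session_id, "user_id": user_id}
    if agent_id is not None:
        # Only sessions carrying agent_id show up in the per-agent views.
        body["agent_id"] = agent_id
    if log:
        # Without log: true, Ocular never pushes the result to Console.
        body["log"] = True
    return body
```

Defaulting `log` to False keeps bulk production traffic out of Console unless a call site explicitly requests mirroring.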
Direct POST to /api/ingest
You can also ingest externally-scored sessions directly — useful for backfilling from logs:
curl -s -X POST http://localhost:3950/api/ingest \
-H 'Content-Type: application/json' \
-d '{
"session_id": "external-1",
"user_id": "u-9",
"agent_id": "manual",
"messages": [{"role": "user", "content": "..."}],
"ocular": { ... full /classify v1 response shape ... }
}'
Or, to let Console call Ocular itself (useful for sandbox testing; Console
will score the messages through its configured OCULAR_URL):
curl -s -X POST http://localhost:3950/api/ingest \
-H 'Content-Type: application/json' \
-d '{
"session_id": "external-2",
"user_id": "u-9",
"messages": [{"role": "user", "content": "..."}]
}'
The pre-scored form is useful for backfilling from logs, or if you're scoring with a model other than Ocular but want to inspect results in the Console UI.
Concurrent /api/ingest calls are coalesced into one fsync per event-loop
tick by an in-process group-commit coordinator, so you don't need to batch
yourself to get good throughput. If you exceed the queue's high watermark
for a sustained period, Console returns 429 Too Many Requests with a
Retry-After header. Honor it (or switch to /api/ingest/batch below) instead
of hot-looping the retry.
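A client that honors the 429 response might look like the sketch below. It assumes Retry-After carries a number of seconds (the HTTP-date form is not parsed and falls back to a default), and `send` is a hypothetical callable returning `(status, headers)` that you would wire to your own HTTP client.

```python
import time
from typing import Callable, Mapping, Tuple


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Parse a delta-seconds Retry-After value, falling back to `default`."""
    raw = headers.get("Retry-After", "")
    try:
        return max(0.0, float(raw))
    except ValueError:
        return default  # header missing, or in a form this sketch does not parse


def ingest_with_backoff(send: Callable[[dict], Tuple[int, Mapping[str, str]]],
                        body: dict,
                        max_attempts: int = 5,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send(body); on 429, wait as instructed before trying again."""
    for attempt in range(max_attempts):
        status, headers = send(body)
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            sleep(retry_after_seconds(headers))
    return 429  # still throttled after max_attempts
```

The header lookup is case-sensitive for a plain dict, so normalize header names first if your HTTP client does not do so.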
Batch ingest (POST /api/ingest/batch)
For high-volume backfill or replay where you can buffer sessions client-side, post up to 500 pre-scored sessions per call. They're written in a single SQLite transaction; per-item failures are isolated via SAVEPOINT and reported in the response array, so one bad payload doesn't fail the batch.
curl -s -X POST http://localhost:3950/api/ingest/batch \
-H 'Content-Type: application/json' \
-d '{
"sessions": [
{"session_id": "s1", "user_id": "u1", "ocular": { ... }},
{"session_id": "s2", "user_id": "u2", "ocular": { ... }},
...
]
}'
# → {"count": 2, "succeeded": 2, "failed": 0, "results": [{"ok": true, ...}, ...]}
Pre-scored only: every entry must include ocular. See
deployment.md § "Console at scale" for tunables and sizing.
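For a backfill larger than one call allows, the client has to split its buffer into request bodies of at most 500 sessions. A minimal sketch follows; `batch_bodies` and `failed_indices` are hypothetical helpers, and `failed_indices` additionally assumes the results array is returned in request order, which this document does not state.

```python
from typing import Iterator, List

MAX_BATCH = 500  # per-call limit stated above


def batch_bodies(sessions: List[dict], size: int = MAX_BATCH) -> Iterator[dict]:
    """Yield /api/ingest/batch request bodies of at most `size` sessions each."""
    unscored = [s.get("session_id") for s in sessions if "ocular" not in s]
    if unscored:
        # The batch endpoint is pre-scored only, so fail before sending anything.
        raise ValueError(f"missing ocular on sessions: {unscored}")
    for start in range(0, len(sessions), size):
        yield {"sessions": sessions[start:start + size]}


def failed_indices(response: dict) -> List[int]:
    """Positions in `results` whose ok flag is not true (assumes request order)."""
    return [i for i, item in enumerate(response.get("results", []))
            if not item.get("ok")]
```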
Session detail view
Clicking a session shows:
- Header. Verdict, subject, imminence, timestamp, user/agent IDs.
- Risk breakdown. Per-axis scores (risks.suicide, risks.harm_to_others, …, ai_concerns.safeguarding_failure, etc.) with level labels drawn from the 5-value domain (minimal → critical).
- Signals. The ranked, screening-filtered signals[] list: opaque signal_NNNN identifiers with calibrated scores.
- Trajectory. If the session was scored with per_turn: true, a per-turn plot showing how each risk axis evolved across the conversation, with the per-turn verdict transitions highlighted. The caller sets per_turn: true on the original /classify call; the conduit push forwards whatever the caller requested, so without per_turn, Console has no trajectory to render.
- Turns. The conversation itself, with each turn's individual signal summary inline (if trajectory data was included).
- Provider content. If the caller pushed messages alongside the scored result (i.e. used the conduit push with log: true), Console renders transcripts inline from its own DB. Otherwise, if you configured PROVIDER_URL=http://your-app/... in customer.env, Console will fetch the full message content from your app's transcript service for display. Without either, the turn list shows "no content available", because Ocular doesn't store raw message text.
Console's JSON API
Programmatic consumers (verification scripts, dashboards, your own eval pipelines) can read Console's stored sessions without the browser UI.
GET /api/sessions/<session_id>
Full detail for one session.
{
"session_id": "...",
"user_id": "...",
"agent_id": "...",
"scored_at": 1776327408,
"ingested_at": 1776327408,
"message_count": 90,
"turns": [
{"turn": 0, "role": "user", "content": "..."},
{"turn": 1, "role": "assistant", "content": "..."}
],
"ocular": { /* the verbatim /classify response — see api-reference.md */ }
}
ocular is the full /classify body; drill in via ocular.verdict,
ocular.risks.<axis>.score, ocular.fiction, etc. Returns 404 if
session_id is unknown.
GET /api/sessions
Lists stored sessions. Response: { total, sessions: [ ... ] } where each
session is the same shape as the detail endpoint.
| Query param | Values | Default |
|---|---|---|
| verdict | clear \| watch \| danger | (no filter) |
| min_crisis | 0..1 | 0 |
| sort | promoted column name (e.g. crisis_score, scored_at, suicide_risk, max_user_risk, fiction_strength) | crisis_score |
| order | asc \| desc | desc |
| limit | ≤ 200 | 50 |
| offset | ≥ 0 | 0 |
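The query parameters above compose into a URL in the usual way. The sketch below builds one with client-side checks that mirror the table; `sessions_url` and `page_offsets` are hypothetical helper names, and the paging helper simply walks offsets using the total from the list response.

```python
from typing import Optional
from urllib.parse import urlencode

VERDICTS = ("clear", "watch", "danger")


def sessions_url(base: str,
                 verdict: Optional[str] = None,
                 min_crisis: Optional[float] = None,
                 sort: Optional[str] = None,
                 order: Optional[str] = None,
                 limit: Optional[int] = None,
                 offset: Optional[int] = None) -> str:
    """Build a GET /api/sessions URL, omitting parameters left at their default."""
    if verdict is not None and verdict not in VERDICTS:
        raise ValueError(f"verdict must be one of {VERDICTS}")
    if order is not None and order not in ("asc", "desc"):
        raise ValueError("order must be asc or desc")
    if limit is not None and limit > 200:
        raise ValueError("limit is capped at 200")
    if offset is not None and offset < 0:
        raise ValueError("offset must be >= 0")
    candidates = {"verdict": verdict, "min_crisis": min_crisis, "sort": sort,
                  "order": order, "limit": limit, "offset": offset}
    query = urlencode({k: v for k, v in candidates.items() if v is not None})
    return f"{base}/api/sessions" + (f"?{query}" if query else "")


def page_offsets(total: int, limit: int = 50) -> range:
    """Offsets needed to page through `total` sessions, `limit` at a time."""
    return range(0, total, limit)
```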
DELETE /api/sessions/<session_id>
Removes a session row. Retention is usually driven by RETENTION_DAYS;
DELETE is for ad-hoc cleanup.
Watchlists
A watchlist is a rule + a destination. When a matching session comes in, Console records a "match" and, if configured, fires a webhook.
Anatomy of a rule
{
"id": "high-confidence-suicide",
"name": "Suicide risk (high+) with corroboration",
"scope": "session",
"conditions": {
"requiresAll": [
{"code": "risks.suicide.level", "operator": "in", "value": ["high", "critical"]},
{"code": "risks.suicide.corroboration.level", "operator": "!=", "value": "absent"}
],
"requiresAny": [
{"code": "verdict", "operator": "==", "value": "danger"}
]
},
"webhookUrl": "https://alerts.example.com/hook/foo",
"timeWindow": 86400,
"enabled": true
}
A rule is a conditions block (with requiresAll[] + requiresAny[] arrays)
evaluated against the canonical field surface from the Ocular v1 response. A
rule fires when every condition in requiresAll matches AND (when requiresAny
is non-empty) at least one condition in requiresAny matches. A rule with
both arrays empty is rejected at write-time because it would match every session.
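The firing logic can be sketched in a few lines for anyone evaluating exported rules outside Console. This is an illustration of the semantics described above, not Console's implementation: it assumes the session has already been flattened into a dict keyed by condition code, and it treats a missing field as a non-match, which is a choice made for this sketch.

```python
from typing import Any, Callable, Dict

OPS: Dict[str, Callable[[Any, Any], bool]] = {
    ">=": lambda a, b: a >= b,
    ">": lambda a, b: a > b,
    "<=": lambda a, b: a <= b,
    "<": lambda a, b: a < b,
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    "in": lambda a, b: a in b,
    "not in": lambda a, b: a not in b,
}


def condition_matches(cond: dict, fields: Dict[str, Any]) -> bool:
    """Evaluate one {code, operator, value} condition against flat fields."""
    if cond["code"] not in fields:
        return False  # assumption for this sketch: missing data never matches
    return OPS[cond["operator"]](fields[cond["code"]], cond["value"])


def rule_matches(rule: dict, fields: Dict[str, Any]) -> bool:
    """True when all of requiresAll match and, if present, any of requiresAny."""
    conditions = rule["conditions"]
    all_of = conditions.get("requiresAll", [])
    any_of = conditions.get("requiresAny", [])
    if not all_of and not any_of:
        raise ValueError("a rule with no conditions would match every session")
    if not all(condition_matches(c, fields) for c in all_of):
        return False
    return not any_of or any(condition_matches(c, fields) for c in any_of)
```

With the example rule above, a session whose suicide level is high, whose corroboration is moderate, and whose verdict is danger matches; the same session with verdict watch does not, because requiresAny is non-empty and unsatisfied.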
Naming convention. Rule metadata keys (webhookUrl, timeWindow, scope, enabled) are camelCase. Condition code paths and the webhook body payload (session_id, crisis_score, matched_scores) are snake_case, following the naming style of the Ocular response surface.
Condition shape
Each condition is {code, operator, value}:
- code: a namespaced path into the Ocular response. Enumerated, not free-form. Ask GET /api/watchlists?format=export for the full list; a selection:
| Path | Type | Values |
|---|---|---|
| verdict | enum | clear \| watch \| danger |
| subject | enum | self \| other \| unknown |
| risks.<axis>.score | number | 0..1. Axes: suicide, self_harm, harm_to_others, abuse, sexual_violence, exploitation, stalking, self_neglect |
| risks.<axis>.level | enum | minimal \| low \| moderate \| high \| critical |
| ai_concerns.<axis>.score | number | 0..1. Axes: harm_provision, emotional_failure, manipulation, safeguarding_failure |
| ai_concerns.<axis>.level | enum | same 5-value domain |
| imminence.score / imminence.level | number / enum | same |
| fiction / authenticity | number | 0..1 |
| risks.<axis>.corroboration.level | enum | absent \| limited \| moderate \| strong (4 user axes carry corroboration: suicide, self_harm, harm_to_others, abuse) |
Path naming note. In the /classify response body, corroboration lives at detail.corroboration.<axis>.strength (key is strength). The watchlist schema exposes the same data as risks.<axis>.corroboration.level, a flattened path with level as the key. Watchlist paths are a Console-side naming vocabulary, not a direct mirror of the Ocular response shape.
- operator: for numeric paths: >=, >, <=, <. For enum paths: ==, !=, in, not in.
- value: a number (numeric ops), a string (== / !=), or an array of strings (in / not in). Enum values are validated against the field's domain at rule-write time.
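If you evaluate rules in your own code, the path naming note means corroboration has to be renamed on the way in. The sketch below shows one way to do that mapping, assuming each detail.corroboration.<axis> entry is an object with a strength key as described; `corroboration_fields` is a hypothetical helper, not a Console API.

```python
from typing import Dict

# The four user axes this document says carry corroboration.
CORROBORATED_AXES = ("suicide", "self_harm", "harm_to_others", "abuse")


def corroboration_fields(classify_response: dict) -> Dict[str, str]:
    """Map detail.corroboration.<axis>.strength to the watchlist path names."""
    detail = classify_response.get("detail", {}).get("corroboration", {})
    return {
        f"risks.{axis}.corroboration.level": detail[axis]["strength"]
        for axis in CORROBORATED_AXES
        if axis in detail and "strength" in detail[axis]
    }
```

Axes that are absent from the response are simply left out of the result rather than defaulted, so a downstream rule engine can decide how to treat missing data.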
Scopes and dedup
- session: evaluate once per (watchlist, session).
- message: evaluate once per turn in trajectory[]. Only verdict varies per-turn; everything else resolves to the session-level value.
- user: evaluate once per (watchlist, user) within timeWindow seconds. Suppresses repeat pages for the same person; the first match in the window wins.
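The user scope's suppression can be pictured as a small in-memory gate keyed by (watchlist, user). This sketch assumes a fixed window that starts at the first match and is not extended by suppressed matches; the document does not say whether Console's window is fixed or sliding, and `UserScopeDedup` is a hypothetical name.

```python
from typing import Dict, Tuple


class UserScopeDedup:
    """Allow one match per (watchlist, user) within a time window."""

    def __init__(self) -> None:
        self._last_fired: Dict[Tuple[str, str], float] = {}

    def should_fire(self, watchlist_id: str, user_id: str,
                    now: float, time_window: float) -> bool:
        key = (watchlist_id, user_id)
        last = self._last_fired.get(key)
        if last is not None and now - last < time_window:
            return False  # still inside the window opened by the first match
        self._last_fired[key] = now
        return True
```

With timeWindow set to 86400 as in the example rule, a second match for the same user an hour later is suppressed, while a match for a different user fires immediately.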
What's not supported (by design)
- Raw head-score conditions. Watchlists subscribe to outcomes, not to model internals. Conditions like code: "USER_SUICIDAL_IDEATION" or code: "signal_0042" are rejected at write-time. If you need per-head rules, request detail: true on /classify and run your own logic in your app; don't push head semantics into the alerting layer.
Edit watchlists in the UI at /watchlists, or drive the /api/watchlists
endpoint directly (required when provisioning rules from CI). All
mutating verbs require CONSOLE_MODE=full.
| Verb | Path | Purpose |
|---|---|---|
| GET | /api/watchlists | List active watchlists. |
| GET | /api/watchlists?format=export | Portable export: canonical fields[] list + operator/domain catalog + all rules. Use this to version-control rules or feed your own engine. |
| POST | /api/watchlists | Upsert a single rule. Body: {id, name, scope, conditions, webhookUrl?, enabled?, timeWindow?}. |
| PUT | /api/watchlists | Bulk import. Body: {watchlists: [...]}. Validates all rows up-front; rejects the whole batch on any error so partial imports don't silently drop rules. |
| PATCH | /api/watchlists | Toggle enabled. Body: {id, enabled: boolean}. |
| DELETE | /api/watchlists | Remove a rule. Body: {id}. |
Webhook delivery
When a session matches, Console POSTs to the rule's webhookUrl with:
{
"event": "watchlist_match",
"timestamp": "2026-04-19T12:34:56.789Z",
"watchlist": "High-risk user watch",
"session_id": "conv-42",
"user_id": "u-123",
"agent_id": "bot-main",
"crisis_score": 0.847,
"matched_scores": { "...": "per-axis scores at the time of the match" }
}
crisis_score is Console's internal proxy for "how bad is this?", computed
from the max user-side risk score. matched_scores carries the per-axis
snapshot (risks.suicide.score, etc.) used by the rule at match time, so
the receiving side has enough context to act without calling back to
Ocular.
Webhooks are delivered from an in-process outbox with at-least-once semantics: if your endpoint returns non-2xx, Console retries with exponential backoff (up to ~1 hour before giving up). Duplicate matches within ~60 seconds for the same session + watchlist are deduplicated.
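Because delivery is at-least-once, a receiver may see the same match more than once and should handle it idempotently. The sketch below keeps an in-memory set of seen deliveries; it assumes a retry resends the same body (so watchlist, session_id, and timestamp identify a delivery), which this document does not confirm, and `WebhookReceiver` is a hypothetical class rather than anything Console ships.

```python
from typing import List, Set, Tuple


class WebhookReceiver:
    """Idempotent handler for watchlist_match webhooks (in-memory sketch)."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str, str]] = set()
        self.alerts: List[dict] = []

    def handle(self, payload: dict) -> int:
        """Return the HTTP status to send back to Console."""
        if payload.get("event") != "watchlist_match":
            return 400
        key = (payload["watchlist"], payload["session_id"], payload["timestamp"])
        if key in self._seen:
            return 200  # acknowledge the duplicate so it is not retried again
        self._seen.add(key)
        self.alerts.append(payload)  # replace with your paging or ticketing call
        return 200
```

A production receiver would persist the seen keys and expire them after the retry horizon (about an hour, per the text above) instead of holding them in process memory.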
Exporting watchlists for your own pipeline
Console is great for prototyping, but at scale you probably want to run
rules in your own infrastructure. GET /api/watchlists?format=export
returns a portable JSON representation:
curl 'http://localhost:3950/api/watchlists?format=export' > watchlists.json
The exported file can be version-controlled and evaluated in your own rules engine. Console remains useful for session inspection even after you move rule evaluation out.
Retention
Console is not a system of record. Two retention axes apply; whichever evicts first wins.
- Time (RETENTION_DAYS, default 7). Hourly sweep deletes anything older than the window: sessions, audit log, watchlist matches, delivered webhook outbox.
- Size (RETENTION_MAX_GB, default unlimited). 60s watchdog evicts oldest sessions FIFO when the .db + WAL file exceeds the cap, then truncates the WAL and runs incremental_vacuum so the file actually shrinks on disk. Use this as a hard stop below your volume's quota so Console ages out old data instead of hitting disk full and losing writes.
Both are configurable at runtime through Settings → System or
PATCH /api/settings ({retentionDays, retentionMaxGB}); the DB-row
override sticks across restarts.
If you want a durable record of every scoring event, read the /classify
response in your own app and persist it however you normally persist
application data. Don't treat Console as a warehouse.
When to skip Console entirely
Run Ocular without Console if:
- You already have a dashboard or pipeline you want to feed scoring signals into.
- Your integration is fully programmatic: your app's code calls /classify, stores results, and handles alerting.
- You want the minimum possible attack surface on the deploy box.
In that case: skip --profile console when running compose. The Console
image is still in the tarball but nothing starts. Zero cost.