
Interpreting risk scores

Ocular returns a lot of fields per call. This document explains how they compose and, more importantly, which ones to use for which decision. If you're only going to read one page of the docs, this is the one.


The short version

  • Use salience (a continuous score in [0, 1]) for gating decisions. It's the authoritative aggregate — fusion-layer output, fiction-aware, corroboration-aware. Pick the cutoff that fits your downstream action.
  • Published reference band cuts: T_WATCH=0.30, T_DANGER=0.60. These are the defaults that match the band display in the NOPE dashboard — useful starting points, not a contract. Tune your own.
  • Use signals.user.<axis>.level and signals.ai.<axis>.level ("minimal" ... "critical") for per-axis UI labels and dashboards.
  • Use heads[] for "what fired" explanation UIs — already filtered to Ocular's screening threshold, sorted by severity.
  • Don't threshold on per-axis score yourself. heads[] is already threshold-filtered, and salience already knows about context. The raw per-axis magnitude is a sort key, not a boolean.
  • Don't reach into detail.scores[code] as a decision surface. Those per-head values saturate on short inputs, fire on fiction as easily as on crisis, and don't reason about speaker attribution. The top-level fields are Ocular's interpretation of the raw values — trust them over the raw values.

The hierarchy

Ocular produces three layers of output. Each layer builds on the ones below:

   Layer 3:  salience, signals.user.*.level, signals.ai.*.level, heads[], imminence.level
             ↑
             Context-aware fusion (fiction, corroboration, attribution)
             ↑
   Layer 2:  signals.user.suicide.score, signals.ai.safeguarding_failure.score, ...
             ↑
             Axis aggregation
             ↑
   Layer 1:  detail.scores[USER_SUICIDE_HEAD_A], ... (detail=true)
             ↑
             Raw per-head probabilities

Higher layers reason about context: fiction framing, cross-head corroboration, speaker attribution. Lower layers don't. Most customers should consume Layer 3 only. Layer 2 is for dashboards. Layer 1 (only visible via detail=true) is for audit trails and diagnostics via support.
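
As a minimal sketch of "consume Layer 3 only", the helper below projects an already-decoded /classify response down to the Layer 3 fields listed in the diagram. It assumes the response is a Python dict whose nesting mirrors the dotted field paths used on this page. The function name and the sample payload are hypothetical, and the element shape of heads[] is left opaque because this page doesn't specify it.

```python
from typing import Any


def layer3_view(response: dict[str, Any]) -> dict[str, Any]:
    """Project a decoded /classify response down to its Layer 3 fields."""
    signals = response.get("signals", {})
    return {
        "salience": response["salience"],
        # Keep the level labels; leave the Layer 2 numeric scores behind.
        "user_levels": {k: v["level"] for k, v in signals.get("user", {}).items()},
        "ai_levels": {k: v["level"] for k, v in signals.get("ai", {}).items()},
        "heads": response.get("heads", []),
        "imminence_level": response.get("imminence", {}).get("level"),
    }


# Hand-written payload for illustration only, not real Ocular output.
sample = {
    "salience": 0.12,
    "signals": {
        "user": {"self_harm": {"level": "high", "score": 0.31}},
        "ai": {"harm_provision": {"level": "minimal", "score": 0.01}},
    },
    "heads": [],
    "imminence": {"level": "minimal", "score": 0.0},
}

print(layer3_view(sample))
```

Passing only this view to a rules engine keeps Layer 1 and Layer 2 values out of decision code by construction.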


salience — the authoritative classification

A continuous score in [0, 1]. This is the field your rules engine keys off. Higher means more concerning. salience describes what Ocular classified; the policy you attach to each cutoff (alerts, UI treatments, gating) is yours to decide.

Reference band cuts

Band | Range | What Ocular classified
Danger | salience ≥ 0.60 | At least one user-side or AI-side axis at high or critical severity. Signals passed Ocular's fiction-aware severity thresholds.
Watch | 0.30 ≤ salience < 0.60 | At least one axis at moderate severity, below danger thresholds.
Clear | salience < 0.30 | No axis at moderate or above, OR signals were descoped by fiction framing.

The cuts (T_WATCH=0.30, T_DANGER=0.60) are the published references — they're what the NOPE dashboard and Console use as the default band view. They're guidance, not a contract. Customers commonly tune their own cutoffs against their own labelled data; see Tuning against your own baseline below.
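
A minimal sketch of mapping salience to a reference band follows. The constants are the published cuts; the function name and the overridable keyword arguments are illustrative choices, not part of Ocular's API.

```python
T_WATCH = 0.30   # published reference cut
T_DANGER = 0.60  # published reference cut


def reference_band(salience: float,
                   t_watch: float = T_WATCH,
                   t_danger: float = T_DANGER) -> str:
    """Map a salience value in [0, 1] to "clear", "watch", or "danger"."""
    if not 0.0 <= salience <= 1.0:
        raise ValueError(f"salience must be in [0, 1], got {salience}")
    if salience >= t_danger:
        return "danger"
    if salience >= t_watch:
        return "watch"
    return "clear"


print(reference_band(0.45))                # watch
print(reference_band(0.25, t_watch=0.20))  # watch, with a tuned lower cut
```

Keeping the cuts as parameters makes it easy to swap in your own tuned values without touching the call sites.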

Salience is derived by Ocular's fusion layer from:

  • Direct indicators on each risk axis (suicide, self-harm, harm to others, abuse, etc.).
  • Fiction framing (is this a roleplay?) — see fiction + authenticity scalars.
  • Subject attribution (is the speaker talking about themselves, or reporting someone else?)
  • Cross-signal corroboration (multiple orthogonal indicators co-firing weigh more than any one alone).
  • Whether the signal is a single mention or woven through the conversation.

Why continuous? A single scalar lets each consumer choose the operating point that fits their downstream action: a high-recall crisis-line router might trigger at salience ≥ 0.20, while a precision-sensitive moderation queue might wait for salience ≥ 0.80. Below-threshold-but-present signals and fiction-descoped signals both live in the lower range without losing information: the fiction scalar reports "we saw signals but descoped them", and per-axis scores still carry the magnitudes so you can drill in for the "why".


Per-axis UI: signals.user.<axis>.level and signals.ai.<axis>.level

Each of the eight user-side axes (and the four AI-side axes) carries a level label alongside its numeric score. The level domain is:

Value | Score range
minimal | < 0.05
low | [0.05, 0.12)
moderate | [0.12, 0.25)
high | [0.25, 0.45)
critical | ≥ 0.45

Use these for colour-coded dashboards or per-axis filtering. They're thresholded directly from the raw score, so a label changes only when the score crosses a bucket boundary.

Levels describe score magnitude, not clinical severity. "critical" is the top bucket of Ocular's score range (≥ 0.45) — not a clinical assessment that the speaker is in a critical condition. Treat the labels as magnitude buckets for UI colouring and filtering.
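
To make the half-open intervals in the table concrete, here is a small sketch that reproduces the bucketing. The response already carries level, so prefer that field in practice; this helper (a hypothetical name) is only useful for things like re-bucketing stored scores or checking boundary behaviour.

```python
from bisect import bisect_right

# Lower edges of "low", "moderate", "high", "critical", from the table above.
LEVEL_EDGES = [0.05, 0.12, 0.25, 0.45]
LEVEL_NAMES = ["minimal", "low", "moderate", "high", "critical"]


def level_for(score: float) -> str:
    """Return the magnitude bucket for a per-axis score."""
    # bisect_right counts how many edges are <= score, which is exactly
    # the bucket index when every interval is closed on the left.
    return LEVEL_NAMES[bisect_right(LEVEL_EDGES, score)]


print(level_for(0.05))  # low
print(level_for(0.42))  # high
print(level_for(0.45))  # critical
```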

Important: a single axis showing high or critical does NOT by itself mean salience will cross the danger threshold. Salience accounts for context (fiction framing, subject attribution, cross-signal corroboration) that raw axis scores don't. A roleplay scene depicting self-harm can light up signals.user.self_harm.level="high" while salience correctly stays low. Defer to salience for gating; use axis levels for explanation and drill-down.


The 8 user-side axes

Each axis appears under signals.user with a {level, score} pair.

Axis naming convention. These names describe the signal category Ocular detects in conversational content — the kinds of linguistic patterns that co-occur with each risk domain. They are not clinical assessments of the speaker's condition. signals.user.suicide.score = 0.73 does not mean "the speaker is at 73% suicide risk"; it means linguistic markers in Ocular's suicide-related category fired at that strength.

Axis (signals.user.<key>) | What it detects
suicide | Linguistic markers associated with suicide-related content: expressions of ideation, plan, intent, or capability.
self_harm | Self-injury markers (cutting, burning, etc.). Distinct from suicide.
harm_to_others | Markers of intent or ideation to harm specific others.
abuse | Abuse disclosure signals (domestic violence, coercive control, financial, sexual).
sexual_violence | Disclosure signals of sexual assault or coercion.
exploitation | Trafficking, grooming, child-exploitation indicators.
stalking | Markers of stalking victimisation or perpetration in the conversation.
self_neglect | Markers of impaired functioning affecting safety (hygiene, nutrition, medication non-adherence).

Each level field is one of "minimal", "low", "moderate", "high", "critical".

How to use them: build per-axis dashboards if you care about the breakdown; otherwise salience summarises across all 8.
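
For a per-axis dashboard, one possible sketch is to filter on the level label and use score only as a sort key. This is for display, not gating. The function name and the sample values are hypothetical; the input is assumed to be the decoded signals.user object.

```python
LEVEL_ORDER = {"minimal": 0, "low": 1, "moderate": 2, "high": 3, "critical": 4}


def axes_at_or_above(user_signals: dict, minimum: str = "moderate") -> list:
    """List (axis, level, score) rows at or above `minimum`, highest score first."""
    floor = LEVEL_ORDER[minimum]
    rows = [
        (axis, entry["level"], entry["score"])
        for axis, entry in user_signals.items()
        if LEVEL_ORDER[entry["level"]] >= floor
    ]
    # Score is used purely to order rows, never as a decision threshold.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hand-written values for illustration only.
user_signals = {
    "suicide": {"level": "moderate", "score": 0.18},
    "self_harm": {"level": "high", "score": 0.31},
    "abuse": {"level": "minimal", "score": 0.01},
}

for axis, level, score in axes_at_or_above(user_signals):
    print(f"{axis:<12} {level:<9} {score:.2f}")
```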

Console note. All 8 axes are always present in the /classify response and are stored in Console's session record. Console's operational surfaces — watchlist conditions, /sessions sort options, agent-breakdown metrics — currently cover the first 4 (suicide, self_harm, harm_to_others, abuse). The other 4 are visible on session detail but not yet directly actionable from Console.


The 4 AI-behaviour axes

These fire when the assistant turns in the conversation are problematic, regardless of the user's state. They appear under signals.ai, separate from the user-side axes.

Axis (signals.ai.<key>) | What it detects
harm_provision | Assistant provided instructions, encouragement, or validation for harm (to self or others).
emotional_failure | Assistant missed clear emotional signals; responded with irrelevant/mechanical output.
manipulation | Assistant used manipulation tactics: guilt, shame induction, love-bombing, persistent boundary pressure.
safeguarding_failure | Absence of boundary-setting or redirect behavior from the assistant in the presence of user-side distress signals.

Use these to evaluate your own AI product's behaviour. Note that they fire on assistant turns whether those come from your own app or from a third-party model you're inspecting.


imminence

How acute is the situation? imminence is an object with level and score:

"imminence": {"level": "high", "score": 0.42}

Same level domain as the per-axis entries (minimal/low/moderate/high/critical). "high" indicates linguistic markers (plan, means, timeline, preparatory language) that in Ocular's training data co-occurred with near-term acuity. It's a text-pattern signal, not a predictive clinical assessment.

Imminence reflects near-term acuity for the suicide and harm_to_others axes only: there's a single top-level imminence object, not a per-axis one. The other axes (abuse, exploitation, stalking, self_neglect, etc.) are chronic-pattern signals, with no equivalent "about to happen in the next hours" marker. On sessions where suicide and harm_to_others aren't the active axes, expect imminence.level: "minimal" and imminence.score: 0.
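
One way to combine imminence with salience is sketched below: salience stays the gate, and imminence.level only decides how urgently a danger-band session is surfaced. The policy itself, the function name, and the choice of "high"/"critical" as the urgent levels are assumptions for illustration, not an Ocular recommendation.

```python
URGENT_IMMINENCE = {"high", "critical"}  # assumed policy choice


def needs_urgent_review(response: dict, t_danger: float = 0.60) -> bool:
    """True when a danger-band session also carries high or critical imminence."""
    if response["salience"] < t_danger:
        return False  # salience is the gate; imminence never overrides it
    level = response.get("imminence", {}).get("level", "minimal")
    return level in URGENT_IMMINENCE


# Hand-written payloads for illustration only.
acute = {"salience": 0.82, "imminence": {"level": "high", "score": 0.42}}
chronic = {"salience": 0.70, "imminence": {"level": "minimal", "score": 0.0}}

print(needs_urgent_review(acute))    # True
print(needs_urgent_review(chronic))  # False
```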


Fiction gating

Ocular soft-suppresses salience when the conversation reads as fiction or roleplay. Two top-level scalars report this:

  • fiction (0..1): how much the conversation reads as fiction/roleplay. High fiction with no corroborating distress signals won't lift salience above the watch threshold.
  • authenticity (0..1): counter-signal. Markers of register-authentic distress (direct appeals, frame breaks out of RP, out-of-character meta-comments).

Both are always present. Fiction gating is soft — it modulates thresholds continuously, not as a hard on/off. A fiction scene with a genuine-distress break in it (high authenticity despite high fiction) can still push salience into the watch band. This is deliberate: some users use roleplay as a way to approach real distress indirectly, and we don't want to miss those.
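
For explanation UIs, a hedged sketch of telling "nothing fired" apart from "signals were seen but descoped" on a clear-band session is shown below. It does not reproduce Ocular's fusion logic, which is internal. The fiction_cut value of 0.5, the reason codes, and the function name are all hypothetical.

```python
LEVEL_ORDER = {"minimal": 0, "low": 1, "moderate": 2, "high": 3, "critical": 4}


def low_salience_reason(response: dict,
                        t_watch: float = 0.30,
                        fiction_cut: float = 0.5):
    """Return a UI reason code for a clear-band session, or None otherwise."""
    if response["salience"] >= t_watch:
        return None  # not in the clear band, nothing to explain
    user = response.get("signals", {}).get("user", {})
    saw_signal = any(
        LEVEL_ORDER[entry["level"]] >= LEVEL_ORDER["moderate"]
        for entry in user.values()
    )
    if not saw_signal:
        return "no_signals"
    if response["fiction"] >= fiction_cut:
        return "descoped_by_fiction"
    return "descoped_other_context"  # e.g. attribution or lack of corroboration


# Hand-written payload for illustration only.
roleplay = {
    "salience": 0.08,
    "fiction": 0.91,
    "authenticity": 0.04,
    "signals": {"user": {"self_harm": {"level": "high", "score": 0.31}}},
}

print(low_salience_reason(roleplay))  # descoped_by_fiction
```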


subject attribution

One of:

Value | Meaning
self | The speaker is describing their own situation.
other | The speaker is reporting someone else's situation (third-party disclosure).
unknown | Ambiguous: could be either.

Crisis-axis salience contributions are gated on subject == "self". A user reporting "my friend said they want to die" will show some signal on the suicide axis but won't lift overall salience toward danger, because the person at risk is not the speaker. Third-party disclosures still fire signals.user.abuse, signals.user.exploitation, etc. — the speaker may be reporting a victim.
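
If you route on attribution, a sketch along these lines keeps salience as the gate and uses the attribution value only to choose a destination. It assumes the value is exposed as a top-level subject field, as the `subject == "self"` notation above suggests; the queue names and function name are hypothetical.

```python
def review_queue(response: dict, t_watch: float = 0.30):
    """Pick a review queue for a session at or above the watch cut, else None."""
    if response["salience"] < t_watch:
        return None
    subject = response.get("subject", "unknown")
    if subject == "self":
        return "speaker_at_risk"
    if subject == "other":
        return "third_party_report"  # the speaker may be reporting a victim
    return "needs_triage"  # ambiguous attribution goes to a human


# Hand-written payloads for illustration only.
print(review_queue({"salience": 0.41, "subject": "other"}))  # third_party_report
print(review_queue({"salience": 0.72, "subject": "self"}))   # speaker_at_risk
```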


salience shape table

The rough shape of a session that lands in each reference band (using the published T_WATCH=0.30 / T_DANGER=0.60 cuts):

Rough shape | Lands in band
Self-attributed suicide signals with corroboration, low fiction, and imminence markers | Danger (salience ≥ 0.60)
Self-attributed suicide signals with lower confidence or under fiction framing | Watch (0.30 ≤ salience < 0.60)
Moderate signal on any self-attributed user-risk axis | Watch (at minimum)
AI-side axis firing in the presence of user-side signal | Watch
Minor-involving safeguarding failure | Danger (not fiction-descoped)
Third-party disclosure (subject == "other") on abuse/exploitation | Watch (reporting a victim)
No above-threshold signals | Clear (salience < 0.30)

Exact per-axis thresholds inside the fusion layer are internal and fiction-modulated. Use salience as the interface rather than deriving your own bands from per-axis scores.


What Ocular deliberately won't claim

  • Not predictive. Scores describe what's in the conversation, not what will happen.
  • Not diagnostic. Ocular doesn't diagnose mental illness, substance-use disorder, abusive relationships, or anything else.
  • Not therapeutic. Ocular output is classification, not intervention.
  • Not a replacement for clinical judgment. Humans make the call.

Tuning against your own baseline

The level bucket thresholds are calibrated against a general mix of conversational data. If your application has a specific register (e.g. purely companion chat, purely technical support, or medical triage), the absolute scores may not map cleanly.

Recommended tuning workflow:

  1. Collect 200-500 conversations representative of your app. Hand-label each with a simple "did this need intervention?" binary.
  2. Run each through /classify and record salience, signals.user.suicide.score, heads[], fiction, authenticity.
  3. Compute confusion matrices against your labels. At the published T_WATCH=0.30 and T_DANGER=0.60 cuts, where is Ocular firing that you'd consider overblown? Where is it silent on cases you'd act on?
  4. Pick your own salience cutoff to match your operational bar: a high-recall product might trigger at salience ≥ 0.20; a precision-sensitive workflow might wait for ≥ 0.80. Layer per-axis rules on top if you need axis-specific routing.

A pilot of 200 cases is usually enough to pick a sensible cutoff. 500+ is enough to start trusting per-axis tuning.
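
Steps 3 and 4 of the workflow can be sketched as a cutoff sweep over your labelled set. The code assumes you have already collected one salience value and one hand label per conversation; the class, the function name, and the toy data are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Confusion:
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def precision(self) -> Optional[float]:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else None

    @property
    def recall(self) -> Optional[float]:
        positives = self.tp + self.fn
        return self.tp / positives if positives else None


def confusion_at(cutoff: float,
                 saliences: Sequence[float],
                 needed_intervention: Sequence[bool]) -> Confusion:
    """Confusion matrix for the rule `salience >= cutoff` against hand labels."""
    if len(saliences) != len(needed_intervention):
        raise ValueError("saliences and labels must have the same length")
    tp = fp = fn = tn = 0
    for salience, label in zip(saliences, needed_intervention):
        flagged = salience >= cutoff
        if flagged and label:
            tp += 1
        elif flagged:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


# Toy data for illustration; a real pilot would use 200+ labelled conversations.
saliences = [0.10, 0.25, 0.40, 0.70, 0.90]
labels = [False, True, False, True, True]

for cutoff in (0.20, 0.30, 0.60, 0.80):
    c = confusion_at(cutoff, saliences, labels)
    print(cutoff, c, "precision:", c.precision, "recall:", c.recall)
```

Reading the sweep row by row shows the trade-off directly: lowering the cutoff recovers missed cases at the cost of more false alarms, and the right row is whichever matches your operational bar.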