Working lab artifact · Trust Architecture · OpenAI logprobs · batch-invariant proof engine

Trust architecture, reduced to a working instrument.

Most AI demos show the answer. This one shows the release decision around the answer: provenance, confidence, consistency, attribution, and the controls that decide whether a human sees it first.

It is not a project list. It is the kind of artifact I build when a leadership room asks whether their model output is safe enough to ship.

Layers: GRAIN · GRIP · KNOT · VEIN
Signals: token logprobs, repeatability, expert review, cold-start determinism
Mode: OpenAI fast path, server-side key, dormant H100 proof engine
Observatory preview · release gate · tabs: GRAIN · GRIP · KNOT · VEIN · Receipt

Confidence is the weak layer. The answer may be right, but the release gate holds it until a human checks the uncertain claim.


/LAB Live instrument · provider-selectable

Inspect one answer before it gets trusted.

Choose a prompt, pick OpenAI or HuggingFace, adjust the gate, and run the inspection. The answer comes from a real LLM call through the Raising Agents serverless endpoint. OpenAI is the instant logprob path; HuggingFace is the full-access trace path that can expose internal signals when the RunPod worker is configured.
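
A minimal sketch of the provider call behind that flow, assuming the official OpenAI Python client and a server-side OPENAI_API_KEY; the function, model name, and return shape are illustrative, not the endpoint's actual code.

    # Minimal sketch of the OpenAI fast path behind an endpoint like /api/trust.
    # Assumptions: openai Python client (v1+), OPENAI_API_KEY in the server
    # environment, and an illustrative model name.
    import os
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # the key never leaves the server

    def inspect_answer(prompt: str) -> dict:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            logprobs=True,      # per-token logprobs give GRIP something to read
            top_logprobs=5,     # top-k alternatives per token, where the provider allows it
            temperature=0,
        )
        choice = resp.choices[0]
        tokens = choice.logprobs.content or []
        return {
            "answer": choice.message.content,
            "token_logprobs": [t.logprob for t in tokens],
        }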

Ready. Inspect calls /api/trust; provider keys stay server-side.

Decision: Review required · 0.71 weighted trust score
Answer under inspection: preview trace · click Inspect for live output

Signals surfaced · highest-risk signals first
Inspection trace · what changed the decision
    Expert Agent review · expert explainer waits for live run

    Architecture review pending

    Run an inspection to ask the LLM-based expert to explain the result and the Trust Architecture behind it.

    GRAIN checks provenance. GRIP checks confidence. KNOT checks consistency. VEIN checks attribution. The gate decides release, review, or block.
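
    A minimal sketch of how such a gate could fold the four layer scores into one decision. The weights, thresholds, and example scores are illustrative assumptions, not the instrument's actual configuration.

        # Hypothetical release gate: weight the four layer scores into one trust
        # score, then map it (plus a hard per-layer floor) to release / review / block.
        LAYER_WEIGHTS = {"GRAIN": 0.25, "GRIP": 0.30, "KNOT": 0.25, "VEIN": 0.20}
        RELEASE_AT, BLOCK_BELOW, LAYER_FLOOR = 0.80, 0.50, 0.30

        def gate(scores: dict[str, float]) -> tuple[str, float]:
            trust = sum(LAYER_WEIGHTS[name] * scores[name] for name in LAYER_WEIGHTS)
            if trust < BLOCK_BELOW or min(scores.values()) < LAYER_FLOOR:
                return "block", trust
            if trust < RELEASE_AT:
                return "review", trust       # a human checks the uncertain claim first
            return "release", trust

        # Example: confidence (GRIP) is the weak layer, so the answer is held for review.
        decision, score = gate({"GRAIN": 0.82, "GRIP": 0.52, "KNOT": 0.79, "VEIN": 0.74})
        print(decision, round(score, 2))     # review 0.71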


    /R Repeatability audit · live capped run

    One answer is not a reliability signal.

    The original demo had a full deterministic AI surface. This page turns that into a live capped run: same prompt, several calls, hashes grouped by output. The point is operational, not theatrical.

    Controlled decoding should converge. Baseline sampling often does not. If the answer changes, the organization needs a gate. Runs are capped server-side to keep cost and abuse bounded.
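
    A sketch of what that capped run could look like server-side: the same prompt sent a handful of times through the OpenAI Python client, each completion hashed, identical outputs grouped. The cap, model name, and helper are illustrative.

        # Hypothetical repeatability audit: same prompt, N capped calls, SHA-256 of
        # each completion, outputs grouped by hash.
        import hashlib
        from collections import Counter
        from openai import OpenAI

        client = OpenAI()          # reads OPENAI_API_KEY from the server environment
        MAX_RUNS = 5               # server-side cap keeps cost and abuse bounded

        def repeatability_audit(prompt: str, runs: int = MAX_RUNS, temperature: float = 0.0) -> dict:
            runs = min(runs, MAX_RUNS)
            hashes = []
            for _ in range(runs):
                resp = client.chat.completions.create(
                    model="gpt-4o-mini",
                    messages=[{"role": "user", "content": prompt}],
                    temperature=temperature,
                )
                text = resp.choices[0].message.content or ""
                hashes.append(hashlib.sha256(text.strip().encode()).hexdigest()[:12])
            groups = Counter(hashes)
            agreement = groups.most_common(1)[0][1] / runs
            # agreement == 1.0 means every call produced a byte-identical answer;
            # anything lower is the signal that the organization needs a gate.
            return {"groups": dict(groups), "agreement": agreement}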

    Run audit

    Repeated live calls

    Preview mode · click Run live for real calls

    /REF Architecture reference

    Four questions before an answer becomes an action.

    Trust Architecture is a release system for model output. It does not claim the model is truthful. It asks whether enough evidence exists to let the answer pass without intervention.

    Input: Prompt · Evidence · Policy
    Model: LLM call · Logprobs · Trace
    Signals: 21 readings · 6 families
    Layers: GRAIN · GRIP · KNOT · VEIN
    Decision: Release · Review · Block
    GRAIN

    Provenance

    Where did this come from? Source count, age, chain depth, and source health decide whether the answer has a floor.

    GRIP

    Confidence

    How certain should we be? Entropy, logprob, calibration, hidden-state drift, and late-layer override catch confident wrongness.

    KNOT

    Consistency

    Does it stay stable? Repeated calls and branch agreement reveal whether the answer survives rerun pressure.

    VEIN

    Attribution

    Is each claim traceable? Claim support, coverage, citation density, and unsupported claims turn prose into an audit object.
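
    A toy sketch of the attribution arithmetic: split the answer into claims, test each against the supplied sources, report coverage and the unsupported remainder. The word-overlap test is a crude stand-in for whatever claim-support model a real surface would use.

        # Toy VEIN-style bookkeeping: coverage = supported claims / total claims.
        import re

        def claims(answer: str) -> list[str]:
            return [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]

        def supported(claim: str, sources: list[str], threshold: float = 0.5) -> bool:
            words = {w.lower() for w in re.findall(r"\w+", claim)}
            if not words or not sources:
                return False
            overlap = [
                len(words & {w.lower() for w in re.findall(r"\w+", src)}) / len(words)
                for src in sources
            ]
            return max(overlap) >= threshold   # crude lexical support test

        def attribution_report(answer: str, sources: list[str]) -> dict:
            cs = claims(answer)
            unsupported = [c for c in cs if not supported(c, sources)]
            coverage = (len(cs) - len(unsupported)) / len(cs) if cs else 0.0
            return {"claims": len(cs), "coverage": coverage, "unsupported": unsupported}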

    Provider capability matters.

    Some providers expose logprobs or top-k data. Some do not. Local Hugging Face backends expose deeper traces. A serious trust surface says which signals are actually measured.

    Provider      Logprobs   Hidden states   Attention   Repeatable
    OpenAI        limited    no              no          weak
    Anthropic     no         no              no          weak
    HuggingFace   full       yes             yes         strong
    01

    Information theory

    Entropy, NLL, token margin, confidence calibration, and variance read the model's probability surface when the provider exposes it.
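
    A sketch of this family, assuming the per-token logprobs have already been flattened into plain numbers (provider payloads differ in shape). Entropy is estimated over the returned top-k only, so it is a lower bound.

        # Information-theory readings from per-token top-k logprobs.
        import math

        def info_readings(token_logprobs: list[dict]) -> dict:
            # One entry per generated token, e.g.
            # {"logprob": -0.12, "top_logprobs": [-0.12, -2.3, -4.1]}
            nlls, entropies, margins = [], [], []
            for tok in token_logprobs:
                nlls.append(-tok["logprob"])
                top = sorted(tok["top_logprobs"], reverse=True)
                probs = [math.exp(lp) for lp in top]
                z = sum(probs) or 1.0                       # renormalize over the top-k
                entropies.append(-sum(p / z * math.log(p / z) for p in probs))
                if len(top) > 1:
                    margins.append(top[0] - top[1])         # logprob gap to the runner-up
            n = max(len(nlls), 1)
            return {
                "mean_nll": sum(nlls) / n,
                "mean_entropy": sum(entropies) / n,
                "mean_margin": sum(margins) / max(len(margins), 1),
            }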

    02

    Behavioral checks

    Self-consistency and semantic equivalence ask whether repeated generations preserve the same operational answer.
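
    A sketch of the behavioral check, assuming the repeated completions are already collected. The normalization is a crude illustrative stand-in for a real semantic-equivalence test.

        # Behavioral self-consistency: do repeated generations preserve the same
        # operational answer once superficial wording is stripped away?
        import re
        from collections import Counter

        def normalize(answer: str) -> str:
            numbers = re.findall(r"-?\d+(?:\.\d+)?", answer)
            if numbers:
                return numbers[-1]                 # keep only the final numeric answer
            return re.sub(r"\W+", " ", answer).strip().lower()

        def self_consistency(answers: list[str]) -> float:
            forms = Counter(normalize(a) for a in answers)
            return forms.most_common(1)[0][1] / len(answers) if answers else 0.0

        # Three phrasings of the same number still count as agreement.
        print(self_consistency(["The total is 42.", "42", "It comes to 42 overall."]))  # 1.0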

    03

    Mechanistic trace

    HuggingFace full access adds attention entropy, representational drift, cross-layer consistency, and logit-lens convergence.
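
    A sketch of what a local backend can read, with GPT-2 standing in for whatever model the worker actually hosts. This is not the RunPod worker's code, only the shape of the measurements.

        # Mechanistic readings from a locally hosted model (GPT-2 as a small stand-in).
        # Requires: pip install torch transformers
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tok = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
        inputs = tok("The capital of France is", return_tensors="pt")

        with torch.no_grad():
            out = model(**inputs, output_attentions=True, output_hidden_states=True)

            # Attention entropy per layer: how spread out each head's attention is.
            attn_entropy = []
            for layer in out.attentions:                    # (batch, heads, seq, seq)
                probs = layer.clamp_min(1e-12)
                attn_entropy.append(-(probs * probs.log()).sum(-1).mean().item())

            # Representational drift: cosine distance of the last token across layers.
            last = [h[0, -1] for h in out.hidden_states]    # one vector per layer
            drift = [1 - torch.cosine_similarity(a, b, dim=0).item()
                     for a, b in zip(last, last[1:])]

            # Logit lens: project each layer's last-token state through the final
            # norm and unembedding to see at which depth the top token settles.
            lens_top = [tok.decode(int(model.lm_head(model.transformer.ln_f(h)).argmax()))
                        for h in last]

        print(attn_entropy[:3], drift[:3], lens_top[-4:])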

    04

    Geometry

    Reasoning smoothness and curvature anomalies look for jumps in the answer's semantic trajectory.

    05

    Ensemble gate

    The final decision combines available signals, layer weights, and release thresholds into release, review, or block.
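
    A sketch of that combination, assuming each family arrives as a 0-to-1 score or as None when the provider cannot measure it. Weights, thresholds, and signal names are illustrative assumptions.

        # Hypothetical ensemble gate: combine only the families that were measured,
        # renormalize their weights, then apply the release thresholds.
        from typing import Optional

        WEIGHTS = {
            "provenance": 0.20, "confidence": 0.25, "consistency": 0.20,
            "attribution": 0.20, "mechanistic": 0.10, "geometry": 0.05,
        }
        RELEASE_AT, BLOCK_BELOW = 0.80, 0.50

        def ensemble_gate(readings: dict[str, Optional[float]]) -> dict:
            present = {k: v for k, v in readings.items() if v is not None}
            total_w = sum(WEIGHTS[k] for k in present) or 1.0
            trust = sum(WEIGHTS[k] / total_w * v for k, v in present.items())
            if trust >= RELEASE_AT:
                decision = "release"
            elif trust >= BLOCK_BELOW:
                decision = "review"
            else:
                decision = "block"
            return {"trust": round(trust, 2), "decision": decision,
                    "measured": sorted(present), "skipped": sorted(set(readings) - set(present))}

        # On the OpenAI path the mechanistic and geometry families are not measured,
        # so the gate decides from the four that are.
        print(ensemble_gate({"provenance": 0.85, "confidence": 0.60, "consistency": 0.80,
                             "attribution": 0.70, "mechanistic": None, "geometry": None}))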


    Course connection

    This is the course thesis in artifact form.

    AI Transformation Leads do not need more positioning language. They need instruments: inventory scanners, trust gates, repeatability harnesses, and the judgment to explain the result to leadership.