Inside the ProctorSafe SDK: How On-Device Analysis Actually Works

The problem with cloud proctoring

Traditional online proctoring sends video and audio streams to a cloud backend for processing. Even before AI analysis, this creates three structural problems:

Bandwidth. Even compressed, a 30fps video stream consumes significant bandwidth. For students in low-connectivity environments this is a fairness issue: their sessions may degrade while others' don't.
Privacy surface. Raw video traverses the network and is stored, even briefly, on third-party servers. Any interception or breach exposes the candidate's home environment, appearance, and behavior in real time.
Processing bottleneck. Cloud processing means the integrity signal (the flagging of anomalous behavior) is always asynchronous — it happens after the session, during review. This delays incident response and limits the system's ability to intervene in real time.

ProctorSafe takes a different approach. The SDK runs entirely in the candidate's browser.

Loading: a 6–10 second bootstrap

When a candidate opens an exam session, the ProctorSafe SDK loads as a WebAssembly module. The current bundle is approximately 1.8MB — comparable to a medium-sized JavaScript framework. In practice, load times sit between 6 and 10 seconds on a typical connection.

The bootstrap sequence:

The LMS launches the session via LTI 1.3. The candidate is authenticated against the institution's identity provider.
The candidate's browser receives a lightweight configuration payload: exam duration, allowed browser settings, and a session key.
The SDK initialises, establishes a secure session with the proctoring backend, and begins signal collection.
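
As an illustration of the second step, the configuration payload could be parsed and checked as follows. This is a minimal sketch: the article names only the three categories of data, so the JSON field names (exam_duration_s, allowed_browser_settings, session_key) and the validation rules are assumptions, and the example is written in Python for readability rather than in the SDK's own WebAssembly toolchain.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionConfig:
    # Hypothetical field names; the article lists only the categories of data.
    exam_duration_s: int
    allowed_browser_settings: dict
    session_key: str


def parse_session_config(raw: str) -> SessionConfig:
    """Parse a JSON configuration payload and reject obviously malformed input."""
    data = json.loads(raw)

    duration = data["exam_duration_s"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or duration <= 0:
        raise ValueError("exam_duration_s must be a positive integer")

    settings = data["allowed_browser_settings"]
    if not isinstance(settings, dict):
        raise ValueError("allowed_browser_settings must be an object")

    key = data["session_key"]
    if not isinstance(key, str) or not key:
        raise ValueError("session_key must be a non-empty string")

    return SessionConfig(duration, settings, key)
```

Rejecting a malformed payload before initialisation keeps a bad configuration from silently producing a session with no time limit or no key.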

No plugins. No downloads. No extensions. The candidate's browser is the entire client.

What the SDK collects

The SDK does not capture video, audio, or images. Instead, it observes browser-level events and device behavior that are relevant to exam integrity:

Keyboard and mouse interaction patterns: not raw keystrokes but behavioral signals such as typing rhythm, pause frequency, and interaction density.
Window and tab focus events: how often the candidate switches away from the exam window, and for how long.
Browser fullscreen state: whether the exam window remains in fullscreen mode.
DevTools and developer console detection: whether the candidate has opened browser developer tools (often a sign of an attempt to access external resources).

These are behavioral signals, not content. The SDK never sees what the candidate types, reads, or looks at. It only observes how they interact with the exam environment.
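
To make the distinction between behavior and content concrete, here is a minimal sketch of how typing rhythm could be summarised from key-event timestamps alone. The function name, the returned fields, and the 2-second pause threshold are assumptions for illustration, not the SDK's actual feature set. The point is that the input is a list of times and nothing else.

```python
from statistics import mean, pstdev


def typing_rhythm(key_times_ms, pause_threshold_ms=2000.0):
    """Summarise typing rhythm from key-event timestamps (milliseconds).

    Only the times of key events are used. Which keys were pressed is never
    an input, so the typed content cannot be reconstructed from the result.
    """
    # Gaps between consecutive key events.
    intervals = [b - a for a, b in zip(key_times_ms, key_times_ms[1:])]
    if not intervals:
        return {
            "mean_interval_ms": 0.0,
            "interval_stdev_ms": 0.0,
            "pause_count": 0,
            "keys_per_minute": 0.0,
        }

    span_ms = key_times_ms[-1] - key_times_ms[0]
    return {
        "mean_interval_ms": mean(intervals),
        "interval_stdev_ms": pstdev(intervals),
        # A "pause" is any gap at or above the (assumed) threshold.
        "pause_count": sum(1 for gap in intervals if gap >= pause_threshold_ms),
        "keys_per_minute": len(key_times_ms) * 60000.0 / span_ms if span_ms > 0 else 0.0,
    }
```

A caller would pass only the timestamps it has recorded, for example typing_rhythm([0, 100, 200, 3200, 3300]), which reports one pause for the 3-second gap.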

The event log and HMAC signing

Every ~500ms, the SDK evaluates the current signal state and makes a decision: emit a log entry, or remain silent. If anomalous behavior is detected (rapid tab switches, focus loss, pattern anomalies), a structured event is logged.
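
The emit-or-stay-silent decision can be sketched as a small state machine that is polled on each tick. The sketch below assumes a sliding 10-second window and a limit of three focus losses, and it uses a hypothetical event type named rapid_tab_switch; none of these values come from the SDK itself.

```python
from collections import deque


class FocusMonitor:
    """Tracks recent focus losses and decides, per tick, whether to log an event."""

    def __init__(self, window_ms=10_000, max_losses=3):
        self.window_ms = window_ms    # assumed sliding-window length
        self.max_losses = max_losses  # assumed burst limit
        self._losses = deque()

    def record_focus_loss(self, t_ms):
        """Called whenever the exam window loses focus."""
        self._losses.append(t_ms)

    def evaluate(self, now_ms):
        """Called on each ~500 ms tick. Returns an event dict, or None to stay silent."""
        # Drop focus losses that have aged out of the window.
        while self._losses and self._losses[0] <= now_ms - self.window_ms:
            self._losses.popleft()

        if len(self._losses) >= self.max_losses:
            count = len(self._losses)
            self._losses.clear()  # avoid reporting the same burst again next tick
            return {
                "timestamp": now_ms,
                "event_type": "rapid_tab_switch",  # hypothetical event type
                "count": count,
            }
        return None
```

Most ticks return None, which matches the behavior described above: the log stays empty unless something anomalous has actually happened.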

Each event has the following structure:

{
  "timestamp": 1750192800000,
  "event_type": "focus_loss",
  "trust_delta": -0.05,
  "session_id": "sess_a3f9c...",
  "signature": "hmac-sha256:abc123..."
}

The signature field is the critical part. Each event is signed with a session-specific HMAC key that is generated at session start and never transmitted to the server in the clear. The key is held in the browser's secure memory context for the duration of the session.

This means that anyone who intercepts or tampers with an event log without holding the session key cannot forge or alter events undetected. HMAC is a symmetric scheme, so verification requires the same key that was used for signing, and the integrity of the event chain is verifiable only for as long as that key stays secret.
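
A minimal sketch of signing and verifying one event with HMAC-SHA256 follows, using Python's standard hmac module. The canonical serialisation (sorted keys, compact separators, signature field excluded) is an assumption; the article does not specify how the SDK serialises an event before signing, and the session_id value in the usage example is a placeholder.

```python
import hashlib
import hmac
import json

PREFIX = "hmac-sha256:"


def _canonical(event: dict) -> bytes:
    """Serialise every field except the signature in a stable byte form."""
    body = {k: v for k, v in event.items() if k != "signature"}
    # Assumed canonical form: sorted keys and compact separators.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_event(event: dict, key: bytes) -> dict:
    """Return a copy of the event with a signature field added."""
    digest = hmac.new(key, _canonical(event), hashlib.sha256).hexdigest()
    return {**event, "signature": PREFIX + digest}


def verify_event(event: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = PREFIX + hmac.new(key, _canonical(event), hashlib.sha256).hexdigest()
    actual = str(event.get("signature", ""))
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))


if __name__ == "__main__":
    key = b"\x01" * 32  # placeholder; a real key would be random per session
    signed = sign_event(
        {
            "timestamp": 1750192800000,
            "event_type": "focus_loss",
            "trust_delta": -0.05,
            "session_id": "sess_example",
        },
        key,
    )
    print(verify_event(signed, key))                           # True
    print(verify_event({**signed, "trust_delta": 0.0}, key))   # False
```

Changing any signed field, or verifying with a different key, makes the recomputed digest differ from the stored one, so the event is rejected.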

Transmission: small, batched, encrypted

Events are buffered client-side and transmitted in batches every 30 seconds via HTTPS. The payload is encrypted end-to-end before transmission. No video. No audio. No images. No biometric vectors.
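
The buffering logic can be sketched independently of the transport. In the sketch below, the send callback stands in for the encrypted HTTPS upload, which is deliberately left out, and the clock is passed in explicitly so the behavior is easy to test. The class and method names are assumptions, as is the choice to start the 30-second interval at the first buffered event.

```python
class EventBuffer:
    """Collects events and hands them to `send` at most once per interval."""

    def __init__(self, send, flush_interval_s=30.0):
        self._send = send                  # stand-in for the encrypted HTTPS upload
        self._interval = flush_interval_s
        self._pending = []
        self._last_flush = None

    def add(self, event, now_s):
        """Buffer one event, then flush if the interval has elapsed."""
        if self._last_flush is None:
            self._last_flush = now_s       # interval starts at the first event
        self._pending.append(event)
        self.maybe_flush(now_s)

    def maybe_flush(self, now_s):
        """Send the pending batch if one is due. Returns True if a batch was sent."""
        if self._pending and now_s - self._last_flush >= self._interval:
            batch, self._pending = self._pending, []
            self._last_flush = now_s
            self._send(batch)
            return True
        return False
```

Because empty batches are never sent, a session with no anomalies generates no upload traffic from this buffer at all.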

The server receives: a list of timestamped, signed events and an aggregate trust score. That is all.

The trust score system

On the server side, events are processed and a real-time trust score is computed. The score starts at 1.0 and is adjusted by event deltas. High-frequency focus losses, rapid tab switches, and detected developer tool usage each carry negative weight.

The trust score is what the proctor or instructor sees in the review dashboard. A score below a configurable threshold triggers a flag for human review. The raw events are available for inspection, but the default view is the score — clean, explainable, and proportionate.
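
The scoring rule described above amounts to a running sum with a threshold check. Here is a minimal sketch, assuming the score is clamped to the range 0.0 to 1.0 and using 0.7 as an example threshold; the article states only that the score starts at 1.0, is adjusted by event deltas, and is compared against a configurable threshold.

```python
def trust_score(events, start=1.0):
    """Apply each event's trust_delta in order, starting from `start`."""
    score = start
    for event in events:
        # Clamping to [0.0, 1.0] is an assumption, not documented behavior.
        score = min(1.0, max(0.0, score + event["trust_delta"]))
    return score


def needs_review(score, threshold=0.7):
    """Flag the session for human review when the score falls below the threshold."""
    return score < threshold


if __name__ == "__main__":
    events = [
        {"event_type": "focus_loss", "trust_delta": -0.05},
        {"event_type": "focus_loss", "trust_delta": -0.05},
    ]
    score = trust_score(events)
    print(round(score, 2), needs_review(score))
```

Keeping the per-event deltas alongside the total is what makes the score explainable: a reviewer can trace a low score back to the specific events that lowered it.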

Privacy by architecture, not by policy

The distinction matters. Most proctoring platforms have a privacy policy that says video is not retained beyond a certain period. But the data is still collected, still transmitted, still stored — even if temporarily. Privacy is promised, not enforced.

With ProctorSafe, the architecture enforces privacy. The data such a policy describes (video, audio, face scans) does not exist on any server, because it was never collected. This is data protection by design and by default in the sense of GDPR Article 25, not privacy by promise.

The result: institutions using ProctorSafe are not managing a tradeoff between integrity and privacy. They're running an integrity system that generates a minimal, purpose-limited data trail by design.