Security and trust

Sovereignty you can audit. Cost transparency you can recompute. A public log of what we deliberately chose not to build.

Per-request enforcement · 5-source pricing reconciliation · Deferred-decision log

Sovereignty enforced per request, not just on the brochure

Most "Canadian AI" claims are a logo and a vague promise about where the servers live. Ours is enforced on every single API call. Three things make that possible:

1. A signed chain of custody on every response

Every request returns headers that record which region it traversed, which jurisdiction's law applied, and which provider handled it. Your own SDK can log them, and your auditor can verify them. No "trust us" required.

X-NI-Custody-Path: NI-CA → Bedrock-CA

X-NI-Resolved-Jurisdiction: CA

X-NI-Resolved-Region: ca-central-1

X-NI-Resolved-Provider: bedrock

X-NI-Credential-Source: platform

If a request ever resolved to a different region than the deployment claimed, the headers would say so. There is nowhere for sovereignty to hide.
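A client-side check of these headers might look like the following minimal sketch. It assumes your contract pins the jurisdiction and region shown in the example above; the `EXPECTED` mapping and the `custody_mismatches` helper are illustrative names, not part of any Northern Inference SDK.

```python
# Values your contract pins. Which headers to pin is an assumption of this sketch.
EXPECTED = {
    "X-NI-Resolved-Jurisdiction": "CA",
    "X-NI-Resolved-Region": "ca-central-1",
}

def custody_mismatches(headers, expected):
    """Return {header: (expected, actual)} for each header that is missing or differs."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    mismatches = {}
    for name, want in expected.items():
        got = lowered.get(name.lower())  # None when the header is absent
        if got != want:
            mismatches[name] = (want, got)
    return mismatches

# Headers copied from the example response above.
sample_headers = {
    "X-NI-Custody-Path": "NI-CA → Bedrock-CA",
    "X-NI-Resolved-Jurisdiction": "CA",
    "X-NI-Resolved-Region": "ca-central-1",
    "X-NI-Resolved-Provider": "bedrock",
    "X-NI-Credential-Source": "platform",
}

if __name__ == "__main__":
    problems = custody_mismatches(sample_headers, EXPECTED)
    print("custody OK" if not problems else f"custody mismatch: {problems}")
```

Treating a missing header as a mismatch keeps the check conservative: a response that says nothing about its region is flagged rather than assumed compliant.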

2. Real-time jurisdiction drift detection on Vertex routes

For Vertex AI routes, before each call dispatches, the gateway resolves the upstream hostname's IPs and checks them against Google Cloud's own published prefix ranges for that region. If the resolved IPs don't belong to the expected jurisdiction's ranges, we record a drift event and (in enforce mode) reject the request with HTTP 403 before any data leaves our infrastructure. Bedrock and Azure routes call a fixed regional endpoint and are protected by the fail-closed residency guard below.

The source of truth is Google Cloud's public IP-range catalog, refreshed daily. Enforcement runs in-process and typically takes 5-15 ms. Drift events are visible in the operator's admin UI as a live count and click-through table.
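The core of such a check can be sketched with Python's standard `ipaddress` module. This is an illustration of the idea, not the gateway's actual code. The catalog shape (a `prefixes` list whose entries carry `ipv4Prefix` or `ipv6Prefix` plus a `scope` naming the region) follows Google Cloud's published range file, but the prefixes below are placeholder documentation ranges rather than real Google ranges, and the function names are hypothetical.

```python
import ipaddress

# Placeholder catalog: same shape as the published file, made-up prefixes.
SAMPLE_CATALOG = {
    "prefixes": [
        {"ipv4Prefix": "203.0.113.0/24", "scope": "northamerica-northeast1"},
        {"ipv6Prefix": "2001:db8:1::/48", "scope": "northamerica-northeast1"},
        {"ipv4Prefix": "198.51.100.0/24", "scope": "us-central1"},
    ]
}

def prefixes_for_region(catalog, region):
    """Collect every IPv4/IPv6 network the catalog lists for one region."""
    networks = []
    for entry in catalog["prefixes"]:
        if entry.get("scope") != region:
            continue
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks

def drifted_ips(resolved_ips, allowed_networks):
    """Return the resolved IPs that fall outside every allowed network."""
    outside = []
    for ip in resolved_ips:
        addr = ipaddress.ip_address(ip)
        # A v4 address is never "in" a v6 network (and vice versa), so mixed lists are safe.
        if not any(addr in net for net in allowed_networks):
            outside.append(ip)
    return outside

def should_reject(resolved_ips, allowed_networks, enforce):
    """Reject only when drift is present and enforce mode is on."""
    return enforce and bool(drifted_ips(resolved_ips, allowed_networks))

if __name__ == "__main__":
    allowed = prefixes_for_region(SAMPLE_CATALOG, "northamerica-northeast1")
    print(drifted_ips(["203.0.113.10", "198.51.100.7"], allowed))
```

In a real deployment the resolved IPs would come from a DNS lookup of the upstream hostname made just before dispatch, and the catalog from the daily refresh described above.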

3. Residency that fails closed, not open

A deployment tagged for Canadian residency is enforced at the routing layer. If the underlying region or SKU resolves to a different jurisdiction, the deployment is refused from serving traffic rather than quietly routed to the wrong region. Operators see the mismatch in an admin page; customers see a clear error pointing at a known-good alternative.
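The fail-closed property is the important part: anything the guard cannot positively confirm is refused. A minimal sketch, assuming a hypothetical region-to-jurisdiction table and hypothetical `Deployment` and `ResidencyError` types (none of these names come from the product):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup table; a real one would cover every region the gateway can route to.
REGION_JURISDICTION = {
    "ca-central-1": "CA",
    "us-east-1": "US",
}

class ResidencyError(Exception):
    """Raised when a deployment cannot be confirmed to meet its residency tag."""

@dataclass(frozen=True)
class Deployment:
    name: str
    required_jurisdiction: str
    resolved_region: Optional[str]

def assert_residency(deployment):
    """Raise unless the resolved region is known AND matches the required jurisdiction."""
    # An unknown or missing region maps to None, which never equals the tag:
    # the guard refuses when it cannot tell, instead of letting traffic through.
    actual = REGION_JURISDICTION.get(deployment.resolved_region)
    if actual != deployment.required_jurisdiction:
        raise ResidencyError(
            f"{deployment.name}: region {deployment.resolved_region!r} resolves to "
            f"{actual!r}, but {deployment.required_jurisdiction!r} is required"
        )

if __name__ == "__main__":
    assert_residency(Deployment("chat-ca", "CA", "ca-central-1"))  # passes quietly
    try:
        assert_residency(Deployment("chat-ca", "CA", "us-east-1"))
    except ResidencyError as err:
        print(f"refused: {err}")
```

A fail-open guard would instead allow traffic whenever the lookup returned nothing, which is exactly the case where a region label has drifted or a new SKU has not been classified yet.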

The trio works together. Tagging alone wouldn't be enough: DNS could lie, region labels could drift, SKU semantics could change. Because all three are enforced at request time and the result is recorded in the response, customers can prove the contract was met for any specific call, not just on average.

Cost transparency you can recompute

"0% markup" is hand-waving in the LLM space. Ours is auditable. Every single request stamps two cost components into your portal: what the upstream provider charged us, and what Northern Inference charged you. Sum either column. They reconcile.

Pricing cross-validated across multiple independent sources

Our model pricing is cross-validated against multiple independent sources, including the providers' own pricing APIs (AWS Pricing API, Azure Retail Prices) and independent pricing catalogs. They agree on most rows; when they don't, our hourly sanity worker flags the divergence and emails the operator BEFORE customers see a wrong invoice. If a stored rate diverges by more than 2× from any authoritative source, an alert fires.
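The 2× rule can be read as a symmetric ratio test: a stored rate is suspect if it is more than twice, or less than half, what a source reports. That reading, the function name, and the sample figures below are assumptions made for illustration.

```python
def diverging_sources(stored_rate, source_rates, max_ratio=2.0):
    """Return {source: ratio} for each source whose rate differs from ours by more than max_ratio."""
    flagged = {}
    for source, rate in source_rates.items():
        if stored_rate <= 0 or rate <= 0:
            # A zero or negative rate cannot be compared as a ratio; always flag it.
            flagged[source] = float("inf")
            continue
        # Symmetric ratio: 3x too high and 3x too low both yield 3.0.
        ratio = max(stored_rate / rate, rate / stored_rate)
        if ratio > max_ratio:
            flagged[source] = ratio
    return flagged

if __name__ == "__main__":
    # Made-up per-million-token rates in dollars, for illustration only.
    stored = 3.0
    sources = {"provider_pricing_api": 3.0, "independent_catalog": 1.0}
    print(diverging_sources(stored, sources))
```

A stored rate that sits exactly at 2× is not flagged under this reading, since the text says "more than 2×".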

Self-policing reconciliation

At the end of every billing cycle, an automated job joins our request-log charges against authoritative rates and looks for systematic under-billing: periods where our rates were wrong and we ate the difference. The findings live in an admin report. Stefan reviews them. Customers don't have to.

Pre-deploy sanity gate

When a new model rate is added, the gate refuses anything more than 2× off comparable rows. This catches accidental overrides at the moment they'd cause damage rather than at end-of-month.

What you can verify yourself. Pull your request_logs for any period via the portal. SUM the provider_cost_cents column: that's the wholesale cost. SUM the ni_fee_cents column: that's our charge. Add them. They match the line item on your statement to the cent.
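If you export the rows, the recomputation is a few lines. The sketch below assumes the export is a list of records whose `provider_cost_cents` and `ni_fee_cents` values are whole cents (possibly as strings, as a CSV reader would produce); the export format and the sample figures are assumptions, only the two column names come from the text above.

```python
def reconcile(rows, statement_total_cents):
    """Sum both cost columns and compare their total with the statement line item."""
    # int() accepts both integers and numeric strings such as "120".
    provider_total = sum(int(row["provider_cost_cents"]) for row in rows)
    fee_total = sum(int(row["ni_fee_cents"]) for row in rows)
    matches = provider_total + fee_total == statement_total_cents
    return provider_total, fee_total, matches

if __name__ == "__main__":
    # Made-up rows for illustration.
    rows = [
        {"provider_cost_cents": "120", "ni_fee_cents": "6"},
        {"provider_cost_cents": "35", "ni_fee_cents": "2"},
    ]
    provider, fee, ok = reconcile(rows, statement_total_cents=163)
    print(f"wholesale={provider} fee={fee} reconciles={ok}")
```

Working in integer cents avoids floating-point rounding, so "to the cent" can be checked with exact equality.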

An honest log of what we chose not to build

Most pitches in our space claim coverage of every threat model. We document our refusals out loud, with the reasoning. Reading our deferred-decision log is itself a trust signal.

An example. Our jurisdiction drift detection has two enforcement vectors: pre-call DNS+IP-geo verification (shipped) and HTTP 30x redirect interception during the upstream call (deferred by decision). The redirect interception addresses a contrived threat model. Vertex's regional load balancer uses Anycast origin failover, not HTTP redirects. Bedrock's control plane uses regional endpoints with no cross-region redirects. Azure Cognitive Services behaves the same way. We had no realistic way to test the deferred half against actual drift, and implementing it would have required deeper access to a black-box dependency than we'd want to take on for a threat that doesn't exist in production. So we shipped the half that maps to the real failure mode (DNS pointing at unexpected IPs after a regional reroute) and wrote down the choice. If a customer audit asks for in-flight redirect interception, or if we ever observe a real redirect-based drift event in the wild, we'll reopen the decision.

The full deferred-decision log lives in our internal docs. We will publish a public summary when we reach the SOC 2 audit gate; until then, this and other examples will surface here as they come up.

What an auditor or buyer can ask for today

Per-request custody headers. Send a test request, read the headers, verify the jurisdiction matches your contract.
Live drift count. The admin Models page shows current Vertex jurisdiction-drift events in the last 24 hours. Zero in normal operation.
Cost reconciliation rolls. The admin Operations page shows hourly pricing-sanity findings (rates flagged as diverging from upstream) and monthly per-team billing reconciliation reports.
Compliance vendor reviews. One-pager per third party we depend on (Stripe, AWS, Microsoft Azure, Google Vertex, Anthropic, OpenAI, GitLab) describing what data we send them, their attestation status, and our DPA position. Available on request to enterprise prospects.
Customer-data-deletion runbook. The exact PIPEDA flow we follow on a deletion request: scope, SQL, financial-record anonymization, Stripe customer cleanup, backup retention disclosure, audit-trail entry, confirmation email template. Available on request.
Audit log architecture. Every state-changing admin action records to an indefinite-retention audit_logs table with category, severity, IP, user-agent, full detail JSONB. Filterable + exportable.

What we're transparent about not yet having

We're not SOC 2 Type II certified yet. The roadmap is gated on customer demand: when an enterprise prospect asks, we engage Drata or Vanta for continuous evidence collection and start the 6-month observation window with an A-LIGN or Schellman audit at the end. Today, our compliance posture inherits from the cloud providers we run on (AWS SOC 2 Type II, ISO 27001, PCI-DSS attestations are downloadable from AWS Artifact and we mirror them quarterly into our own compliance archive).

HIPAA BAAs aren't yet signed. They are free on AWS / Azure / GCP, but we are waiting until a healthcare customer asks. The same goes for ISO 27001 and the EU AI Act framework.

Questions?

Email trust@northerninference.ca for compliance questions, vendor reviews, or audit-evidence requests. Email abuse@northerninference.ca if you've discovered a security issue.