Sovereignty you can audit. Cost transparency you can recompute. A public log of what we deliberately chose not to build.
Most "Canadian AI" claims are a logo and a vague promise about where the servers live. Ours is enforced on every single API call. Three things make that possible:
Every request returns headers that record what region it traversed, what jurisdiction's law applied, and which provider handled it. Your own SDK can log them, and your auditor can verify them. No "trust us" required.
If a request ever resolved to a different region than the deployment claimed, the headers would say so. There is nowhere for sovereignty to hide.
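A minimal sketch of what a client-side check of those headers could look like. The header names (x-ni-region, x-ni-jurisdiction, x-ni-provider) and the expected values are hypothetical placeholders, since the real names are not listed here; substitute whatever your responses actually carry.

```python
# Hypothetical header names and values, for illustration only.
EXPECTED = {
    "x-ni-region": "ca-central-1",
    "x-ni-jurisdiction": "CA",
}


def find_residency_mismatches(headers, expected):
    """Return {header: (expected, actual)} for each header that is missing or differs."""
    # HTTP header names are case-insensitive, so compare on lowercased keys.
    normalized = {name.lower(): value for name, value in headers.items()}
    mismatches = {}
    for name, want in expected.items():
        got = normalized.get(name.lower())
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


# A response that stayed in region produces no mismatches.
ok = find_residency_mismatches(
    {"X-NI-Region": "ca-central-1", "X-NI-Jurisdiction": "CA", "X-NI-Provider": "bedrock"},
    EXPECTED,
)

# A response that resolved elsewhere is reported with both values.
drifted = find_residency_mismatches(
    {"X-NI-Region": "us-east-1", "X-NI-Jurisdiction": "CA"},
    EXPECTED,
)
```

Logging the returned dictionary alongside each request ID would give an auditor a per-call record to compare against the deployment's stated region.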
Before each call dispatches, the gateway resolves the upstream hostname's IPs and checks them against the cloud provider's own published prefix range for that region. If the resolved IPs don't belong to the expected jurisdiction's range, we record a drift event and (in enforce mode) reject the request with HTTP 403 before any data leaves our infrastructure.
The source of truth is the cloud provider's public IP-range catalog, refreshed daily. Enforcement runs in-process and typically takes 5-15 ms. Drift events are visible in the operator's admin UI as a live count and click-through table.
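The pre-call check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the gateway's actual code: the function names are hypothetical, the prefixes are reserved documentation ranges rather than a real provider catalog, and the daily catalog refresh is left out.

```python
import ipaddress
import socket


def resolve_ips(hostname, port=443):
    """Resolve a hostname to a sorted list of IP strings (requires network access)."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def ips_outside_prefixes(resolved_ips, allowed_prefixes):
    """Return the resolved IPs that fall in none of the allowed CIDR prefixes."""
    networks = [ipaddress.ip_network(prefix) for prefix in allowed_prefixes]
    outside = []
    for ip in resolved_ips:
        addr = ipaddress.ip_address(ip)
        in_range = any(
            addr.version == net.version and addr in net for net in networks
        )
        if not in_range:
            outside.append(ip)
    return outside


def check_drift(resolved_ips, allowed_prefixes, enforce=True):
    """Decide whether to proceed. A status of None means the call may dispatch."""
    outside = ips_outside_prefixes(resolved_ips, allowed_prefixes)
    if not outside:
        return {"drift": False, "status": None, "outside": []}
    # Drift is always recorded; only enforce mode turns it into a rejection.
    return {"drift": True, "status": 403 if enforce else None, "outside": outside}


# Stand-in for one region's published prefixes (a documentation range).
region_prefixes = ["192.0.2.0/24"]

in_region = check_drift(["192.0.2.10"], region_prefixes)
out_of_region = check_drift(["198.51.100.7"], region_prefixes)
observe_only = check_drift(["198.51.100.7"], region_prefixes, enforce=False)
```

In a real gateway the output of resolve_ips would feed check_drift before the upstream call is made, so a rejection happens before any request body is sent.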
A deployment tagged for Canadian residency is enforced at the routing layer. If the underlying region or SKU resolves to a different jurisdiction, the deployment is REFUSED and serves no traffic, rather than quietly routing to the wrong region. Operators see the mismatch in an admin page; customers see a clear error pointing at a known-good alternative.
"0% markup" is hand-waving in the LLM space. Ours is auditable. Every single request stamps two cost components into your portal: what the upstream provider charged us, and what Northern Inference charged you. Sum either column. They reconcile.
Our model pricing is cross-validated against five authoritative sources: AWS Pricing API, Azure Retail Prices, OpenRouter, Helicone, and the litellm-live catalog. They agree on most rows; when they don't, our hourly sanity worker flags the divergence and emails the operator BEFORE customers see a wrong invoice. If a stored rate diverges by more than 2× from any authoritative source, an alert fires.
At the end of every billing cycle, an automated job joins our request-log charges against authoritative rates and looks for systematic under-billing — periods where our rates were wrong and we ate the difference. The findings live in an admin report. Stefan reviews them. Customers don't have to.
When a new model rate is added, the gate refuses anything more than 2× off comparable rows. This catches accidental overrides at the moment they'd cause damage rather than at end-of-month.
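A minimal sketch of a 2× gate of the kind described above. It assumes "more than 2× off" is symmetric (too high or too low both count) and that a ratio of exactly 2 still passes; the function names and sample rates are hypothetical.

```python
def diverges(stored_rate, reference_rate, max_ratio=2.0):
    """True if the two rates differ by more than max_ratio in either direction."""
    if stored_rate <= 0 or reference_rate <= 0:
        # A zero or negative rate cannot be compared as a ratio; treat it as suspect.
        return True
    ratio = max(stored_rate, reference_rate) / min(stored_rate, reference_rate)
    return ratio > max_ratio


def admit_rate(new_rate, comparable_rates, max_ratio=2.0):
    """Accept a new rate only if it is within max_ratio of every comparable row."""
    return not any(diverges(new_rate, ref, max_ratio) for ref in comparable_rates)


# Made-up per-million-token rates for comparable models.
comparable = [2.0, 2.5]

plausible = admit_rate(3.0, comparable)       # 1.5x and 1.2x off: admitted
typo_like = admit_rate(30.0, comparable)      # 15x and 12x off: refused
```

The same diverges helper could back the hourly check against external price sources, with an alert raised wherever it returns True.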
Export request_logs for any period via the portal. SUM the provider_cost_cents column: that's the wholesale cost. SUM the ni_fee_cents column: that's our charge. Add them. They match the line item on your statement to the cent.
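The recomputation above can be sketched with an in-memory SQLite table standing in for an exported log. The table name and the two cost columns come from the text; the created_at column, the date filter, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE request_logs ("
    "created_at TEXT, provider_cost_cents INTEGER, ni_fee_cents INTEGER)"
)
# Made-up rows: two requests in January, one in February.
conn.executemany(
    "INSERT INTO request_logs VALUES (?, ?, ?)",
    [("2025-01-03", 120, 6), ("2025-01-15", 340, 17), ("2025-02-01", 999, 50)],
)


def reconcile(conn, start, end):
    """Return (provider_total, fee_total, grand_total) in cents for [start, end)."""
    provider_total, fee_total = conn.execute(
        "SELECT COALESCE(SUM(provider_cost_cents), 0), "
        "       COALESCE(SUM(ni_fee_cents), 0) "
        "FROM request_logs WHERE created_at >= ? AND created_at < ?",
        (start, end),
    ).fetchone()
    return provider_total, fee_total, provider_total + fee_total


january = reconcile(conn, "2025-01-01", "2025-02-01")
```

Keeping amounts as integer cents avoids floating-point rounding, which matters when the goal is to match a statement line item to the cent.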
Most pitches in our space claim coverage of every threat model. We document our refusals out loud, with the reasoning. Reading our deferred-decision log is itself a trust signal.
An example. Our jurisdiction drift detection has two enforcement vectors: pre-call DNS+IP-geo verification (shipped) and HTTP 30x redirect interception during the upstream call (deferred by decision). The redirect interception addresses a contrived threat model. Vertex's regional load balancer uses Anycast origin failover, not HTTP redirects. Bedrock's control plane uses regional endpoints with no cross-region redirects. Azure Cognitive Services the same. We had no realistic way to test the deferred half against actual drift, and implementing it would have required deeper access to a black-box dependency than we'd want to take on for a threat that doesn't exist in production. So we shipped the half that maps to the real failure mode (DNS pointing at unexpected IPs after a regional reroute) and wrote down the choice. If a customer audit asks for in-flight redirect interception, or if we ever observe a real redirect-based drift event in the wild, we'll reopen.
The full deferred-decision log lives in our internal docs. We will publish a public summary when we reach the SOC 2 audit gate; until then, this and other examples will surface here as they come up.
audit_logs table with category, severity, IP, user-agent, and full detail JSONB. Filterable and exportable.
We're not SOC 2 Type II certified yet. The roadmap is gated on customer demand: when an enterprise prospect asks, we engage Drata or Vanta for continuous evidence collection and start the 6-month observation window, with an A-LIGN or Schellman audit at the end. Today, our compliance posture inherits from the cloud providers we run on. AWS's SOC 2 Type II, ISO 27001, and PCI-DSS attestations are downloadable from AWS Artifact, and we mirror them quarterly into our own compliance archive.
HIPAA BAAs aren't yet signed. They are free on AWS / Azure / GCP, but we will wait to sign until a healthcare customer asks. Same for ISO 27001 and the EU AI Act framework.
Email trust@northerninference.ca for compliance questions, vendor reviews, or audit-evidence requests. Email abuse@northerninference.ca if you've discovered a security issue.