Back to Northern Inference

Pricing

Live per-token rates for every model on the platform. Updated automatically from upstream provider catalogs.

Per-request transaction model. Each API request is a discrete transaction priced at the rate captured by our pricing service at the moment of metering. Rates listed here are informational and refresh from upstream catalogs periodically (up to 24 hours of lag). Customers are charged at the rate in effect when the request is metered, not the rate displayed at submission time. See Terms §3 for the full description, including the manifest-error and rate-source clauses.
Model Provider Tier Residency Input CA$/M Output CA$/M Cache read CA$/M Cache write CA$/M
Loading…