LangChain / LlamaIndex

Use Northern Inference from LangChain / LlamaIndex

LangChain (Python)

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="anthropic/claude-sonnet-4.5",
    api_key="ni_live_YOUR_KEY_HERE",
    base_url="https://northerninference.ca/v1",
    model_kwargs={
        "extra_body": {"privacy_tier": "managed_canadian_cloud"},
    },
)

resp = llm.invoke("What model are you?")
print(resp.content)

LangChain (JS/TS)

import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  model: "anthropic/claude-sonnet-4.5",
  apiKey: process.env.NI_API_KEY,
  configuration: {
    baseURL: "https://northerninference.ca/v1",
  },
  modelKwargs: {
    // In JS, modelKwargs fields are sent as top-level request-body params
    // (openai-node has no extra_body wrapper, unlike the Python SDK).
    privacy_tier: "managed_canadian_cloud",
  },
});

const resp = await llm.invoke("What model are you?");
console.log(resp.content);

LlamaIndex

# Use OpenAILike (pip install llama-index-llms-openai-like) for
# OpenAI-compatible endpoints: the plain OpenAI class rejects model
# names it can't find in OpenAI's own catalogue.
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="anthropic/claude-sonnet-4.5",
    api_key="ni_live_YOUR_KEY_HERE",
    api_base="https://northerninference.ca/v1",
    is_chat_model=True,
    additional_kwargs={
        "extra_body": {"privacy_tier": "managed_canadian_cloud"},
    },
)

print(llm.complete("What model are you?"))

Capturing the custody chain in your app

Every NI response carries custody headers. LangChain doesn't surface response headers through invoke(), but ChatOpenAI accepts a custom httpx client, so you can attach a response event hook that records them as calls go through:

import httpx
from langchain_openai import ChatOpenAI

custody = {}

def capture_custody(response: httpx.Response) -> None:
    # Record NI custody headers from each response passing through the client.
    custody["path"] = response.headers.get("x-ni-custody-path", "")
    custody["request_id"] = response.headers.get("x-ni-request-id", "")
    custody["jurisdiction"] = response.headers.get("x-ni-resolved-jurisdiction", "")

llm = ChatOpenAI(
    model="anthropic/claude-sonnet-4.5",
    api_key="ni_live_YOUR_KEY_HERE",
    base_url="https://northerninference.ca/v1",
    http_client=httpx.Client(event_hooks={"response": [capture_custody]}),
)

resp = llm.invoke("What model are you?")
print(custody["request_id"], custody["path"])

Or, simpler: call GET /api/usage/custody/{request_id} after each request to pull the full chain. The request_id is also available on failures, via the x-ni-request-id header on 4xx/5xx responses.
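A minimal stdlib-only sketch of that polling approach (the endpoint path and bearer auth follow the patterns above; the shape of the returned JSON is not specified here, so the helper just hands back the decoded body):

```python
import json
import urllib.request

BASE_URL = "https://northerninference.ca"

def custody_url(request_id: str) -> str:
    # Build the custody-chain lookup URL for a given x-ni-request-id.
    return f"{BASE_URL}/api/usage/custody/{request_id}"

def fetch_custody_chain(request_id: str, api_key: str) -> dict:
    # Pull the full custody chain for one request and decode the JSON body.
    req = urllib.request.Request(
        custody_url(request_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Store the returned document alongside the request_id you captured from the response (or error) headers, and you have an auditable record per call.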


Source: tests/user_run_tests/integrations/langchain.md. Spot a problem? Let us know.