Integrations

Agent frameworks

RateGuard wraps the HTTP client your LLM SDK already uses, so it plugs into any framework that lets you pass a custom HTTP client or fetch function. Each integration below is a one-line change, verified against the framework's official documentation.

Why integrate at the wire, not in the framework

Framework token counting is unreliable today: LangChain reports incorrect counts in streaming mode (langchain#30429), and CrewAI's token_usage disagrees with provider counts. RateGuard counts below the framework, at the transport layer, so its numbers are exactly what the provider returned. Budgets, circuit breakers, and fallback are applied at the same layer, with no extra framework code.
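To make "counting at the wire" concrete, here is a minimal sketch of reading usage straight from a provider response. It is not RateGuard's implementation. It assumes an OpenAI-style Chat Completions payload with a top-level `usage` object, and, for streaming, that the request set `stream_options={"include_usage": True}` so usage arrives in a final chunk. The function names `usage_from_json` and `usage_from_sse` are hypothetical.

```python
import json
from typing import Iterable, Optional


def usage_from_json(body: bytes) -> Optional[dict]:
    """Return the `usage` object from a non-streaming response body, if present."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        return None
    usage = payload.get("usage")
    return usage if isinstance(usage, dict) else None


def usage_from_sse(lines: Iterable[str]) -> Optional[dict]:
    """Scan SSE lines and return the last `usage` object seen, if any.

    Assumption: OpenAI-style events of the form `data: {...}`, terminated
    by `data: [DONE]`.
    """
    usage = None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        try:
            chunk = json.loads(data)
        except json.JSONDecodeError:
            continue  # this sketch ignores malformed events
        if isinstance(chunk, dict) and isinstance(chunk.get("usage"), dict):
            usage = chunk["usage"]
    return usage
```

Because both functions read only the bytes the provider sent, the result does not depend on how a framework aggregates or re-estimates tokens.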

LangChain / LangGraph (Python)

Python

from langchain_openai import ChatOpenAI
from rateguard import RateGuard

rg = RateGuard(preset="agent-orchestrator")

llm = ChatOpenAI(
    model="gpt-4o",
    http_client=rg.wrap_httpx_client(),          # sync path
    http_async_client=rg.wrap_httpx_async_client(),  # async path
)
# Use llm inside any LangGraph graph — every call is budgeted and metered.

OpenAI Agents SDK (Python)

Python — one global line

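from agents import set_default_openai_client
from openai import AsyncOpenAI
from rateguard import RateGuard

rg = RateGuard(preset="agent-orchestrator")

# Agents that use the default client now send every request through RateGuard.
set_default_openai_client(
    AsyncOpenAI(http_client=rg.wrap_httpx_async_client())
)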

Vercel AI SDK (TypeScript)

TypeScript — provider fetch is the official middleware surface

import { generateText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';
import { RateGuard } from '@varbees/rateguard-node';

const rg = new RateGuard({ preset: 'agent-orchestrator' });
const openai = createOpenAI({ fetch: rg.wrapFetch() });

const prompt = 'Say hello in one sentence.';
const { text } = await generateText({ model: openai('gpt-4o'), prompt });

The same fetch option works for createAnthropic, createGroq, and every OpenAI-compatible AI SDK provider, since they all accept fetch. Mastra models are AI SDK providers, so the same line covers Mastra.

Pydantic AI (Python)

Python

from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from rateguard import RateGuard

rg = RateGuard(preset="agent-orchestrator")

model = OpenAIModel(
    "gpt-4o",
    provider=OpenAIProvider(http_client=rg.wrap_httpx_async_client()),
)

Go frameworks

Go — the pattern is universal

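rg := rateguard.New(rateguard.Config{Preset: "agent-orchestrator"})
httpClient := rg.WrapClient(&http.Client{})

// Each SDK ships its own option package, so alias the two imports
// (the aliases openaiopt and anthropicopt are illustrative).
// Variables are named so they do not shadow the openai package.
openaiClient := openai.NewClient(openaiopt.WithHTTPClient(httpClient))
claudeClient := anthropic.NewClient(anthropicopt.WithHTTPClient(httpClient))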

CrewAI — honest status

Not yet

CrewAI's native provider path does not currently expose custom HTTP client injection (crewAI#5139). We are tracking client-injection support and will publish a recipe the day it lands. Pointing CrewAI's LiteLLM fallback at infrastructure you control is a proxy pattern, which RateGuard does not recommend.

What every integration gets

Capability	How
Real token usage per call	Extracted from the provider's response — JSON and SSE streaming
Token budgets (hr/day/mo)	Scoped `{tenant}:{provider}:{model}:outbound`, reserve → commit
Per-provider circuit breakers	An OpenAI outage doesn't trip DeepSeek
Enforcement	Synthesized provider-native 429/503 with `Retry-After` — SDK retry logic just works
Fallback	OpenAI-compatible providers, credential-isolated
Pre-flight queries	MCP tools — agents ask before they spend
Metrics	Prometheus `/metrics`: outbound calls, fallbacks, tokens consumed
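
The reserve → commit flow and the scoped budget key from the table can be sketched as follows. This is an illustration under stated assumptions, not RateGuard's code: `TokenBudget`, `BudgetExceeded`, and `scope_key` are hypothetical names, the window is a simple fixed window, and the `retry_after` value stands in for what a synthesized 429 would carry in `Retry-After`.

```python
import math
import time
from dataclasses import dataclass, field
from typing import Optional


def scope_key(tenant: str, provider: str, model: str) -> str:
    """Build a budget key in the `{tenant}:{provider}:{model}:outbound` shape."""
    return f"{tenant}:{provider}:{model}:outbound"


class BudgetExceeded(Exception):
    """Raised when a reservation would exceed the budget (hypothetical type)."""

    def __init__(self, retry_after: int):
        super().__init__(f"token budget exhausted; retry after {retry_after}s")
        self.retry_after = retry_after


@dataclass
class TokenBudget:
    limit: int                      # tokens allowed per window
    window_seconds: float = 3600.0  # fixed window length (assumption)
    committed: int = 0              # tokens the provider actually reported
    reserved: int = 0               # estimates held for in-flight calls
    window_start: float = field(default_factory=time.monotonic)

    def _roll(self, now: float) -> None:
        # Start a new window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.committed = 0  # in-flight reservations carry over

    def reserve(self, estimate: int, now: Optional[float] = None) -> None:
        """Hold `estimate` tokens before the request is sent."""
        now = time.monotonic() if now is None else now
        self._roll(now)
        if self.committed + self.reserved + estimate > self.limit:
            remaining = self.window_start + self.window_seconds - now
            raise BudgetExceeded(retry_after=max(1, math.ceil(remaining)))
        self.reserved += estimate

    def commit(self, estimate: int, actual: int) -> None:
        """Release the hold and record the usage the provider returned."""
        self.reserved -= estimate
        self.committed += actual


budgets = {scope_key("acme", "openai", "gpt-4o"): TokenBudget(limit=1000)}
```

Reserving an estimate up front keeps concurrent calls from overshooting the limit together, and committing the provider-reported count afterwards returns any unused headroom to the budget.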