Agent frameworks

RateGuard wraps the HTTP client your LLM SDK already uses, so it plugs into any framework that lets you pass a custom HTTP client or fetch function. Each integration below is a single changed line, verified against that framework's official documentation.

Why integrate at the wire, not in the framework

Framework token counting is unreliable today: LangChain reports incorrect counts in streaming mode (langchain#30429) and CrewAI's token_usage disagrees with provider counts. RateGuard counts below the framework at the transport layer — the numbers are whatever the provider actually returned. Budgets, breakers, and fallback come along for free.
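To make "counting at the transport layer" concrete, here is a minimal sketch of reading token usage straight from a provider response, for both a plain JSON body and an SSE stream. It is not RateGuard's implementation: the function names are hypothetical, and it assumes OpenAI-style payloads where usage appears under a top-level `usage` key and streams end with `data: [DONE]`.

```python
import json
from typing import Iterable, Optional


def usage_from_json(body: bytes) -> Optional[dict]:
    """Return the usage object from a non-streaming JSON response body."""
    payload = json.loads(body)
    return payload.get("usage")


def usage_from_sse(lines: Iterable[str]) -> Optional[dict]:
    """Scan SSE lines and keep the last non-empty usage object seen."""
    usage = None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel (assumed)
        chunk = json.loads(data)
        if chunk.get("usage"):  # intermediate chunks carry "usage": null
            usage = chunk["usage"]
    return usage


# Hypothetical sample payloads in the assumed OpenAI-style shape.
json_body = b'{"id": "x", "usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}'
sse_lines = [
    'data: {"choices": [{"delta": {"content": "Hi"}}], "usage": null}',
    "",
    'data: {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}',
    "",
    "data: [DONE]",
]

print(usage_from_json(json_body))
print(usage_from_sse(sse_lines))
```

Because both functions only look at bytes the provider sent, the result does not depend on how a framework tallies tokens on top.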

LangChain / LangGraph (Python)

Python
from langchain_openai import ChatOpenAI
from rateguard import RateGuard

rg = RateGuard(preset="agent-orchestrator")

llm = ChatOpenAI(
    model="gpt-4o",
    http_client=rg.wrap_httpx_client(),          # sync path
    http_async_client=rg.wrap_httpx_async_client(),  # async path
)
# Use llm inside any LangGraph graph — every call is budgeted and metered.

OpenAI Agents SDK (Python)

Python — one global line
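from agents import set_default_openai_client
from openai import AsyncOpenAI
from rateguard import RateGuard

rg = RateGuard(preset="agent-orchestrator")

# Route every agent's OpenAI calls through the RateGuard-wrapped client.
set_default_openai_client(
    AsyncOpenAI(http_client=rg.wrap_httpx_async_client())
)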

Vercel AI SDK (TypeScript)

TypeScript — provider fetch is the official middleware surface
import { generateText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';
import { RateGuard } from '@varbees/rateguard-node';

const rg = new RateGuard({ preset: 'agent-orchestrator' });
const openai = createOpenAI({ fetch: rg.wrapFetch() });

const prompt = 'Summarize the latest deploy log.'; // any prompt string
const { text } = await generateText({ model: openai('gpt-4o'), prompt });

The same option works for createAnthropic, createGroq, and the OpenAI-compatible AI SDK providers, since each accepts a custom fetch. Mastra models are AI SDK providers, so the same line covers Mastra.

Pydantic AI (Python)

Python
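from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from rateguard import RateGuard

rg = RateGuard(preset="agent-orchestrator")

# The provider takes an async httpx client; pass the wrapped one.
model = OpenAIModel(
    "gpt-4o",
    provider=OpenAIProvider(http_client=rg.wrap_httpx_async_client()),
)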

Go frameworks

Go — the pattern is universal
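rg := rateguard.New(rateguard.Config{Preset: "agent-orchestrator"})
httpClient := rg.WrapClient(&http.Client{})

// Each SDK ships its own option package, so alias them on import, e.g.
// openaiopt "github.com/openai/openai-go/option" and
// anthropicopt "github.com/anthropics/anthropic-sdk-go/option".
// Variable names avoid shadowing the openai and anthropic packages.
openaiClient := openai.NewClient(openaiopt.WithHTTPClient(httpClient))
claudeClient := anthropic.NewClient(anthropicopt.WithHTTPClient(httpClient))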

CrewAI — honest status

Not yet

CrewAI's native provider path does not currently expose custom HTTP client injection (crewAI#5139). We track client-injection support and will publish a recipe the day it lands. Pointing CrewAI's LiteLLM fallback at infrastructure you control is a proxy pattern — not what RateGuard recommends.

What every integration gets

| Capability | How |
| --- | --- |
| Real token usage per call | Extracted from the provider's response (JSON and SSE streaming) |
| Token budgets (hr/day/mo) | Scoped {tenant}:{provider}:{model}:outbound, reserve → commit |
| Per-provider circuit breakers | An OpenAI outage doesn't trip DeepSeek |
| Enforcement | Synthesized provider-native 429/503 with Retry-After, so SDK retry logic just works |
| Fallback | OpenAI-compatible providers, credential-isolated |
| Pre-flight queries | MCP tools: agents ask before they spend |
| Metrics | Prometheus /metrics: outbound calls, fallbacks, tokens consumed |
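
The reserve → commit budget flow listed above can be sketched as follows. This is an illustration under stated assumptions, not RateGuard's code: `HourlyBudget`, `budget_key`, and the fixed one-hour window are hypothetical, and the only detail taken from the table is the `{tenant}:{provider}:{model}:outbound` key scope. A denied reservation returns the number of seconds a synthesized 429 could advertise in Retry-After.

```python
import math
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


def budget_key(tenant: str, provider: str, model: str) -> str:
    """Build the scope key in the {tenant}:{provider}:{model}:outbound form."""
    return f"{tenant}:{provider}:{model}:outbound"


@dataclass
class HourlyBudget:
    """Hypothetical fixed-window token budget, tracked per scope key."""
    limit: int                                   # tokens allowed per key per window
    window_seconds: int = 3600
    used: Dict[str, int] = field(default_factory=dict)
    window_start: float = field(default_factory=time.time)

    def _roll(self, now: float) -> None:
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.used.clear()
            self.window_start = now

    def reserve(self, key: str, estimate: int, now: Optional[float] = None) -> Optional[int]:
        """Hold `estimate` tokens before the call.

        Returns None when the reservation fits, otherwise the seconds
        until the window resets (a candidate Retry-After value).
        """
        now = time.time() if now is None else now
        self._roll(now)
        if self.used.get(key, 0) + estimate > self.limit:
            remaining = self.window_start + self.window_seconds - now
            return max(1, math.ceil(remaining))
        self.used[key] = self.used.get(key, 0) + estimate
        return None

    def commit(self, key: str, estimate: int, actual: int) -> None:
        """After the call, swap the estimate for the provider-reported count."""
        self.used[key] = self.used.get(key, 0) - estimate + actual


budget = HourlyBudget(limit=1000, window_start=0.0)
key = budget_key("acme", "openai", "gpt-4o")

denied = budget.reserve(key, 600, now=10.0)     # fits, so nothing is returned
budget.commit(key, 600, 450)                    # provider reported 450 tokens
retry_after = budget.reserve(key, 600, now=20.0)  # 450 + 600 exceeds 1000
```

Reserving before the call and committing afterwards means concurrent requests cannot jointly overshoot the limit, while the final tally still reflects what the provider actually billed.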