Quickstart

Five minutes to a protected API and a metered LLM client. Pick your language once — every code block on this site remembers it.

1. Install

go get github.com/varbees/rateguard/packages/sdk-go

2. Rate limit your API (inbound)

Create a RateGuard instance from a preset and mount the middleware for your framework:

import (
	"net/http"

	rateguard "github.com/varbees/rateguard/packages/sdk-go"
)

rg := rateguard.New(rateguard.Config{Preset: "streaming-llm"})

// net/http: wrap your existing handler (myHandler)
http.Handle("/", rg.HTTPMiddleware(myHandler))
// Prometheus metrics
http.Handle("/metrics", rg.Metrics())

3. Track your LLM spend (outbound)

Wrap the HTTP client your LLM SDK already uses. Every call gets budgeted, breaker-protected per provider, and metered with real token usage:

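// Wrap the client once and hand it to the SDK.
client := rg.WrapClient(&http.Client{})
// Named llm rather than openai so the variable does not shadow the package.
llm := openai.NewClient(option.WithHTTPClient(client))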

Tip

This is the headline feature. Anthropic, Gemini, Vertex, Azure OpenAI, Bedrock, and 16 OpenAI-compatible hosts (DeepSeek, Groq, vLLM, …) are detected out of the box. See Track LLM spend for enforce vs observe modes and fallback chains.

4. Let your agents ask first (MCP)

Expose RateGuard's five pre-flight tools to any MCP client — so agents check limits before they spend:

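// Zero-dependency MCP stdio server (JSON-RPC 2.0)
rg := rateguard.New(rateguard.Config{Preset: "agent-orchestrator"})
// Assumes ServeMCP returns an error; report it instead of discarding it.
if err := rg.ServeMCP(context.Background(), os.Stdin, os.Stdout); err != nil {
	log.Fatal(err)
}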

Then point Claude Code, Claude Desktop, or Cursor at it — full walkthrough in Agents & MCP.