Antharmaya Labs · Open Source · MIT
AI agents that know
their limits.
RateGuard is the first agent-native rate-limiting middleware. Go, Node.js, and Python SDKs with identical behavior. MCP tools let AI agents query their own limits before making calls. No proxy, no extra infrastructure, no added latency.
What makes it different
Agent-Native MCP Tools
5 MCP tools plus a zero-dependency JSON-RPC stdio server. Claude Code, Cursor, or any MCP client can query RateGuard before making calls. The agent checks its remaining quota first, so it can back off instead of running into 429 errors.
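As a rough sketch of what "query before calling" can look like on the wire: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The tool name `get_rate_limit_state` comes from this page; the argument name `key`, the helper names, and the assumption that the state arrives as a plain `{remaining, limit}` JSON object are illustrative guesses, not RateGuard's documented schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope for an MCP tool call.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  toolParams `json:"params"`
}

type toolParams struct {
	Name      string            `json:"name"`
	Arguments map[string]string `json:"arguments"`
}

// limitState mirrors the {remaining, limit} shape shown on this page.
type limitState struct {
	Remaining int `json:"remaining"`
	Limit     int `json:"limit"`
}

// buildStateQuery encodes a tools/call request for get_rate_limit_state.
// The "key" argument name is an assumption made for this sketch.
func buildStateQuery(id int, key string) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: toolParams{
			Name:      "get_rate_limit_state",
			Arguments: map[string]string{"key": key},
		},
	})
}

// shouldCall decodes a state payload and reports whether any quota remains.
func shouldCall(payload []byte) (bool, error) {
	var s limitState
	if err := json.Unmarshal(payload, &s); err != nil {
		return false, err
	}
	return s.Remaining > 0, nil
}

func main() {
	req, err := buildStateQuery(1, "user-123")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))

	ok, err := shouldCall([]byte(`{"remaining": 47, "limit": 100}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("safe to call:", ok)
}
```

An agent would send the request over the stdio server, read the state back, and skip or delay the upstream call when `shouldCall` reports false.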
Outbound Transport Tracking
Inbound middleware guards your API. The outbound RoundTripper tracks real LLM spend. WrapClient() wraps http.Client — every OpenAI/Anthropic/Google call gets budgeted, traced, and metered.
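For readers unfamiliar with the pattern, here is a minimal sketch of how an outbound `http.RoundTripper` wrapper can observe every call made through an `http.Client`. It uses only the Go standard library. `countingTransport`, `wrapClient`, and the stub transport are hypothetical names for illustration; this is not RateGuard's `WrapClient()` implementation, and it only counts requests per host rather than budgeting or tracing them.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// countingTransport is a hypothetical RoundTripper that counts requests per
// host, then hands the request to the wrapped transport unchanged.
type countingTransport struct {
	next   http.RoundTripper
	mu     sync.Mutex
	counts map[string]int
}

func (t *countingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	t.mu.Lock()
	t.counts[req.URL.Host]++
	t.mu.Unlock()
	return t.next.RoundTrip(req)
}

// Count reports how many requests were sent to host.
func (t *countingTransport) Count(host string) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.counts[host]
}

// wrapClient returns a copy of c whose transport is wrapped for counting.
func wrapClient(c *http.Client) (*http.Client, *countingTransport) {
	next := c.Transport
	if next == nil {
		next = http.DefaultTransport
	}
	ct := &countingTransport{next: next, counts: make(map[string]int)}
	wrapped := *c
	wrapped.Transport = ct
	return &wrapped, ct
}

// stubTransport answers every request with an empty 200 response so the
// example runs without network access.
type stubTransport struct{}

func (stubTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: http.StatusOK,
		Header:     make(http.Header),
		Body:       http.NoBody,
		Request:    req,
	}, nil
}

// newStubbedClient builds a wrapped client on top of the stub transport.
func newStubbedClient() (*http.Client, *countingTransport) {
	return wrapClient(&http.Client{Transport: stubTransport{}})
}

// fetch issues one GET through client and closes the response body.
func fetch(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	client, ct := newStubbedClient()
	for i := 0; i < 3; i++ {
		if err := fetch(client, "http://llm.example.test/v1/chat"); err != nil {
			panic(err)
		}
	}
	fmt.Println("tracked calls:", ct.Count("llm.example.test"))
}
```

Because the wrapper sits at the transport layer, application code keeps using the client as before; a real implementation would add cost accounting and tracing at the same hook.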
Loop Detection
SHA-256 payload fingerprinting via X-Sequence-Depth header. Runaway agent loops get halted before they torch your budget. Bounded LRU cache — no memory leaks.
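One way to picture fingerprint-based loop detection is a small LRU keyed by the SHA-256 of each payload, halting once the same payload repeats too often. The sketch below assumes exactly that and nothing more: `loopDetector`, its capacity, and its repeat threshold are hypothetical, and it leaves out how the X-Sequence-Depth header feeds into the decision, since this page does not spell that out. It is also not safe for concurrent use without a mutex.

```go
package main

import (
	"container/list"
	"crypto/sha256"
	"fmt"
)

type fingerprint [sha256.Size]byte

type entry struct {
	key   fingerprint
	count int
}

// loopDetector is a hypothetical bounded LRU of payload fingerprints.
type loopDetector struct {
	capacity   int
	maxRepeats int
	order      *list.List // front = most recently seen
	items      map[fingerprint]*list.Element
}

func newLoopDetector(capacity, maxRepeats int) *loopDetector {
	if capacity < 1 {
		capacity = 1
	}
	return &loopDetector{
		capacity:   capacity,
		maxRepeats: maxRepeats,
		order:      list.New(),
		items:      make(map[fingerprint]*list.Element),
	}
}

// Observe fingerprints payload and reports whether the caller should halt
// because the same payload has now been seen more than maxRepeats times.
func (d *loopDetector) Observe(payload []byte) bool {
	key := fingerprint(sha256.Sum256(payload))
	if el, ok := d.items[key]; ok {
		e := el.Value.(*entry)
		e.count++
		d.order.MoveToFront(el)
		return e.count > d.maxRepeats
	}
	// New fingerprint: evict the least recently seen entry if full, so
	// memory use stays bounded by capacity.
	if d.order.Len() >= d.capacity {
		oldest := d.order.Back()
		delete(d.items, oldest.Value.(*entry).key)
		d.order.Remove(oldest)
	}
	d.items[key] = d.order.PushFront(&entry{key: key, count: 1})
	return false
}

func main() {
	d := newLoopDetector(128, 2)
	payload := []byte(`{"prompt":"retry the same step"}`)
	for i := 1; i <= 3; i++ {
		fmt.Printf("call %d halted: %v\n", i, d.Observe(payload))
	}
}
```

With a threshold of 2, the third identical payload is the first one flagged; distinct payloads only ever cost one cache slot each.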
Multi-Language Parity
Go, Node.js, Python — same token bucket algorithm, same APIs, same presets. Every feature claim has passing tests. 155 tests across three SDKs, all wired end-to-end.
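For context, a token bucket holds up to a fixed number of tokens, refills at a steady rate, and lets a request through only if it can take a token. The following is a generic textbook sketch, not RateGuard's code: `tokenBucket`, its fields, and the `at` helper (which stands in for a real clock so the demo is deterministic) are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// tokenBucket is a generic, single-goroutine token bucket.
type tokenBucket struct {
	capacity   float64   // maximum tokens the bucket can hold
	refillRate float64   // tokens added per second
	tokens     float64   // tokens currently available
	last       time.Time // time of the last refill
}

func newTokenBucket(capacity, refillPerSec float64, now time.Time) *tokenBucket {
	return &tokenBucket{
		capacity:   capacity,
		refillRate: refillPerSec,
		tokens:     capacity, // start full
		last:       now,
	}
}

// Allow refills the bucket for the time elapsed since the last call, then
// tries to take one token. It reports whether the request may proceed.
func (b *tokenBucket) Allow(now time.Time) bool {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.refillRate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// at returns a fixed instant offset by the given number of seconds, which
// keeps the example independent of the wall clock.
func at(seconds int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(seconds) * time.Second)
}

func main() {
	b := newTokenBucket(2, 1, at(0)) // burst of 2, refill 1 token per second
	fmt.Println(b.Allow(at(0)))      // takes the first token
	fmt.Println(b.Allow(at(0)))      // takes the second token
	fmt.Println(b.Allow(at(0)))      // bucket is empty
	fmt.Println(b.Allow(at(1)))      // one token has refilled
}
```

The capacity sets the allowed burst and the refill rate sets the sustained throughput, which is why the same two parameters can be shared across SDKs as presets.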
Zero Infrastructure
No proxy. No extra service. No third-party dependency. RateGuard runs inside your app process. Drop a middleware and every LLM call becomes transparent.
Provider Fallback
Automatic failover across OpenAI-compatible providers when one goes down. Circuit breakers open, provider chain routes to the next — transparent to your application.
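A simplified sketch of a provider chain guarded by circuit breakers follows. It assumes the most basic breaker possible: a provider is skipped once it has failed a set number of times in a row. The `provider` and `chain` types and the `failing` and `succeeding` helpers are hypothetical, and the half-open recovery step that production breakers use to retry a provider after a cooldown is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// provider is one upstream in the chain, with its own failure counter.
type provider struct {
	name     string
	call     func(prompt string) (string, error)
	failures int // consecutive failures; resets on success
}

// chain tries providers in order, skipping any whose breaker is open.
type chain struct {
	providers []*provider
	threshold int // consecutive failures that open a breaker
}

var errAllUnavailable = errors.New("all providers unavailable")

// Do returns the name of the provider that answered and its reply.
func (c *chain) Do(prompt string) (string, string, error) {
	for _, p := range c.providers {
		if p.failures >= c.threshold {
			continue // breaker open: route straight to the next provider
		}
		out, err := p.call(prompt)
		if err == nil {
			p.failures = 0
			return p.name, out, nil
		}
		p.failures++
	}
	return "", "", errAllUnavailable
}

// failing returns a provider call that always errors and counts its calls.
func failing(calls *int) func(string) (string, error) {
	return func(string) (string, error) {
		*calls++
		return "", errors.New("upstream down")
	}
}

// succeeding returns a provider call that always answers with reply.
func succeeding(reply string) func(string) (string, error) {
	return func(string) (string, error) { return reply, nil }
}

func main() {
	primaryCalls := 0
	c := &chain{
		threshold: 2,
		providers: []*provider{
			{name: "primary", call: failing(&primaryCalls)},
			{name: "secondary", call: succeeding("ok")},
		},
	}
	for i := 0; i < 3; i++ {
		name, out, err := c.Do("hello")
		fmt.Println(name, out, err)
	}
	fmt.Println("primary was tried", primaryCalls, "times")
}
```

After two consecutive failures the primary's breaker opens, so the third request goes directly to the secondary without paying for another failed attempt.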
One line to ship
// Track every outbound LLM call with one line
client := rg.WrapClient(&http.Client{})
// Agent queries its own limits before calling
get_rate_limit_state("user-123") → {remaining: 47, limit: 100}
// Stop runaway loops before they burn your budget
X-Sequence-Depth: 3 → SHA-256 fingerprint → halted
Open source. MIT license. Free forever.
Built by a solo founder in India. 155 commits, 155 tests, 3 languages. Every feature is wired and tested.
Star on GitHub↗