Plexus

A unified LLM API gateway handling protocol translation, failover, and usage tracking. Maintained by Matt Cowger (@mcowger on the Synthetic Discord); that Discord has a #plexus channel thread dedicated to Plexus support.

Currently the most practical way to use multiple AI providers through a single endpoint without rewriting client code per API format. Routes OpenAI, Anthropic, Gemini, and any OpenAI-compatible provider through one interface.

Also supports OAuth-backed providers (GitHub Copilot, Claude, Codex, Gemini CLI) without API key management. Currently the only gateway with built-in vision fallthrough for non-vision models.


Vision Fallthrough
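
Plexus is described above as the only gateway with built-in vision fallthrough for non-vision models. A hypothetical sketch of the idea (this is a guess at the mechanism, not Plexus's actual implementation): when the target model cannot accept images, image parts are first described by a vision-capable model, and the resulting text replaces the image in the request.

```python
# Hypothetical illustration of vision fallthrough. The function names and
# message shape are assumptions, not Plexus's actual code.

def apply_vision_fallthrough(messages, model_supports_vision, describe_image):
    """Replace image parts with text descriptions for non-vision models."""
    if model_supports_vision:
        return messages
    out = []
    for m in messages:
        content = m["content"]
        if isinstance(content, list):  # multimodal content parts
            parts = []
            for p in content:
                if p.get("type") == "image_url":
                    # Ask a vision-capable model to describe the image,
                    # then pass the description through as plain text.
                    parts.append({"type": "text", "text": describe_image(p["image_url"])})
                else:
                    parts.append(p)
            m = {**m, "content": parts}
        out.append(m)
    return out

msgs = [{"role": "user", "content": [
    {"type": "text", "text": "What is in this picture?"},
    {"type": "image_url", "image_url": "https://example.com/cat.png"},
]}]
result = apply_vision_fallthrough(msgs, False, lambda url: "a photo of a cat")
```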

Routing & Selection

Model aliases backed by multiple providers:

Selector       Behavior
-----------    -------------------------------------------
random         Distribute across healthy targets (default)
in_order       Failover sequence
cost           Cheapest provider wins
performance    Highest tokens/sec
latency        Lowest time-to-first-token

priority: api_match enables pass-through for same-format providers, skipping transformation overhead.
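
The selector policies above can be sketched roughly as follows. This is an illustrative model, not Plexus's implementation; the target fields (`cost_per_mtok`, `tokens_per_sec`, `ttft_ms`) are assumed names for whatever metrics the gateway tracks.

```python
# Sketch of selector strategies over a model alias's healthy targets.
import random

def select(targets, selector="random"):
    """Pick one provider from the alias's healthy targets per the policy."""
    healthy = [t for t in targets if t["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy targets")
    if selector == "random":        # distribute across healthy targets
        return random.choice(healthy)
    if selector == "in_order":      # failover sequence: first healthy wins
        return healthy[0]
    if selector == "cost":          # cheapest provider wins
        return min(healthy, key=lambda t: t["cost_per_mtok"])
    if selector == "performance":   # highest tokens/sec
        return max(healthy, key=lambda t: t["tokens_per_sec"])
    if selector == "latency":       # lowest time-to-first-token
        return min(healthy, key=lambda t: t["ttft_ms"])
    raise ValueError(f"unknown selector: {selector}")

targets = [
    {"name": "openai",    "healthy": False, "cost_per_mtok": 10.0, "tokens_per_sec": 80,  "ttft_ms": 300},
    {"name": "anthropic", "healthy": True,  "cost_per_mtok": 15.0, "tokens_per_sec": 60,  "ttft_ms": 450},
    {"name": "gemini",    "healthy": True,  "cost_per_mtok": 7.0,  "tokens_per_sec": 120, "ttft_ms": 250},
]
print(select(targets, "cost")["name"])      # gemini
print(select(targets, "in_order")["name"])  # anthropic (first healthy)
```

Note that `in_order` only reaches the second target when the first is unhealthy, which is what makes it a failover sequence rather than a load balancer.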

Protocol Translation

Bidirectional conversion between:

  1. OpenAI chat completions (/v1/chat/completions)
  2. OpenAI responses (/v1/responses), with full support: stateful multi-turn via previous_response_id, 7-day TTL storage, function calling
  3. Anthropic messages (/v1/messages)
  4. Google Gemini native format (/v1beta)

Handles streaming SSE and tool use normalization across all formats.
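
As a simplified sketch of one direction of this translation, consider mapping an OpenAI chat-completions request body to an Anthropic /v1/messages body. The field mapping below covers only the basics; the real transformer also handles tool use, images, and SSE re-framing. Default values and structure are assumptions based on the two public API shapes, not Plexus source.

```python
# One-way sketch: OpenAI chat-completions body -> Anthropic messages body.

def openai_to_anthropic(body: dict) -> dict:
    # Anthropic takes system prompts as a top-level field, not a message role.
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    out = {
        "model": body["model"],
        # max_tokens is required by the Anthropic API; 1024 is an arbitrary default.
        "max_tokens": body.get("max_tokens", 1024),
        "messages": [m for m in body["messages"] if m["role"] != "system"],
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    return out

req = {
    "model": "claude-alias",
    "messages": [
        {"role": "system", "content": "Be terse."},
        {"role": "user", "content": "Hi"},
    ],
}
converted = openai_to_anthropic(req)
```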

Deep Inspection


Quota Enforcement

Provider Cooldowns

MCP Proxy

Model Context Protocol server proxy with per-request session isolation.
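
Per-request session isolation means each incoming request gets its own MCP session state, torn down when the request completes, so no state leaks between clients. A minimal sketch of that pattern (names here are hypothetical, not the proxy's actual API):

```python
# Sketch of per-request session isolation using a context manager.
import uuid
from contextlib import contextmanager

class SessionStore:
    def __init__(self):
        self.active = {}

    @contextmanager
    def session(self):
        sid = uuid.uuid4().hex
        self.active[sid] = {"id": sid, "state": {}}  # fresh state per request
        try:
            yield self.active[sid]
        finally:
            del self.active[sid]  # torn down even if the request errors

store = SessionStore()
with store.session() as s:
    s["state"]["tool_calls"] = 1   # state is private to this request
assert store.active == {}          # nothing survives the request
```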

Encryption at Rest