Streaming Modes, Support System & SDK Improvements
Two new streaming modes, a full AI-powered support system, and SDK enhancements across the board.
Streaming Modes
- Buffered mode (default) — Smooth ~100ms paced chunks for polished UX. Set `stream_mode: "buffered"` or omit for default behavior.
- Realtime mode — Minimal ~10ms buffering for lowest-latency delivery. Set `stream_mode: "realtime"` for interactive applications.
- Both modes work with the native `/api/v1/chat` endpoint and the OpenAI-compatible `/v1/chat/completions` endpoint.
- SDK support — Python: `client.chat_stream("...", stream_mode="realtime")` / Node: `client.chatStream("...", { streamMode: "realtime" })`.
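The difference between the two modes is pacing, not content. A minimal sketch of the idea, where tokens arriving within the same delivery window are grouped into one chunk (the windowing logic here is illustrative, not RoutePlex's actual implementation):

```python
def buffer_chunks(events, window_s):
    """Group (timestamp, token) events into per-window chunks.

    Models buffered mode (~100ms windows): tokens arriving close
    together are delivered as one chunk. Realtime mode (~10ms windows)
    delivers most tokens individually as soon as they arrive.
    """
    chunks, current, window_end = [], [], None
    for ts, tok in events:
        # Start a new chunk when the current delivery window has elapsed.
        if window_end is None or ts >= window_end:
            if current:
                chunks.append("".join(current))
            current, window_end = [], ts + window_s
        current.append(tok)
    if current:
        chunks.append("".join(current))
    return chunks

# Tokens arriving at 0, 30, 60, and 150 ms.
events = [(0.00, "Hel"), (0.03, "lo"), (0.06, ", wor"), (0.15, "ld!")]

print(buffer_chunks(events, 0.100))  # buffered → ['Hello, wor', 'ld!']
print(buffer_chunks(events, 0.010))  # realtime → ['Hel', 'lo', ', wor', 'ld!']
```

Either way the full text is identical; buffered trades a little latency for smoother rendering, while realtime surfaces each token as fast as possible.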
AI-Powered Support
- Live chat — Built-in support chat with AI-first responses powered by RoutePlex's own models. Get instant answers about the product, docs, and pricing.
- Human escalation — Escalate to a human agent at any time if the AI can't resolve your question.
- Conversation history — Your past support conversations are saved across sessions so you never lose context.
SDK Improvements
- Streaming in both SDKs — `chat_stream()` (Python) and `chatStream()` (Node) with buffered/realtime mode support.
- Richer model metadata — `list_models()` now returns `pricing`, `capabilities`, `aliases`, `deprecated`, and `deprecation_date` fields.
- New examples — Streaming examples added to Python SDK, Node SDK, and raw API (Python, JS, TypeScript, cURL).
- Free endpoints example — Node SDK now includes a `free_endpoints.mjs` example covering cost estimation, prompt enhancement, and model listing.
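The richer metadata makes it easy to filter models client-side, for example to skip deprecated ones when choosing a default. A sketch using records shaped like the fields `list_models()` now returns (the model IDs and values below are invented for illustration):

```python
# Hypothetical records with the new metadata fields; in practice these
# would come from client.list_models().
models = [
    {"id": "rp-small", "pricing": {"input": 0.10}, "capabilities": ["chat"],
     "aliases": ["small"], "deprecated": False, "deprecation_date": None},
    {"id": "rp-legacy", "pricing": {"input": 0.20}, "capabilities": ["chat"],
     "aliases": [], "deprecated": True, "deprecation_date": "2025-06-01"},
]

# Keep only non-deprecated models.
active = [m["id"] for m in models if not m["deprecated"]]
print(active)  # ['rp-small']
```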
Tracking & Observability
- Routing mode tracking — Every request now logs whether it used `auto` or `manual` routing.
- Stream mode tracking — Every request now logs whether streaming was used.
- Exports include mode data — CSV and JSON exports now include `routing_mode` and `streamed` fields.
- API key last used — The "Last Used" timestamp on API keys now updates correctly after each request.
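With the new export fields, usage breakdowns become a few lines of standard CSV handling. A sketch over a miniature export (the surrounding columns and values are made up; only `routing_mode` and `streamed` come from the changelog):

```python
import csv
import io

# A tiny illustrative CSV export including the new columns.
export = """\
request_id,model,routing_mode,streamed
req_1,rp-small,auto,true
req_2,rp-small,manual,false
req_3,rp-large,auto,true
"""

rows = list(csv.DictReader(io.StringIO(export)))

# Count requests that used auto routing with streaming enabled.
streamed_auto = sum(
    1 for r in rows if r["routing_mode"] == "auto" and r["streamed"] == "true"
)
print(streamed_auto)  # 2
```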