Streaming Modes, Support System & SDK Improvements
Two new streaming modes, a full AI-powered support system, and SDK enhancements across the board.
Streaming Modes
- Buffered mode (default) — Smooth ~100ms paced chunks for polished UX. Set `stream_mode: "buffered"` or omit for default behavior.
- Realtime mode — Minimal ~10ms buffering for lowest-latency delivery. Set `stream_mode: "realtime"` for interactive applications.
- Both modes work with the native `/api/v1/chat` endpoint and the OpenAI-compatible `/v1/chat/completions` endpoint.
- SDK support — Python: `client.chat_stream("...", stream_mode="realtime")` / Node: `client.chatStream("...", { streamMode: "realtime" })`.
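The difference between the two modes is pacing, not content. A minimal sketch of the idea, where tokens arriving within the same delivery window are grouped into one chunk (the windowing logic here is illustrative, not RoutePlex's actual implementation):

```python
def buffer_chunks(events, window_s):
    """Group (timestamp, token) events into per-window chunks.

    Models buffered mode (~100ms windows): tokens arriving close
    together are delivered as one chunk. Realtime mode (~10ms windows)
    delivers most tokens individually as soon as they arrive.
    """
    chunks, current, window_end = [], [], None
    for ts, tok in events:
        # Start a new chunk when the current delivery window has elapsed.
        if window_end is None or ts >= window_end:
            if current:
                chunks.append("".join(current))
            current, window_end = [], ts + window_s
        current.append(tok)
    if current:
        chunks.append("".join(current))
    return chunks

# Tokens arriving at 0, 30, 60, and 150 ms.
events = [(0.00, "Hel"), (0.03, "lo"), (0.06, ", wor"), (0.15, "ld!")]

print(buffer_chunks(events, 0.100))  # buffered → ['Hello, wor', 'ld!']
print(buffer_chunks(events, 0.010))  # realtime → ['Hel', 'lo', ', wor', 'ld!']
```

Either way the full text is identical; buffered trades a little latency for smoother rendering, while realtime surfaces each token as fast as possible.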
AI-Powered Support
- Live chat — Built-in support chat with AI-first responses powered by RoutePlex's own models. Get instant answers about the product, docs, and pricing.
- Human escalation — Escalate to a human agent at any time if the AI can't resolve your question.
- Conversation history — Your past support conversations are saved across sessions so you never lose context.
SDK Improvements
- Streaming in both SDKs — `chat_stream()` (Python) and `chatStream()` (Node) with buffered/realtime mode support.
- Richer model metadata — `list_models()` now returns `pricing`, `capabilities`, `aliases`, `deprecated`, and `deprecation_date` fields.
- New examples — Streaming examples added to Python SDK, Node SDK, and raw API (Python, JS, TypeScript, cURL).
- Free endpoints example — Node SDK now includes a `free_endpoints.mjs` example covering cost estimation, prompt enhancement, and model listing.
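The richer metadata makes it easy to filter models client-side, for example to skip deprecated ones when choosing a default. A sketch using records shaped like the fields `list_models()` now returns (the model IDs and values below are invented for illustration):

```python
# Hypothetical records with the new metadata fields; in practice these
# would come from client.list_models().
models = [
    {"id": "rp-small", "pricing": {"input": 0.10}, "capabilities": ["chat"],
     "aliases": ["small"], "deprecated": False, "deprecation_date": None},
    {"id": "rp-legacy", "pricing": {"input": 0.20}, "capabilities": ["chat"],
     "aliases": [], "deprecated": True, "deprecation_date": "2025-06-01"},
]

# Keep only non-deprecated models.
active = [m["id"] for m in models if not m["deprecated"]]
print(active)  # ['rp-small']
```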
Tracking & Observability
- Routing mode tracking — Every request now logs whether it used `auto` or `manual` routing.
- Stream mode tracking — Every request now logs whether streaming was used.
- Exports include mode data — CSV and JSON exports now include `routing_mode` and `streamed` fields.
- API key last used — The "Last Used" timestamp on API keys now updates correctly after each request.
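With the new export fields, usage breakdowns become a few lines of standard CSV handling. A sketch over a miniature export (the surrounding columns and values are made up; only `routing_mode` and `streamed` come from the changelog):

```python
import csv
import io

# A tiny illustrative CSV export including the new columns.
export = """\
request_id,model,routing_mode,streamed
req_1,rp-small,auto,true
req_2,rp-small,manual,false
req_3,rp-large,auto,true
"""

rows = list(csv.DictReader(io.StringIO(export)))

# Count requests that used auto routing with streaming enabled.
streamed_auto = sum(
    1 for r in rows if r["routing_mode"] == "auto" and r["streamed"] == "true"
)
print(streamed_auto)  # 2
```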