FlashLabs Launches OrcaRouter Monthly Plan to Reduce AI Costs by Up to 40% with 200+ Models

FlashLabs has launched a monthly plan for OrcaRouter, an adaptive inference gateway that reduces AI costs by up to 40%. It intelligently routes prompts among 200+ LLMs like Claude Opus 4.8 and GPT-5.5 Pro.
新製品NQ 88/100出典:PR Times

📋 Article Processing Timeline

  • 📰 Published: June 4, 2026 at 04:40
  • 🔍 Collected: June 3, 2026 at 19:55
  • 🤖 AI Analyzed: June 4, 2026 at 00:29 (4h 33m after Collected)
## OrcaRouter Monthly Plan Launched

FlashLabs (HQ: Chiyoda-ku, Tokyo; CEO: Yoichi Hosoi) launched the monthly plan for its adaptive inference gateway, "OrcaRouter," in the Japanese market on June 3, 2026. The service maintains flagship-level output quality while reducing AI costs by up to 40%.

Through a monthly subscription, users gain access to over 200 LLM models—including Claude Opus 4.8 API, OpenAI GPT-5.5 Pro API, and Gemini 3.5 API—with up to a 10% bonus credit.

### Background and Objectives

While the enterprise AI market is expanding in 2026, soaring AI costs remain a critical challenge. Companies are overspending by directing all prompts to expensive frontier models. Furthermore, building custom routing solutions creates a maintenance burden with every new model release. As of 2026, 37% of enterprises use five or more models in production, marking a shift in the AI routing market from simple cost-cutting to "intelligent routing per prompt."

### Overview of OrcaRouter

OrcaRouter is a next-generation AI gateway that routes requests based on quality assessment.

- **Adaptive Routing**: Assesses prompt complexity to route advanced inference to frontier models and routine processing to open models.
- **Support for 200+ Models**: Includes Claude Opus 4.8, GPT-5.5 Pro, Gemini 3.5, DeepSeek V4 Pro, Qwen3.6-plus, and Kimi K2.6.
- **LinUCB Contextual Bandit**: Learns from request results to automatically reduce routing to underperforming models.
- **Routing Latency <1ms**: High-speed assessment that preserves user experience.
- **Full Visibility**: Records judgment results, models, and pricing per request.

### Value to Enterprises

1. **Reduce AI Costs by Approximately 40%**
By routing routine tasks (approx. 65% of total) to open models that cost 1/15th the price of frontier models, enterprises can achieve annual net savings of approximately $47,700.

2. **Transparent Pricing**
Token billing matches provider pricing (0% markup) with 0% routing fees. Monthly plans automatically provide up to 10% bonus credits.

3. **One-Line Integration**
Compatible with OpenAI APIs, integration requires only updating the base_url in existing code. It integrates directly into workflows like Cursor, Cline, and LangChain.

FAQ

OrcaRouterとはどのようなサービスですか?

プロンプトの難易度に応じて最適なLLMを自動的に振り分ける、適応型推論ゲートウェイです。フロンティアモデルとオープンモデルを使い分けることで、品質を維持しながらコストを削減します。

OrcaRouterで利用可能なモデルは?

Claude Opus 4.8、OpenAI GPT-5.5 Pro、Gemini 3.5、DeepSeek V4 Pro、Qwen3.6-plus、Kimi K2.6など、200以上のLLMに対応しています。

導入にはどれくらいの工数がかかりますか?

OpenAI互換のAPIを採用しているため、既存アプリケーションのbase_urlを変更するだけで導入可能です。1行で導入できる設計となっています。

コスト削減の仕組みは?

定型処理(全体の約65%)を低コストなオープンモデルへ、高度な推論(全体の約35%)をフロンティアモデルへ自動ルーティングすることで、年間で約$47,700のコスト削減を見込めます。

月額プランの特典はありますか?

最大10%のボーナスクレジットが毎月自動付与され、フロンティアモデルを実質的に最大10%割引で利用できます。