OrcaRouter Starts Offering DeepSeek V4 Pro API at 75% Discount, Accelerating Cost Optimization for Enterprise AI Agent Workflows

Q: OrcaRouterで提供開始された新しいAPIの価格は？

DeepSeek V4 Pro APIが入力$0.14/M tokens、出力$0.28/M tokensで提供されます（通常価格の75%割引）。トークン上乗せは0%です。

Q: OrcaRouterの主なコスト削減の仕組みは？

プロンプトの難易度を自動判定し、定型処理（約65%）は安価なオープンモデルへ、複雑な推論（約35%）は高性能なフロンティアモデルへ自動ルーティングします。

Q: どのくらいのコスト削減効果が見込めますか？

月1万ドル規模のLLM利用の場合、年間約47,700ドルの削減が見込めます。DeepSeek V4 Pro利用で従来比75%のコスト削減が可能です。

Q: 既存のシステムへの導入は簡単ですか？

OpenAI互換APIを提供しているため、既存コードの変更は1行のみで済み、即座に導入可能です。

Q: 障害時の対応や信頼性はどうなっていますか？

ミッドストリーム切り替えによる自動フェイルオーバー機能を備え、エージェントループの状態を維持しながら99.99%の稼働率を保証します。

May 26, 2026

FlashLabs Inc. has announced the availability of the DeepSeek V4 Pro API at a 75% discount through 'OrcaRouter', provided by its partner Continuum AI. By automatically routing tasks based on prompt difficulty, it achieves dramatic cost reductions in enterprise AI agent operations while maintaining quality.

新製品NQ 83/100出典：PR Times

📋 Article Processing Timeline

📰 Published: May 26, 2026 at 03:30
🔍 Collected: May 25, 2026 at 19:01
🤖 AI Analyzed: May 26, 2026 at 05:41 (10h 39m after Collected)

FlashLabs, Inc. (Headquarters: Chiyoda-ku, Tokyo; CEO: Yoichi Hosoi) announced that it will begin offering the DeepSeek V4 Pro API at a 75% discount on 'OrcaRouter', provided by its partner Continuum AI. This will reduce the operational costs of enterprise AI agent workflows and accelerate AI adoption by companies.

■ Background and Objectives

The year 2026 marks a turning point for the widespread adoption of enterprise AI agent workflows. According to Gartner's predictions, 40% of enterprise applications will incorporate AI agents by the end of 2026, making AI agent-based workflows the new standard for business automation.

On the other hand, the operational cost of LLMs (Large Language Models) has become a new management challenge for companies, as it continues to grow alongside product expansion. Many companies currently face a situation where they route all processing to high-performance models to ensure quality, thereby continuously paying high unit prices even for routine tasks that do not necessarily require such advanced models.

To address this issue, OrcaRouter has provided a mechanism that reduces LLM expenditure by approximately 40% while maintaining quality. It achieves this by determining the difficulty of each prompt and automatically routing complex reasoning tasks to frontier models and routine tasks to high-performance open models.

Following DeepSeek's official announcement of a 75% discount on the V4 Pro API, OrcaRouter will also begin offering it at the same price, further accelerating the widespread adoption of enterprise AI agent workflows.

■ Overview of OrcaRouter and DeepSeek V4 Pro API 75% Discount

Pricing:

- DeepSeek V4 Pro API: Input $0.14/M tokens, Output $0.28/M tokens (75% discount from the regular price)

- Token Markup: 0% (same as the provider's published price)

Key Features:

- Optimal model automatic routing based on prompt difficulty assessment (<1ms)

- Learning-based routing using LinUCB contextual bandits

- Request-level visibility (assessment basis, model, provider, price)

- 99.99% uptime guarantee through mid-stream switching

- OpenAI-compatible API, implementable with a single line of code

Supported Models:

- DeepSeek V4 Pro API

- Anthropic Claude Opus 4.7 API

- OpenAI GPT 5.5 API

- Over 200 other models provided via a single endpoint

■ Value Provided to Enterprises

1. Dramatic Cost Reduction

For an LLM usage scale of $10,000 per month, an annual cost reduction of approximately $47,700 is achieved (payback period of less than one day). Using DeepSeek V4 Pro enables a 75% cost reduction compared to conventional methods.

2. Optimization While Maintaining Quality

By automatically assessing prompt difficulty, routine tasks (approx. 65%) are processed by open models at about 1/15th of the cost, while advanced reasoning (approx. 35%) is handled by frontier models. This optimizes costs without compromising quality in any way.

3. Complete Transparency

Token billing is strictly at the provider's published price (0% markup). The assessment basis, model, provider, and price are visualized per request, ensuring auditability.

4. Enterprise-Grade Reliability

Automatic failover during provider outages is achieved via mid-stream switching. It guarantees 99.99% uptime while maintaining the state of the agent loop.

5. Immediate Implementation

Thanks to the OpenAI-compatible API, modifying existing code requires only a single line. Both deployment and rollback are low-cost, facilitating a smooth transition from proof-of-concept to production.

■ Support for Enterprise AI Agent Workflows

OrcaRouter is designed and optimized for enterprise AI agent workflows.

Characteristics of Agent Workflows:

- A mix of routine tasks (extraction, classification, formatting, simple summarization, etc.) and advanced processing (multi-step reasoning, long-context handling, code generation, etc.)

- Requires real-time decision-making and adaptation

- Multiple agents operating collaboratively

Optimizations Performed by OrcaRouter:

- Assesses difficulty on a per-prompt basis and automatically selects the optimal model

- Learns from execution results to continuously improve routing accuracy

- Performs failover while maintaining the agent's state

This allows companies to confidently deploy AI agents into production environments, accelerating the journey from business automation to autonomy.

Initiatives for Guardrails and Security

By applying the security controls required for production operations at the gateway before reaching the models, it addresses the security requirements demanded by enterprise production environments.

FAQ

OrcaRouterで提供開始された新しいAPIの価格は？

DeepSeek V4 Pro APIが入力$0.14/M tokens、出力$0.28/M tokensで提供されます（通常価格の75%割引）。トークン上乗せは0%です。

OrcaRouterの主なコスト削減の仕組みは？

プロンプトの難易度を自動判定し、定型処理（約65%）は安価なオープンモデルへ、複雑な推論（約35%）は高性能なフロンティアモデルへ自動ルーティングします。

どのくらいのコスト削減効果が見込めますか？

月1万ドル規模のLLM利用の場合、年間約47,700ドルの削減が見込めます。DeepSeek V4 Pro利用で従来比75%のコスト削減が可能です。

既存のシステムへの導入は簡単ですか？

OpenAI互換APIを提供しているため、既存コードの変更は1行のみで済み、即座に導入可能です。

障害時の対応や信頼性はどうなっていますか？

ミッドストリーム切り替えによる自動フェイルオーバー機能を備え、エージェントループの状態を維持しながら99.99%の稼働率を保証します。

Back to Newsroom (42)