High-Speed, High-Accuracy Specialized AI Achieves Low-Cost On-Premise Operation: Document Comprehension AI Foundation Optimized for Social Implementation Released

Stockmark has released 'Stockmark-DocReasoner-Qwen2.5-VL-32B', a document comprehension AI foundation optimized for social implementation, enabling low-cost, high-accuracy on-premise operation and protecting data sovereignty.
新製品NQ 8/100出典:PR Times

📋 Article Processing Timeline

  • 📰 Published: April 8, 2026 at 20:00
  • 🔍 Collected: April 8, 2026 at 11:31
  • 🤖 AI Analyzed: April 20, 2026 at 17:26 (293h 54m after Collected)
Stockmark Inc. (Headquarters: Minato-ku, Tokyo; CEO: Tatsu Hayashi; hereinafter: 'our company'), which independently develops domestic generative AI foundations and provides business-oriented generative AI services, announced the release of 'Stockmark-DocReasoner-Qwen2.5-VL-32B', a document comprehension foundation model developed with support from the third phase of the 'GENIAC' project, an initiative by the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO) aimed at strengthening Japan's generative AI development capabilities.

The model developed this time utilizes our company's know-how in developing proprietary AI foundation models, achieving the characteristics of hallucination suppression and expertise in Japanese/business within a 32B (32 billion parameter) mid-sized model.

This enables low-cost on-premise operation, which is difficult with general-purpose AIs, and allows for secure processing of highly confidential corporate data without sending it externally. Thus, it is a foundation model optimized for social implementation that achieves a high-speed, high-accuracy generative AI environment while fully protecting companies' 'data sovereignty'.

**Model Release Location**
**Stockmark-DocReasoner-Qwen2.5-VL-32B**
https://huggingface.co/stockmark/Stockmark-DocReasoner-Qwen2.5-VL-32B


**About 'Stockmark-DocReasoner-Qwen2.5-VL-32B'**
'Stockmark-DocReasoner-Qwen2.5-VL-32B' is a multimodal foundation model that has been extensively trained on complex business documents, including charts and images, and highly complex documents characteristic of industries like manufacturing, by adding to the open model 'Qwen2.5-VL-32B-Instruct'.

This model not only exhibits high performance in reading complex documents commonly used in business scenarios but also in reading documents containing specialized knowledge unique to the manufacturing industry, such as chemical formulas and blueprints.

Its performance surpasses that of the 'Qwen2.5-VL-32B-Instruct' adopted as the base model, as well as 'Qwen3-VL-32B-Instruct', which is widely used in Japan. Furthermore, by realizing 'Chain of Thought (CoT)' generation, which produces intermediate thought processes step-by-step rather than outputting the final answer all at once when answering complex questions, generative AI can be reliably utilized in business scenarios.

**Features of 'Stockmark-DocReasoner-Qwen2.5-VL-32B'**
As a feature of this model, we have adopted a strategic size of 32B (32 billion parameters) to break through the dilemma of 'exploding operational costs' faced by conventional large-scale models and the 'accuracy limitations' of lightweight models.

32B and