What is a Multimodal Large Language Model (LLM)?

A Multimodal LLM is an AI technology capable of simultaneously processing multiple types of data, such as text, images, audio, and video. This allows it to perform tasks that involve understanding and integrating information from various sources.

What is the main feature of Ricoh's new model, Qwen3-VL-Ricoh-32B-20260227?

The main feature is its 'reasoning capability,' enabling it to accurately understand diverse documents, including charts and diagrams, through multi-stage reasoning. It can associate information across multiple pages and answer complex questions.

What is the lightweight model being released?

Ricoh is releasing a lightweight model named 'Qwen3-VL-Ricoh-8B-20260227' free of charge. This model utilizes the technologies developed for the main model and is intended for wider accessibility.

Where can I access the lightweight model?

The lightweight model is available on Hugging Face at the following URL: https://huggingface.co/ricoh-ai/Qwen-3-VL-Ricoh-8B-20260227
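For readers who want to fetch the files programmatically, the following is a minimal sketch, not an official Ricoh procedure. It assumes the `huggingface_hub` Python package is installed and that the repository at the URL above can be downloaded without extra steps such as accepting license terms, which the announcement does not state. The helper names `repo_id_from_url` and `download_model` are hypothetical and exist only for this illustration.

```python
from urllib.parse import urlparse

# URL exactly as given in the announcement.
MODEL_URL = "https://huggingface.co/ricoh-ai/Qwen-3-VL-Ricoh-8B-20260227"


def repo_id_from_url(url: str) -> str:
    """Return the 'owner/name' repository ID from a Hugging Face model URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model repository URL: {url}")
    return "/".join(parts[:2])


def download_model(url: str = MODEL_URL) -> str:
    """Download a snapshot of the repository and return the local directory.

    Assumption: the repository is publicly downloadable. The import is done
    lazily so that repo_id_from_url works without huggingface_hub installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_from_url(url))


if __name__ == "__main__":
    print(repo_id_from_url(MODEL_URL))
```

Calling `download_model()` starts a large download, so the example only prints the repository ID by default.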

What is GENIAC?

GENIAC (Generative AI Accelerator Challenge) is a project led by the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO) in Japan, aimed at strengthening domestic generative AI development capabilities.

Ricoh Develops Multimodal Large Language Model with Reasoning Capabilities in GENIAC Phase 3

Published Mar 30, 2026 8:10 PM ・ Updated Jun 2, 2026 1:01 PM ・ 8 min read ・ Source: PR TIMES

Ricoh has developed a multimodal LLM for the GENIAC project that excels at understanding charts and diagrams. The company is releasing a lightweight model free of charge and plans to release a benchmark tool later.

Ricoh Company, Ltd. (President and Representative Director: Akira Oyama) today announced that it has completed development of "Qwen3-VL-Ricoh-32B-20260227," a multimodal large language model (LLM) with reasoning capabilities. Developed as part of the third phase of "GENIAC (Generative AI Accelerator Challenge)," a project led by the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO) to strengthen domestic generative AI development capabilities, the model is designed to accurately read diverse documents, including charts and diagrams. Its key feature is the ability to understand complex documents through multi-stage reasoning.

Starting today, Ricoh is also releasing a lightweight model, "Qwen3-VL-Ricoh-8B-20260227," free of charge. It uses the technologies applied in developing the main model. Ricoh's proprietary benchmark tool*3, designed specifically for evaluating reasoning performance, is scheduled for future release.

Release location: https://huggingface.co/ricoh-ai/Qwen-3-VL-Ricoh-8B-20260227

1. Background and Societal Challenges of the Initiative

LMMs (Large Multimodal Models) are AI technologies capable of simultaneously processing multiple types of data, including text, images, audio, and video. Because they can handle a wide range of data formats, expectations for them are high: they perform well on tasks such as summarizing the text in screenshots and answering questions that involve charts and diagrams.

Companies accumulate diverse documents, including transactional records such as invoices and receipts, management materials such as business strategies and plans, service manuals, and internal technical standards and quality control criteria. These documents contain not only text but also figures, tables, and images, and companies are expected to use them efficiently to generate new value and innovation. However, problems have been pointed out, such as "text searches not yielding intended results" and "difficulty in fully utilizing documents with search functions alone."

In recent years, companies have also faced growing pressure to address management challenges such as working efficiently with a declining workforce, transferring skills as veteran employees retire, and providing documents in multiple languages as the number of foreign workers increases. Against this backdrop, the need to use AI to make efficient use of corporate knowledge is increasing.

In the second phase of GENIAC, which began in August 2024, Ricoh developed a 70-billion-parameter LLM and released its base model and a proprietary benchmark tool free of charge. In January 2026, Ricoh also developed a compact 32-billion-parameter LLM based on "Qwen2.5-VL-32B-Instruct," a model in Alibaba Cloud's LLM family.

2. This Achievement

In the third phase, Ricoh used "Qwen3-VL-32B-Instruct*4" as the basis for "Qwen3-VL-Ricoh-32B-20260227," the base model for a reasoning LLM capable of understanding complex documents with high accuracy through multi-stage reasoning. Trained with techniques such as reinforcement learning*5 and curriculum learning*6, the model can associate and understand figures and tables that span multiple pages, and it can generate highly accurate answers even to questions that are difficult to read and interpret.

For reinforcement learning, Ricoh set a unique reward function to improve learning efficiency while suppressing overfitting. For curriculum learning, it optimized the difficulty settings and the learning pace. As a result, Ricoh has confirmed benchmark results comparable to those of large commercial models such as "Gemini2.5-Pro" (as of February 17, 2026). Ricoh developed its own benchmark tool to evaluate the model's reasoning performance.
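The announcement does not describe how the curriculum was built, so the following is only a generic sketch of what "difficulty settings" and "learning pace" usually mean in curriculum learning, not Ricoh's method. It assumes each training sample carries a hypothetical difficulty score between 0 and 1, and it uses a simple linear pacing rule. The names `Sample`, `difficulty_cap`, and `curriculum_pool` and the parameters `start` and `pace` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    question: str
    difficulty: float  # hypothetical score in [0, 1]; higher means harder


def difficulty_cap(step: int, total_steps: int,
                   start: float = 0.3, pace: float = 1.0) -> float:
    """Maximum difficulty admitted at a training step (linear pacing).

    `start` is the cap at step 0; `pace` > 1 reaches full difficulty sooner.
    """
    progress = min(1.0, pace * step / max(1, total_steps))
    # Interpolate from `start` (progress 0) to 1.0 (progress 1).
    return (1.0 - progress) * start + progress * 1.0


def curriculum_pool(samples: list[Sample], step: int, total_steps: int,
                    start: float = 0.3, pace: float = 1.0) -> list[Sample]:
    """Samples eligible for training at this step: easy first, harder later."""
    cap = difficulty_cap(step, total_steps, start, pace)
    return [s for s in samples if s.difficulty <= cap]


if __name__ == "__main__":
    data = [
        Sample("Read one value from a table", 0.2),
        Sample("Compare two charts on one page", 0.5),
        Sample("Link a figure and a table across pages", 0.9),
    ]
    for step in (0, 5, 10):
        pool = curriculum_pool(data, step, total_steps=10)
        print(step, [s.question for s in pool])
```

With this rule the training pool starts with the easiest questions only and grows until every sample is included at the final step. Raising `pace` widens the pool faster.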

FACT BOX

Source: PR TIMES
Category: New Product
Products / services: Qwen3-VL-Ricoh-32B-20260227 / Qwen3-VL-Ricoh-8B-20260227
