AI inside Corporation (CEO: Taku Watakuchi, Headquarters: Minato-ku, Tokyo, hereinafter "AI inside") has developed a Full-Duplex voice interaction model that simultaneously processes human conversation and task execution.

This work was carried out under the research theme "Research and Development of a Consistent Japanese Full-Duplex Speech Multimodal LLM," which was adopted for GENIAC (Generative AI Accelerator Challenge), a project run by the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO) to strengthen generative AI development capabilities in Japan.

## Technical Features of the Full-Duplex Voice Interaction Model

### ① Simultaneous Processing of Dialogue and Task Execution: Full-Duplex Voice Interaction

This model supports Full-Duplex voice interaction: it can capture user intent mid-utterance and immediately start response generation and task processing. Conventional voice AIs start processing only after the utterance is complete, whereas this model proceeds with processing during the utterance. This enables real-time conversational responses.
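The difference between waiting for the end of an utterance and processing during it can be illustrated with a toy sketch. This is not AI inside's implementation: the keyword table, `detect_intent`, `turn_based`, and `full_duplex` are hypothetical names introduced only for illustration, text chunks stand in for streamed audio, and the sketch covers only the "start processing mid-utterance" aspect, not simultaneous speaking and listening.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical keyword-to-intent table; a real model would infer intent from audio.
INTENT_KEYWORDS = {"weather": "get_weather", "book": "book_trip"}


def detect_intent(partial_text: str) -> Optional[str]:
    """Return the first intent whose keyword appears in the text heard so far."""
    for keyword, intent in INTENT_KEYWORDS.items():
        if keyword in partial_text:
            return intent
    return None


def turn_based(chunks: Iterable[str]) -> Tuple[Optional[str], int]:
    """Conventional flow: consume the whole utterance, then detect intent."""
    heard: List[str] = list(chunks)
    return detect_intent(" ".join(heard)), len(heard)


def full_duplex(chunks: Iterable[str]) -> Tuple[Optional[str], int]:
    """Streaming flow: re-check intent after every chunk and act on the first hit."""
    heard: List[str] = []
    for chunk in chunks:
        heard.append(chunk)
        intent = detect_intent(" ".join(heard))
        if intent is not None:
            # Processing can start here, while the user is still speaking.
            return intent, len(heard)
    return None, len(heard)


# Each string stands in for one chunk of a streamed utterance.
utterance = ["could you", "check the weather", "in Tokyo", "for tomorrow please"]
print(turn_based(utterance))   # ('get_weather', 4)
print(full_duplex(utterance))  # ('get_weather', 2)
```

Both functions return the detected intent together with the number of chunks consumed before acting. In this toy example the streaming version acts after two of four chunks, while the turn-based version waits for all four.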

**Casual Conversation** Responds by instantly changing utterance content according to the flow of the conversation.

**Work Consultation** Generates non-verbal expressions such as laughter in real time, in addition to confirmation responses.

**Travel Consultation** Maintains calm dialogue by naturally controlling the timing and intensity of interjections.

### ② Image Understanding for Recognizing Present Information

A single model now processes images, audio, and text together. In evaluations of describing image content in Japanese, it showed approximately 6.1 times higher explanation accuracy than Qwen3-8B-VL.

FACT BOX

  • Source: PR TIMES
  • Category: New Product
  • Organizations: NEDO
  • Products / services: GENIAC