What is included in the new Qlean Dataset?

It includes 72 videos, shot in a frontal bust-up (chest-up) framing, of young Japanese speakers simulating online interviews. The videos cover both free-talk self-PR and scripted readings.

What kind of AI development can it be used for?

It is suited to improving Automatic Speech Recognition (ASR) accuracy, training multimodal models that analyze non-verbal cues such as gaze and facial expression, and developing Text-to-Speech (TTS) systems.

Can it be used commercially?

Yes. Consent has been obtained from all subjects, and it is provided as commercially usable data with cleared rights and no legal risks.

Qlean Dataset Launches "Japanese New Graduate Self-PR Video Dataset"

Published Apr 3, 2026 6:01 PM ・ Updated Jun 2, 2026 1:01 PM ・ 13 min read ・ Source: PR TIMES

Visual Bank launches a self-PR video dataset featuring young Japanese candidates, optimized for developing multimodal AI in HR tech and communication analysis via its Qlean Dataset service.

Visual Bank Inc. (Minato-ku, Tokyo; Masayuki Nagai, Representative Director and CEO) has begun offering the "Japanese New Graduate Self-PR Video Dataset" through its AI training data solution "Qlean Dataset," operated by its subsidiary amanaimages inc. This dataset is optimized for extracting dynamic human characteristics in job hunting scenarios and training advanced multimodal analysis models.

The dataset consists of video data and metadata that closely reproduce today's hiring process, in which online interviews and video screenings have become widespread. Young Japanese speakers, playing the role of new-graduate job seekers, talk to the camera about their strengths and personal experiences. The videos use a frontal bust-up (chest-up) framing, which is common in online interviews, so the visual and audio information is close to that of an actual selection environment.

The recordings come in two formats: free talk, which readily reflects the speaker's emotions and intonation, and a reading of a specified script, in which the spoken content is fixed. This combination makes the dataset suitable not only for improving the accuracy of Automatic Speech Recognition (ASR) but also for analyzing qualitative aspects of communication, such as speech fluency and non-verbal cues like changes in gaze and facial expression. In addition, because additional recordings can be commissioned with specific models assigned, the dataset can flexibly accommodate specialized needs in the voice and language domain, such as expanding voice data for specific speaker attributes or securing long-form speech data.
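Because the scripted readings have fixed spoken content, the script can act as a reference transcript when scoring an ASR system. The following is a minimal sketch of that idea, not a method described in the release. It assumes the script text is available alongside each video (the release does not say how it is distributed) and uses character error rate (CER), a common metric for Japanese, where words are not separated by spaces. The function names `edit_distance` and `character_error_rate` are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference character missing from hypothesis
                curr[j - 1] + 1,     # extra character in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length, ignoring whitespace."""
    ref = "".join(reference.split())
    hyp = "".join(hypothesis.split())
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

In use, `reference` would be the script a speaker was asked to read and `hypothesis` the ASR output for that clip. Free-talk clips have no fixed script, so they would need manual transcription before the same scoring applies.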

This dataset is offered as part of "AI Data Recipes," Qlean Dataset's original data lineup for AI development. It is intended to support AI projects headed for real-world deployment, from next-generation hiring support AI to educational and training products for selection preparation. Visual Bank and amanaimages will continue to support the research and development of AI that accurately understands and analyzes human behavior by providing structured data that captures a wide variety of scenes in Japan.

Overview of the "Japanese New Graduate Self-PR Video Dataset"

- Data Type: Video
- Subject Attributes: Japanese (young demographic assuming new graduate job seekers), gender information included
- Data Size: 5,764.40 MB
- Number of Data Entries: 72
- Data Format: mp4
- Recording Time: Approximately 1 minute per video
- Recording Environment:
  - Angle: Bust-up (frontal) assuming an online interview
  - Variation: Free-talk format, and reading format of a specified script
- Others: Gender and "script present/absent" flags provided as meta-information in a list format.
- Sample Page: https://qleandataset.visual-bank.co.jp/lineup/ds-048
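For storage and bandwidth planning, the figures in the overview can be turned into rough per-video estimates. This is a back-of-the-envelope sketch only. It assumes every clip runs exactly 60 seconds (the release says "approximately 1 minute") and that "MB" means 10^6 bytes, neither of which the release confirms.

```python
# Figures quoted in the dataset overview.
TOTAL_SIZE_MB = 5764.40
NUM_VIDEOS = 72

# Assumptions (not stated in the release): each clip runs exactly 60 s,
# and "MB" means 10**6 bytes.
ASSUMED_SECONDS_PER_VIDEO = 60.0


def average_size_mb(total_mb: float, count: int) -> float:
    """Mean file size per video in megabytes."""
    return total_mb / count


def average_bitrate_mbps(size_mb: float, seconds: float) -> float:
    """Combined audio+video bitrate in megabits per second."""
    return size_mb * 8 / seconds


avg_mb = average_size_mb(TOTAL_SIZE_MB, NUM_VIDEOS)
avg_mbps = average_bitrate_mbps(avg_mb, ASSUMED_SECONDS_PER_VIDEO)
total_minutes = NUM_VIDEOS * ASSUMED_SECONDS_PER_VIDEO / 60

print(f"average size per video: {avg_mb:.2f} MB")
print(f"average combined bitrate: {avg_mbps:.2f} Mbit/s")
print(f"approximate total footage: {total_minutes:.0f} minutes")
```

Under these assumptions, each clip averages roughly 80 MB, which corresponds to about 10.7 Mbit/s, and the full set amounts to a little over an hour of footage. Actual values will vary with the true clip lengths.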

Use Case Images for the "Japanese New Graduate Self-PR Video Dataset"

### [Research Applications]

- Construction of non-verbal communication analysis models
  Can be used for research on multimodal analysis analyzing how psychological states like tension and confidence affect facial expressions, gaze movement, and speech pitch in evaluative situations such as job hunting.

### [Industrial Applications]

- Development of video screening support algorithms in HR Tech
  Can be used to train feature extraction models that transcribe candidates' speech content and index facial brightness and gaze fixation for AI-powered video interview screening functions.
- Development of Text-to-Speech (TTS) and voice conversion models in specific situations
  Leveraging the tense speaking environment of self-PR, it can be utilized as base data for training voice generation AI to reproduce specific emotions or tension levels, or for additional recordings specialized in specific tones.
- Verification of virtual backgrounds and lighting correction for web conferencing systems
  Can be used to evaluate the accuracy of image quality improvement algorithms that naturally correct skin texture and extract human outlines (segmentation) in bust-up compositions unique to online interviews.

About "Qlean Dataset"

"Qlean Dataset" is a commercially viable AI training data solution provided by amanaimages inc., a subsidiary of Visual Bank.

It supports a range of data formats, including images, video, audio, 3D, and text, and provides an environment in which data can be used safely for both research and commercial purposes. In addition, through collaborations with domestic and overseas data holders and with media such as radio stations, newspapers, and news agencies, the "AI Data Recipe" lineup is continuously expanded to reflect industry-specific needs and the latest trends.

Qlean Dataset reduces the burden of data collection and preparation in AI development and supports building development environments with cleared rights and no legal risks.

- Qlean Dataset Site: https://qleandataset.visual-bank.co.jp/
- AI Data Recipe: https://qleandataset.visual-bank.co.jp/lineup

Features of the "AI Data Recipe" datasets provided by "Qlean Dataset":

- Consent obtained from all subjects
- Existing data can be delivered in as little as one day
- Supports original data construction through custom shooting, recording, and collection

### Visual Bank Inc.

Visual Bank is a startup that builds and provides next-generation data infrastructure to maximize AI development capabilities, operating under the mission to "unleash the potential of all data." Its group includes "THE PEN," an AI assistant tool supporting manga artists who "want to draw more!", and the wholly owned subsidiary amanaimages inc., which provides the AI training dataset development service "Qlean Dataset." Visual Bank has also been selected for the national research and development program "GENIAC" and is accelerating its efforts toward real-world deployment.

- Representative Director and CEO: Masayuki Nagai
- Location: C-Cube Minamiaoyama Bldg 6F, 7-1-7 Minamiaoyama, Minato-ku, Tokyo 107-0062
- Visual Bank Corporate URL: https://visual-bank.co.jp/
- amanaimages Corporate URL: https://amanaimages.com/about/
