ZOZO Research Paper Accepted at SIGIR 2026, a Top Conference in Information Retrieval and Recommender Systems
ZOZO Research, the R&D arm of ZOZO NEXT, announced that its paper on an evaluation benchmark for vision-language models in fashion e-commerce has been accepted at SIGIR 2026. The study stresses the importance of selecting and operating the right AI model for each practical fashion e-commerce task.
Published: May 20, 2026 at 21:00
ZOZO Research, the research and development organization of ZOZO NEXT (HQ: Chiba, CEO: Kotaro Sawada), announced that a paper authored by its researchers, "Preliminary Study of an Evaluation Benchmark for Vision–Language Models in Fashion E-Commerce," has been accepted to the Industry Track of SIGIR 2026, a top-tier conference in information retrieval and recommender systems. The research was conducted by a group including Ryotaro Shimizu and Towncan Sai of ZOZO Research, and Shion Sakurai and Yuki Shimizu of ZOZO Inc.
In recent years, vision-language models (VLMs), which handle both images and text, have advanced rapidly, and they are expected to help organize and search product information in fashion e-commerce. However, models differ in their strengths and weaknesses from task to task, so choosing the right model for each use case is essential, and that in turn requires benchmarks that measure performance. Existing benchmarks have struggled to evaluate fashion-specific details, such as the colors, materials, and styles shown in images, as well as the accuracy required for practical business operations.
This study examines how VLMs should be evaluated and selected for practical use in fashion e-commerce. The researchers defined five tasks (tagging, color determination, image quality checking, seasonal determination, and material determination), divided across coordination (outfit) images and product images. They also introduced a mechanism for measuring the stability of a model's outputs and compared multiple models. The results showed that the best model differs by task and that a high-performance model is not always the best choice. The study concludes that continuous evaluation and application-specific model selection are crucial.
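To make the idea of per-task selection concrete, the following is a minimal Python sketch, not the method from the paper. The announcement does not describe the benchmark's metrics, so the accuracy and stability definitions here are assumptions (stability is taken to mean agreement across repeated runs on the same items), and the model names, labels, and outputs are hypothetical toy data rather than results from the study.

```python
from collections import Counter

# Hypothetical gold labels for two of the five tasks (toy data, not from the paper).
GOLD = {
    "color": ["red", "blue", "black"],
    "material": ["cotton", "wool", "denim"],
}

# Hypothetical outputs: three repeated runs per model and task on the same items.
RUNS = {
    "model_a": {
        "color": [["red", "blue", "black"]] * 3,
        "material": [
            ["cotton", "silk", "denim"],
            ["linen", "wool", "denim"],
            ["cotton", "silk", "nylon"],
        ],
    },
    "model_b": {
        "color": [
            ["red", "navy", "black"],
            ["red", "blue", "black"],
            ["red", "navy", "black"],
        ],
        "material": [["cotton", "wool", "denim"]] * 3,
    },
}


def accuracy(runs, gold):
    """Fraction of correct answers, pooled over all runs."""
    correct = sum(pred == g for run in runs for pred, g in zip(run, gold))
    return correct / (len(runs) * len(gold))


def stability(runs):
    """Assumed definition: mean share of runs agreeing with the most common answer per item."""
    n_items = len(runs[0])
    total = 0.0
    for i in range(n_items):
        answers = [run[i] for run in runs]
        total += Counter(answers).most_common(1)[0][1] / len(answers)
    return total / n_items


def best_model_per_task(runs_by_model, gold_by_task):
    """Pick, for each task, the model with the highest (accuracy, stability) pair."""
    best = {}
    for task, gold in gold_by_task.items():
        scores = {
            model: (accuracy(runs[task], gold), stability(runs[task]))
            for model, runs in runs_by_model.items()
        }
        best[task] = max(scores, key=scores.get)
    return best


if __name__ == "__main__":
    for task, model in best_model_per_task(RUNS, GOLD).items():
        print(f"{task}: {model}")
```

In this toy data, one model is selected for the color task and the other for the material task, which illustrates why a single overall ranking may not be enough. Ranking by accuracy first and stability second is only one possible design choice; a production setup might weight the two differently or add cost and latency.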
This research builds an evaluation benchmark for practical tasks in fashion e-commerce and highlights the importance of model selection. Going forward, the team will strengthen collaboration with ZOZO's business divisions and reinforce its evaluation infrastructure. It will continue to improve the accuracy and coverage of its evaluation methods and extend them to new tasks, building a framework that supports the implementation and operation of AI models.
FAQ
What problem does this research solve?
It makes it possible to evaluate AI performance on fashion-specific tasks such as material and style recognition, which are difficult to assess with traditional benchmarks, and so supports better model selection.
Why is model selection crucial?
AI models vary in capability from task to task, and a high-performance model is not always the best choice in practice, so continuous evaluation is essential.
What is SIGIR?
SIGIR, the International ACM SIGIR Conference on Research and Development in Information Retrieval, is a top-tier conference in information retrieval and recommender systems.