Video analysis AI company Asilla develops industry-specific proprietary VLM 'AsillaVision' using over 7 million proprietary security camera video data points
Asilla has developed 'AsillaVision-v1-4B', a proprietary VLM specialized for security camera footage. Trained on over 7 million videos, it reached 89% accuracy in abnormal-behavior detection on the company's own evaluation dataset, ahead of major general-purpose models such as Gemini.
Published: April 8, 2026 at 19:00
Asilla, Inc. (headquarters: Chiyoda-ku, Tokyo; Representative Director and CEO: Tsuyoshi Onoe; hereinafter 'Asilla'), which develops proprietary video analysis and behavior recognition AI models, has announced 'AsillaVision-v1-4B', a proprietary Vision Language Model (VLM) specialized in detecting abnormal behavior in security camera footage. The model was built using more than 7 million proprietary security camera video data points held by the company.
The model achieved 89% accuracy in identifying abnormal behaviors in real-world environments, such as falls, fights, and skateboard use within facilities. This exceeded the results of major VLMs, including Google Gemini 3.1 Pro (84%), Alibaba Qwen3.5-9B (64%), and NVIDIA Nemotron Nano-12B-v2-VL (61%).
*Comparison based on the company's proprietary evaluation dataset.
## Background of Development
In recent years, rapid advances in Vision Language Model (VLM) technology have made video analysis AI more sophisticated. However, general-purpose VLMs from major tech companies such as Google, OpenAI, and NVIDIA are trained on large-scale internet data and lack specialized knowledge of security camera footage.
Security camera footage stays within the closed networks of individual facilities and is rarely published on the internet. This 'wall of security camera data' is a structural limitation for general-purpose models.
Through 'AI Security asilla', which has been deployed at facilities nationwide, Asilla has continuously accumulated security camera video data (CARD) since 2023. Drawing on this proprietary data, which surpassed a cumulative total of 7 million cases by February 2026, Asilla developed a VLM tailored to the security camera video domain. Data is collected and used with consent from the facilities where the system is installed, after which it is anonymized.
## Features of AsillaVision-v1
### 1. Domain-specific performance exceeding large-scale models
Despite being a lightweight model with only 4 billion (4B) parameters, it outperforms major general-purpose models within its own domain: detecting abnormal behavior in security camera footage. It was more accurate than other leading VLMs at identifying abnormal behaviors in real-world environments, such as falls, fights, skateboard use, and suspicious behavior on escalators.
*Comparison based on identification performance of falls, fights, and skateboard use within facilities.
*The selection of models for comparison was based on representative VLMs available as of February 2026.
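The announcement does not describe how the accuracy figures were computed. As a minimal sketch, assuming "accuracy" means the share of evaluation clips whose predicted label matches the ground-truth label, the calculation could look like the following. The function names, the label set, and the toy data are all hypothetical and are not taken from Asilla's evaluation.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Fraction of clips whose predicted label equals the ground-truth label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("evaluation set is empty")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def per_class_accuracy(predicted, actual):
    """Accuracy broken down by ground-truth label (per-class recall)."""
    totals = Counter(actual)
    hits = Counter(a for p, a in zip(predicted, actual) if p == a)
    return {label: hits[label] / totals[label] for label in totals}


# Toy, made-up labels for eight clips; NOT data from the announcement.
actual = ["fall", "fight", "skateboard", "normal",
          "fall", "fight", "normal", "normal"]
predicted = ["fall", "fight", "normal", "normal",
             "fall", "normal", "normal", "normal"]

overall = accuracy(predicted, actual)
by_class = per_class_accuracy(predicted, actual)
print(f"overall accuracy: {overall:.2f}")
for label, score in by_class.items():
    print(f"{label}: {score:.2f}")
```

A per-class breakdown is included because a single overall figure can hide weak performance on rare behaviors when most clips show nothing unusual.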
### 2. Edge Computing...