Cheaper, faster, and culturally aware, Avataar’s video AI is built for India’s scale
Key takeaways
- India’s AI model output has been slow compared to the U.S., Europe, and China.
- The Peak XV-backed startup, which focuses on creating video tools for e-commerce, didn’t build Varya from scratch.
- To put that in concrete terms: using an NVIDIA H200 GPU, Varya can generate a 5-second 720p clip in 45 seconds, compared to 1,230 seconds for Wan 2.2.
India’s AI model output has been slow compared to the U.S., Europe, and China. Only a few startups are releasing models, and most of them are large language models or voice models. To encourage more development, the government launched the India AI Mission, a roughly $1.2 billion initiative that, among other things, gives selected startups access to subsidized GPU compute in exchange for releasing their models publicly. One of the 12 startups selected for the program, Avataar AI, has launched a new video model called Varya that is built to understand local context, such as identifying different festivals, food, and clothing.
The Peak XV-backed startup, which focuses on creating video tools for e-commerce, didn’t build Varya from scratch. It started with Wan 2.2, a publicly available video generation model released by Alibaba, and used a technique called distillation, essentially compressing the model’s capabilities into a leaner, faster version optimized for Avataar’s specific use cases. The result is a model that runs in four steps rather than Wan 2.2’s 50, producing video 10 times faster and at a fraction of the cost.
To put that in concrete terms: using an NVIDIA H200 GPU, Varya can generate a 5-second 720p clip in 45 seconds, compared to 1,230 seconds for Wan 2.2.
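For readers who want to check the arithmetic, the short Python sketch below works out the ratios implied by the figures quoted above. The step counts and timings come from the article; the variable names and the `ratio` helper are illustrative assumptions, and nothing here is an independent benchmark.

```python
# Figures as quoted in the article; none of this is independently measured.
WAN_STEPS = 50          # sampling steps reported for Wan 2.2
VARYA_STEPS = 4         # sampling steps reported for Varya
WAN_SECONDS = 1230.0    # 5-second 720p clip on an NVIDIA H200
VARYA_SECONDS = 45.0    # same clip length and resolution, same GPU


def ratio(baseline: float, improved: float) -> float:
    """Return how many times smaller `improved` is than `baseline`."""
    if improved <= 0:
        raise ValueError("improved must be positive")
    return baseline / improved


# Reduction in the number of sampling steps.
step_reduction = ratio(WAN_STEPS, VARYA_STEPS)

# Speedup in end-to-end generation time for the quoted clip.
wall_clock_speedup = ratio(WAN_SECONDS, VARYA_SECONDS)

# Average time spent per sampling step for each model.
seconds_per_step_wan = WAN_SECONDS / WAN_STEPS
seconds_per_step_varya = VARYA_SECONDS / VARYA_STEPS

print(f"step reduction: {step_reduction:.1f}x")
print(f"wall-clock speedup: {wall_clock_speedup:.1f}x")
print(f"seconds per step: Wan 2.2 {seconds_per_step_wan:.2f}, Varya {seconds_per_step_varya:.2f}")
```

On these numbers the step count falls by 12.5x, while the quoted timings work out to roughly a 27x wall-clock speedup, which is more than the "10 times faster" headline figure. The article does not say which workloads or settings sit behind each number, so the gap may simply reflect different test conditions.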