|
Shubham Jain
I am the Director of AI Research at Avataar, where I have spent the past five years building products and doing research in generative models, 3D vision, and neural rendering.
Avataar is a Series B computer vision startup that has raised $55 million from investors including Tiger Global and Sequoia Capital India. It works with global enterprise clients such as Lowe's, Home Depot, Amazon, Samsung, HP, and Victoria's Secret. Avataar is also one of the 12 companies in India funded by the Indian Government to build sovereign AI models.
Email /
LinkedIn /
Twitter /
GitHub
|
|
Key Products & Research:
At Avataar, I have primarily worked on video diffusion models, NeRF-based 3D reconstruction, and neural rendering. Below are the products that I have built, led, and shipped.
|
|
|
Varya: Distilled Video Generative Model
[Varya Link]
[TechCrunch]
[Govt Press Release]
[Times of India]
Varya builds on DMD2 distribution matching, adapts it for bidirectional video generation, and introduces role-aware exit supervision and a decoupled guided field for training. The result is better supervision signals across exits and stronger, more stable guidance, leading to far better quality than other distillation methods. VBench results show quality comparable to the teacher, Wan 2.2, while being 27x more efficient.
At launch, Varya is the lowest-priced video model, at $0.005 per second. It was launched by the IT Secretary of India.
|
|
|
Velocity: Generative videos for shopping
[Product Link]
[TechCrunch]
Video generative models for fashion videos: distilled DiTs with faster inference, fine-tuned for face, garment, and camera consistency.
A responsive auto-layout JavaScript library, combined with VLMs and Python automation (agents), automatically creates videos from product pages.
|
|
|
Slate: SOTA Path Tracer for Web
[Demo link]
[HP live]
A WebGPU- and Rust-based neural path-tracing renderer for accurate and efficient physically based rendering. It supports both web and cross-platform native rendering from the same codebase.
The renderer uses various advanced neural acceleration and material techniques.
Thanks to WebGPU, compute shaders are now available on the web. The renderer delivers high-end product ads for clients at a fraction of the cost.
|
|
|
Incarnate: NeRF Reconstruction and Mobile Rendering
[US Patent]
[AWE Launch Coverage]
[Khronos Coverage]
Builds on academic NeRF research to handle noisy real-world inputs and poses, with real-time NeRF rendering on mobile browsers, NeRF in web AR, editable mesh reconstruction, and more.
Used by organizations such as Crate & Barrel and Lowe’s for their products.
Granted a US patent for the NeRF mobile rendering pipeline.
|
|
|
Creator: 3D creation engine for the web
[Product link]
[Home Depot live]
[Samsung live]
A 3D product storytelling engine for the web, built on three.js and WebGL. Experiences built on this engine have generated over 100M views.
|
Open Source Projects
-
Tiny Specialized Models: Small models with 135M to 1.5B parameters that perform close to large foundation models on specialized reasoning tasks, in this case calculus (differentiation).
-
iNLTK: Trained language models for Indian languages in 2020, before they were cool. Contributed a Telugu language model to the iNLTK repo.
|
Invited Talks
-
AWS GenAI Loft 2024: Infrastructure for vision model training and inference.
-
Samsung SPS Symposium 2024: 3D Vision and Neural Rendering
|
|