Shubham Jain

I work as the Director, AI and Engineering at Avataar, where I have spent the past five years building products in generative models, 3D vision, neural rendering, and large-scale ML systems.

Avataar is a Series B computer vision startup that has raised $55 million in capital from investors such as Tiger Global and Sequoia Capital. Avataar works with global enterprise clients including Lowe's, Home Depot, Amazon, Samsung, HP, and Victoria's Secret.

Email  /  LinkedIn  /  Twitter  /  GitHub


Key Products & Research:

At Avataar, I have built end-to-end vision ML systems: NeRF-based 3D reconstruction, neural rendering on low-end devices, video diffusion model training and distillation, optimized inference, and backend systems for large-scale serving. Below are the products that I have built, led, and shipped.

Velocity: Generative videos for shopping

[Product Link] [TechCrunch]

Generative video models for fashion videos. A diffusion-transformer model trained for face and garment consistency and camera motion.

A responsive auto-layout JavaScript library, combined with VLMs and Python automation (agents?) to automatically create videos from product pages.

Slate: SOTA Path Tracer for Web

[Demo link] [HP live]

A WebGPU- and Rust-based neural path-tracing renderer for accurate and efficient physically based rendering. It supports both web and cross-platform native rendering from the same codebase. The renderer uses various advanced neural acceleration and material techniques.

Thanks to WebGPU, compute shaders are now available on the web. The renderer delivers high-end product ads at a fraction of the cost for clients.

Incarnate: NeRF Reconstruction and Mobile Rendering

[US Patent] [AWE Launch Coverage] [Khronos Coverage]

Built on academic NeRF research and extended to handle noisy real-world inputs and poses, real-time NeRF rendering on mobile browsers, NeRF in web AR, and editable mesh reconstruction. Used by organizations such as Crate&Barrel and Lowe’s for their products.

Granted a US patent for the NeRF mobile rendering pipeline.

Creator: 3D creation engine for the web

[Product link] [Home Depot live] [Samsung live]

A 3D product storytelling engine for the web, built on three.js and WebGL. Experiences built on this engine have generated over 100M views.

Open Source Projects

  • Tiny Specialized Models: Small models (135M to 1.5B parameters) that perform close to large foundation models on specialized reasoning tasks, in this case calculus (differentiation).

  • iNLTK: Trained language models for Indian languages in 2020, before they were cool. Contributed a Telugu language model to the iNLTK repo.

  • vDiff: Inference engine for diffusion models. Will be open-sourced soon.

Invited Talks

  • AWS GenAI Loft 2024: Infrastructure for vision model training and inference

  • Samsung SPS Symposium 2024: 3D Vision and Neural Rendering