I love solving problems with models, both large and small. In my free time I reverse-engineer systems I like.
Indri: UltraFast and realistic TTS model
Lead @ Indrivoice [2024]
Built and open-sourced a 124M-parameter GPT-2-based TTS/ASR model that treats audio as discrete tokens, supports Hindi/English streaming at real-time speed on CPU, and delivers high-quality speech synthesis and recognition in a tiny footprint.
Fraud Detection & Liveness Verification
ML Lead @ Fi [2022]
Blocked sophisticated fraud during onboarding by extracting multi-modal video/audio signals from 8-second liveness videos and training a 1D-conv network to detect spoofing attempts. This resulted in the deactivation of more than 100k user accounts.
Rank 1 @ Kaggle UltraMNIST
Solo Competitor [2022]
Solved the challenge of detecting tiny objects (a few pixels wide) in large (2560×2560) images by combining a detector with a classifier and training on synthetic data, achieving 99.109% accuracy and 1st place on the leaderboard.
Pose Estimation on the Edge
ML Lead @ cult.fit [2020]
Deployed an on-device TensorFlow Lite pose-estimation pipeline for real-time workout feedback, enabling rep counting, form scoring, and a gamified UI without server round-trips. Built 1 MB models for accurate pose detection on complex poses.
Automated Food Recognition
ML Lead @ HealthifyMe [2018]
Solved high dropout from manual food logging by launching a snap-and-log system in two weeks: curating an initial dataset, training a lightweight CNN, and automating meal tracking for diverse Indian dishes.
E-commerce Search
Staff Engineer @ BloomReach [2013]
Automated synonym extraction at scale by mining 100M+ product descriptions and 30M+ queries, combining contextual embeddings and supervised filtering to improve e-commerce search relevance. Granted two patents for the algorithms developed during this work.
EyeMouse: Gaze-Based Cursor Control
Co-founder @ CHI Labs [2011]
Started a company to build hands-free computer control devices. Enabled hands-free cursor control via real-time gaze tracking by modeling head and eye geometry, achieving sub-100 ms latency on modest CPU hardware without heavy CV libraries. Our team won the IBM Innovation webcontest 2011 for innovations in computer vision.