About Me
I'm an undergraduate at IIT Roorkee (Class of '27) focused on AI/ML, generative models, and applied research. I build things end-to-end: from implementing Stable Diffusion and LLaMA 2 from scratch to shipping production MLOps pipelines with DVC, MLflow, Docker, and AWS.
I've contributed to Hugging Face's nanoVLM repository, interned at Trinity College Dublin on MoE-diffusion architectures, and built controllable image generation pipelines at Predis.ai. I'm most interested in the overlap between rigorous implementation and practical deployment.
Achievement
JEE Advanced 2023
All India Rank 6453 out of approximately 1.5 million candidates.
Education
Indian Institute of Technology, Roorkee
B.Tech in Civil Engineering
August 2023 – May 2027
Pursuing engineering fundamentals while independently building depth in AI/ML—covering generative models, MLOps infrastructure, and applied deep learning alongside coursework.
Experience
Artificial Intelligence Research Internship
Trinity College Dublin — Prof. Subrahmanyam Murala
Oct 2025 – Present · Ireland (Remote)
- Designed diffusion transformers with a dynamic token routing mechanism, achieving over 31 dB PSNR.
- Engineered multi-scale feature fusion and progressive decoding, improving generalization across conditions.
- Implemented an evaluation framework tracking gradient flow, token compression, and semantic reconstruction quality.
- Fine-tuned a 33M-parameter MobileNetV3 encoder with reasoning capabilities for an image restoration task.
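A minimal sketch of the dynamic token routing idea, in my own illustrative PyTorch code (not the internship's implementation, and all class and parameter names here are made up): a learned score selects which tokens pass through an expensive block while the rest bypass it via the residual path.

```python
import torch
import torch.nn as nn

class TokenRouter(nn.Module):
    """Routes only the top-k highest-scoring tokens through a heavy block;
    the remaining tokens skip it unchanged (illustrative sketch)."""
    def __init__(self, dim: int, keep_ratio: float = 0.5):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # learned per-token importance score
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor, block: nn.Module) -> torch.Tensor:
        b, n, d = x.shape
        k = max(1, int(n * self.keep_ratio))
        scores = self.score(x).squeeze(-1)           # (b, n)
        idx = scores.topk(k, dim=1).indices          # indices of routed tokens
        gather_idx = idx.unsqueeze(-1).expand(b, k, d)
        routed = torch.gather(x, 1, gather_idx)      # (b, k, d)
        out = x.clone()
        out.scatter_(1, gather_idx, block(routed))   # write processed tokens back
        return out

router = TokenRouter(dim=64, keep_ratio=0.25)
x = torch.randn(2, 16, 64)
y = router(x, nn.Linear(64, 64))
print(y.shape)  # torch.Size([2, 16, 64])
```

Routing only a fraction of tokens through the heavy block is what makes the compute cost adaptive per input.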
Generative AI Internship
Predis.ai
Aug. 2025 · Remote
- Surveyed recent research on controllable image generation to drive design choices for ad-creative systems.
- Evaluated and prototyped Qwen-Image to improve prompt + image conditioning for targeted outputs.
- Designed a conditioned ad-creative pipeline using FLUX and OminiControl to generate advertisement creatives.
- Fine-tuned models with LoRA and developed prompt strategies to increase brand alignment in generated assets.
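The LoRA fine-tuning idea above can be sketched as follows (illustrative only; the actual work used a LoRA library on diffusion-model weights, and the class and parameter names here are my own): a frozen base layer plus a trainable low-rank update.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (LoRA sketch)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: update starts at zero
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(64, 64))
out = layer(torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 64])
```

Because B is zero-initialised, the adapted layer initially reproduces the frozen base exactly; training moves only the small A and B matrices.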
Projects
Stable Diffusion from Scratch
Complete implementation of Stable Diffusion including VAE encoder/decoder, CLIP text encoder, UNet with cross-attention, and classifier-free guidance. Generated 512×512 images from text prompts via DDPM denoising.
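The classifier-free guidance step mentioned above combines an unconditional and a conditional noise prediction at each denoising step; a minimal sketch (function name is illustrative):

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: push the prediction along the conditional
    direction; guidance_scale = 1.0 recovers the conditional output."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_u = np.zeros(4)
eps_c = np.ones(4)
print(cfg_combine(eps_u, eps_c, 7.5))  # [7.5 7.5 7.5 7.5]
```

A scale above 1 trades sample diversity for stronger prompt adherence, which is why values around 7–8 are a common default.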
LLaMA 2 from Scratch
Built LLaMA 2 from the ground up with KV-Cache, rotary position embeddings, grouped-query attention, and top-p sampling. Implemented BPE tokenizer and ran zero-shot generation on custom prompts.
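The top-p (nucleus) sampling step can be sketched like this (illustrative helper, not the project's exact code): keep the smallest set of tokens whose cumulative probability reaches p, zero out the rest, and renormalise before sampling.

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Nucleus filtering: retain the smallest high-probability set whose
    cumulative mass reaches p, then renormalise."""
    order = np.argsort(probs)[::-1]          # tokens sorted by descending prob
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1     # number of tokens kept
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.15, 0.05])
print(top_p_filter(probs, p=0.7))  # keeps the two most probable tokens
```

Unlike top-k, the number of surviving tokens adapts to how peaked the distribution is.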
GPT from Scratch
Transformer-based language model with multi-head self-attention, feed-forward layers, and autoregressive generation. Trained with AdamW; achieved validation loss of ~1.89.
Transformer Implementation
Full Transformer architecture with token/positional embeddings, multi-head attention, and a complete training pipeline. Integrated TensorBoard logging and automated checkpointing.
Vehicle Insurance Prediction
End-to-end MLOps pipeline for binary classification of customer insurance interest. Integrated MongoDB Atlas for data storage, Docker containerisation, AWS S3/ECR for model registry, and automated CI/CD via GitHub Actions.
Water Potability Prediction
- Structured with Cookiecutter scaffolding; reproducibility via DVC pipelines.
- Trained five models: Decision Tree, Random Forest, SVM, XGBoost, and k-NN.
- Tracked experiments with MLflow + DagsHub; logged confusion matrices per run.
- CI via GitHub Actions; feature importance logging included.
Flight Delay Analysis & Prediction
- Analysed 178,747 flight records; engineered features including operational arrival index, delay rate, and cyclical time encoding.
- XGBoost model: 78% accuracy and 0.86 AUC-ROC on 35,000 validation records.
- 81% recall on actual delays; average delay duration predicted within 5.2 min MAE.
- SHAP analysis identified departure hour, weather, and carrier as the top three delay drivers.
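The cyclical time encoding used above maps hours onto the unit circle so that 23:00 and 00:00 come out as neighbours rather than 23 units apart; a minimal sketch:

```python
import numpy as np

def cyclical_encode(hour, period=24):
    """Map an hour-of-day onto the unit circle via sin/cos features."""
    angle = 2 * np.pi * np.asarray(hour) / period
    return np.sin(angle), np.cos(angle)

s23, c23 = cyclical_encode(23)
s0, c0 = cyclical_encode(0)
# Euclidean distance between hour 23 and hour 0 is small, unlike |23 - 0|:
print(np.hypot(s23 - s0, c23 - c0))
```

The same trick applies to day-of-week or month features by changing `period`.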
Credit Default Prediction
- Engineered max delinquency, delinquency streak, and credit utilisation from 29,461 customer records.
- Preprocessing: median imputation, deduplication, 60/40 split, SMOTE for class balancing.
- Ensemble models: 73% accuracy and 0.76 AUC-ROC; 67% recall on true defaulters.
- Deployed on 5,017 test records with a tuned 0.465 threshold for high-risk intervention.
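Applying a tuned decision threshold instead of the default 0.5 can be sketched as follows (the 0.465 value is from the project; the probabilities below are made up for illustration):

```python
import numpy as np

def classify_with_threshold(probs, threshold=0.465):
    """Flag a record as high-risk when its predicted default probability
    clears the tuned threshold."""
    return (np.asarray(probs) >= threshold).astype(int)

probs = np.array([0.12, 0.47, 0.90, 0.30])
print(classify_with_threshold(probs))  # [0 1 1 0]
```

Lowering the threshold below 0.5 trades some precision for higher recall on defaulters, which matches the project's intervention goal.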
Stock Sentiment Analysis
- Python pipeline to ingest, clean, and convert financial news into sentiment features.
- Trained and tuned ML classifiers to predict market direction from news sentiment.
- Backtested with QuantStats on historical price data: 10% annual return, 1.25 Sharpe ratio.
- Scalable backtesting engine processing intraday OHLC data with execution slippage benchmarking.
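The annualised Sharpe ratio reported above is, in essence, the following computation (sketch with synthetic returns; QuantStats derives this and related statistics from the same kind of returns series):

```python
import numpy as np

def annualised_sharpe(daily_returns, risk_free_rate=0.0, periods=252):
    """Annualised Sharpe ratio: mean excess daily return over its standard
    deviation, scaled by sqrt(trading days per year)."""
    excess = np.asarray(daily_returns) - risk_free_rate / periods
    return np.sqrt(periods) * excess.mean() / excess.std(ddof=1)

rng = np.random.default_rng(0)
daily = rng.normal(loc=0.0005, scale=0.01, size=252)  # synthetic return series
print(annualised_sharpe(daily))
```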
Technical Skills
Languages & Frameworks
MLOps & Tools
Data Science
Deep Learning
NLP
Computer Vision
Databases
Advanced Concepts
Open-Source
nanoVLM — Removed Dead lm_eos_token_id Parameter
PR #138 · huggingface/nanoVLM
While reading through config.py in Hugging Face's nanoVLM repository, I found a field that was silently wrong in three distinct ways:
- Factually incorrect: lm_eos_token_id: int = 0 hardcoded EOS as token ID 0, but SmolLM2-360M-Instruct actually uses EOS token ID 2.
- Dead code: the field was never read. All EOS logic in the codebase was already handled via self.tokenizer.eos_token_id, so this config entry never executed.
- Misleading to contributors: its presence implied the EOS token was configurable through config.py, which it wasn't, a subtle trap for anyone extending the model.
I removed the unused field and updated the documentation to reflect that EOS handling is driven exclusively by the tokenizer. This eliminates a source of confusion and reduces the risk of silent misconfiguration in downstream forks.
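The rationale can be illustrated with a toy generation loop where the stop condition reads the tokenizer, not a config field (StubTokenizer and the step function are hypothetical stand-ins, not nanoVLM code):

```python
class StubTokenizer:
    """Stand-in for a real tokenizer; EOS lives here, not in a config."""
    eos_token_id = 2  # SmolLM2-360M-Instruct's actual EOS id

def generate(step_fn, tokenizer, max_new_tokens=10):
    out = []
    for _ in range(max_new_tokens):
        tok = step_fn(out)
        if tok == tokenizer.eos_token_id:  # stop condition reads the tokenizer
            break
        out.append(tok)
    return out

toks = iter([5, 7, 2, 9])
print(generate(lambda _: next(toks), StubTokenizer()))  # [5, 7]
```

With the tokenizer as the single source of truth, a stale or wrong config value simply cannot influence generation.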
Get In Touch
Open to research collaborations, internships, and discussions on AI/ML.