About Me

I'm an undergraduate at IIT Roorkee (Class of '27) focused on AI/ML, generative models, and applied research. I build things end-to-end: from implementing Stable Diffusion and LLaMA 2 from scratch to shipping production MLOps pipelines with DVC, MLflow, Docker, and AWS.

I've contributed to Hugging Face's nanoVLM repository, interned at Trinity College Dublin on MoE-diffusion architectures, and built controllable image generation pipelines at Predis.ai. I'm most interested in the overlap between rigorous implementation and practical deployment.

Achievement

JEE Advanced 2023

All India Rank 6453 out of approximately 1.5 million candidates.

Education

Indian Institute of Technology, Roorkee

B.Tech in Civil Engineering

August 2023 – May 2027

Pursuing engineering fundamentals while independently building depth in AI/ML—covering generative models, MLOps infrastructure, and applied deep learning alongside coursework.

Experience

Artificial Intelligence Research Internship

Trinity College Dublin — Prof. Subrahmanyam Murala

Oct 2025 – Present · Ireland (Remote)

  • Designed diffusion transformers with a dynamic token-routing mechanism, achieving PSNR above 31 dB.
  • Engineered multi-scale feature fusion and progressive decoding, improving generalization across conditions.
  • Implemented an evaluation framework tracking gradient flow, token compression, and semantic reconstruction quality.
  • Fine-tuned a 33M-parameter MobileNetV3 encoder for an image restoration task with reasoning capabilities.

Generative AI Internship

Predis.ai

Aug. 2025 · Remote

  • Surveyed recent research on controllable image generation to drive design choices for ad-creative systems.
  • Evaluated and prototyped Qwen-Image to improve prompt + image conditioning for targeted outputs.
  • Designed a conditioned ad-creative pipeline using FLUX and OminiControl to generate advertisements.
  • Fine-tuned models with LoRA and developed prompt strategies to increase brand alignment in generated assets.

Projects

Stable Diffusion from Scratch

Complete implementation of Stable Diffusion including VAE encoder/decoder, CLIP text encoder, UNet with cross-attention, and classifier-free guidance. Generated 512×512 images from text prompts via DDPM denoising.
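The classifier-free guidance step mentioned above can be sketched in a few lines (a minimal, framework-agnostic version; the function name is mine, not the implementation's):

```python
def cfg_noise(eps_cond, eps_uncond, scale):
    """Classifier-free guidance: extrapolate from the unconditional
    noise prediction toward the conditional one by `scale`.
    Works on plain floats or torch tensors alike."""
    return eps_uncond + scale * (eps_cond - eps_uncond)
```

At scale 1.0 this reduces to the conditional prediction; scales above 1 strengthen prompt adherence at the cost of sample diversity.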

PyTorch CLIP VAE UNet DDPM

LLaMA 2 from Scratch

Built LLaMA 2 from the ground up with KV-Cache, rotary position embeddings, grouped-query attention, and top-p sampling. Implemented BPE tokenizer and ran zero-shot generation on custom prompts.
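The top-p (nucleus) sampling step can be sketched in plain Python (a simplified, unbatched version; names are illustrative, not from the implementation):

```python
import math
import random

def top_p_sample(logits, p=0.9, rng=random):
    # Softmax over raw logits (max-subtracted for numerical stability).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort token indices by probability, descending.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    # Keep the smallest prefix whose cumulative mass reaches p.
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    # Renormalise over the kept set and sample from it.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Unlike top-k, the number of candidate tokens adapts to the shape of the distribution: a peaked distribution keeps one or two tokens, a flat one keeps many.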

PyTorch Transformers BPE KV-Cache RoPE

GPT from Scratch

Transformer-based language model with multi-head self-attention, feed-forward layers, and autoregressive generation. Trained with AdamW; achieved validation loss of ~1.89.
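The core mechanism above can be sketched as a single causal attention head (a minimal version; function and weight names are illustrative, not from the project):

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x, w_q, w_k, w_v):
    # x: (T, d) token embeddings; w_*: (d, d) projection matrices.
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Scaled dot-product attention scores.
    scores = (q @ k.T) / d ** 0.5
    # Causal mask: position t may only attend to positions <= t.
    mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```

Multi-head attention runs several such heads in parallel on split projections and concatenates the results.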

PyTorch Transformers AdamW Self-Attention

Transformer Implementation

Full Transformer architecture with token/positional embeddings, multi-head attention, and a complete training pipeline. Integrated TensorBoard logging and automated checkpointing.

PyTorch Attention TensorBoard

Vehicle Insurance Prediction

End-to-end MLOps pipeline for binary classification of customer insurance interest. Integrated MongoDB Atlas for data storage, Docker containerisation, AWS S3/ECR for model registry, and automated CI/CD via GitHub Actions.

FastAPI MongoDB Docker AWS S3/ECR MLflow DVC CI/CD

Water Potability Prediction

  • Structured with Cookiecutter scaffolding; reproducibility via DVC pipelines.
  • Trained five models: Decision Tree, Random Forest, SVM, XGBoost, and k-NN.
  • Tracked experiments with MLflow + DagsHub; logged confusion matrices per run.
  • CI via GitHub Actions; feature importance logging included.

DVC MLflow DagsHub GitHub Actions XGBoost CI/CD

Flight Delay Analysis & Prediction

  • Analysed 178,747 flight records; engineered features including operational arrival index, delay rate, and cyclical time encoding.
  • XGBoost model: 78% accuracy and 0.86 AUC-ROC on 35,000 validation records.
  • 81% recall on actual delays; average delay duration predicted within 5.2 min MAE.
  • SHAP analysis identified departure hour, weather, and carrier as the top three delay drivers.

XGBoost SHAP Feature Engineering Scikit-learn
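The cyclical time encoding mentioned above maps a periodic value onto the unit circle, so that hour 23 and hour 0 end up adjacent in feature space (a minimal sketch; the helper name is mine):

```python
import math

def cyclical_encode(value, period):
    """Encode a cyclic feature (e.g. hour of day with period=24)
    as (sin, cos) coordinates on the unit circle, so values at the
    wrap-around point are close rather than maximally distant."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)
```

A linear encoding would place hours 0 and 23 at opposite ends of the range; the circular encoding preserves their actual proximity for the model.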

Credit Default Prediction

  • Engineered max delinquency, delinquency streak, and credit utilisation from 29,461 customer records.
  • Preprocessing: median imputation, deduplication, 60/40 split, SMOTE for class balancing.
  • Ensemble models: 73% accuracy and 0.76 AUC-ROC; 67% recall on true defaulters.
  • Deployed on 5,017 test records with a tuned 0.465 threshold for high-risk intervention.

XGBoost SMOTE Ensemble Scikit-learn
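The tuned threshold works by cutting predicted default probabilities at 0.465 instead of the default 0.5 (an illustrative helper, not the project's code):

```python
def classify_at_threshold(probs, threshold=0.465):
    # Lowering the decision threshold below 0.5 flags more borderline
    # customers as high-risk, trading precision for recall on true
    # defaulters -- the right trade when missed defaults are costly.
    return [1 if p >= threshold else 0 for p in probs]
```

A customer scored at 0.47 is flagged under the tuned threshold but would be passed over at the default 0.5.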

Stock Sentiment Analysis

  • Python pipeline to ingest, clean, and convert financial news into sentiment features.
  • Trained and tuned ML classifiers to predict market direction from news sentiment.
  • Backtested with QuantStats on historical price data: 10% annual return, 1.25 Sharpe ratio.
  • Scalable backtesting engine processing intraday OHLC data with execution slippage benchmarking.

NLP QuantStats Backtesting Scikit-learn
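The Sharpe ratio reported above is, in its standard form, the mean excess return over its standard deviation, scaled by the square root of periods per year (a generic sketch, not the QuantStats internals):

```python
import math
import statistics

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    # Per-period excess returns relative to the risk-free rate.
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mu = statistics.mean(excess)
    sigma = statistics.stdev(excess)  # sample standard deviation
    return (mu / sigma) * math.sqrt(periods_per_year)
```

A ratio of 1.25 means the strategy earned 1.25 units of annualised excess return per unit of volatility taken.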

Technical Skills

Languages & Frameworks

Python C++ SQL PyTorch FastAPI

MLOps & Tools

AWS S3/EC2 Docker MLflow DVC Git/GitHub CI/CD

Data Science

NumPy Pandas Scikit-learn XGBoost LightGBM SHAP

Deep Learning

CNNs RNNs Transformers

NLP

BERT LoRA/PEFT LLMs RAG Hugging Face

Computer Vision

CLIP Vision Transformers Stable Diffusion

Databases

MongoDB MySQL PostgreSQL

Advanced Concepts

Chain of Thought Tool Calling Agentic Workflows MoE

Open-Source

nanoVLM — Removed Dead lm_eos_token_id Parameter

PR #138 · huggingface/nanoVLM

While reading through config.py in Hugging Face's nanoVLM repository, I found a field that was silently wrong in three distinct ways:

  • Factually incorrect: lm_eos_token_id: int = 0 hardcoded EOS as token ID 0, but SmolLM2-360M-Instruct actually uses EOS token ID 2.
  • Dead code: The field was never read. All EOS logic in the codebase was already handled via self.tokenizer.eos_token_id, so this config entry never executed.
  • Misleading to contributors: Its presence implied the EOS token was configurable through config.py, which it wasn't — a subtle trap for anyone extending the model.

I removed the unused field and updated the documentation to reflect that EOS handling is driven exclusively by the tokenizer. This eliminates a source of confusion and reduces the risk of silent misconfiguration in downstream forks.
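The fix leaves EOS handling where it already lived: the tokenizer. A hypothetical sketch of tokenizer-driven stopping (not nanoVLM's actual generation loop):

```python
def generate_until_eos(next_token, eos_token_id, max_new_tokens=32):
    # Stop when the tokenizer's EOS id appears, rather than trusting a
    # hardcoded config constant that may disagree with the tokenizer
    # (here, the config said 0 while SmolLM2's tokenizer uses 2).
    out = []
    for _ in range(max_new_tokens):
        tok = next_token()
        if tok == eos_token_id:
            break
        out.append(tok)
    return out
```

With a single source of truth, swapping in a different backbone tokenizer cannot silently break stopping behaviour.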

Hugging Face Vision-Language Models Config Cleanup

Get In Touch

Open to research collaborations, internships, and discussions on AI/ML.