I'm Kshitij Gupta
Machine Learning Engineer
Specializing in Large Language Models, Deep Learning, and Production ML Systems
I am a Machine Learning Engineer at Chubb, where I design, optimize, and deploy large-scale
LLM-powered systems in production. I graduated from BITS Pilani, Pilani Campus with a
Bachelor’s degree in Electrical and Electronics Engineering.
My work focuses on large language models, agentic AI systems, and high-performance inference.
I have built end-to-end fine-tuning pipelines for LLaMA-3.1 (70B) using PEFT techniques
such as LoRA and QLoRA, combined with RAG over internal domain data. These systems improved
task accuracy by 25%, reduced production drift by 15%, and reliably serve over 10,000 daily
requests under peak load.
I specialize in scalable inference and deployment, using vLLM on Kubernetes (AKS) across
A100 and H100 GPUs. Through KV-cache and GPU memory optimizations, I reduced p95 latency by
40% and increased throughput by 50%, while maintaining strict SLA guarantees. My work also
includes production-grade APIs, CI/CD pipelines, monitoring, and automated scaling for
ML systems.
Previously, I was an NLP Research Intern at Nanyang Technological University, where I worked
on code-switching language models and published at ACIIDS and IALP. My research has also been
accepted through ACL Rolling Review (ARR) and at AACL-IJCNLP. I enjoy building systems that combine
strong theoretical grounding with real-world impact, spanning agentic hiring copilots,
applied NLP research, and large-scale AI infrastructure.
• Fine-tuning at scale (LLaMA-3.1 70B): Built an end-to-end pipeline with PEFT (LoRA/QLoRA) + RAG over internal domain data, improving task accuracy on internal benchmarks by 25% and reducing production drift by 15% over the evaluation window; governed by offline holdout tests and progressive traffic gating.
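The parameter efficiency behind LoRA can be sketched in a few lines: the update to a frozen weight matrix W is factored as ΔW = (α/r)·BA, so only the two low-rank factors A and B are trained. A minimal pure-Python illustration (the production pipeline uses Hugging Face PEFT; the matrix sizes and values below are made up for demonstration):

```python
# Illustrative LoRA merge: W' = W + (alpha / r) * (B @ A).
# Pure-Python sketch; a real pipeline would use the Hugging Face `peft` library.

def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    """Trainable parameters: full fine-tuning vs. a rank-r LoRA adapter."""
    full = d * k        # every entry of the d x k matrix W is trainable
    lora = r * (d + k)  # only A (r x k) and B (d x r) are trainable
    return full, lora

def apply_lora(W, A, B, alpha: float, r: int):
    """Merge a LoRA adapter into a frozen weight matrix (nested lists)."""
    d, k = len(W), len(W[0])
    scale = alpha / r
    # delta[i][j] = scale * sum_t B[i][t] * A[t][j]
    return [
        [W[i][j] + scale * sum(B[i][t] * A[t][j] for t in range(r)) for j in range(k)]
        for i in range(d)
    ]

full, lora = lora_param_counts(d=8192, k=8192, r=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full / lora:.0f}x")
```

For a single 8192x8192 projection at rank 16, the adapter trains roughly 256x fewer parameters than full fine-tuning, which is what makes 70B-scale adaptation tractable.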
• Agentic AI (orchestration & planning): Integrated a multi-agent workflow directly into the same application. A planner/router decomposes user intents into sub-goals, selects tools (internal search, scraping, structured DB lookups) based on a question taxonomy, and emits step-by-step CoT plans to guide execution.
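The routing step above can be sketched as a small classifier over a question taxonomy that maps each intent category to an ordered tool list. This is an illustrative stand-in only; the category names, cue phrases, and tool names below are hypothetical, and the real planner emits richer CoT plans:

```python
# Minimal planner/router sketch: classify an intent, then emit tool-call steps.
# Categories, cues, and tool names are hypothetical placeholders.

TAXONOMY = {
    "lookup":   ["db_lookup"],                  # structured facts -> database
    "research": ["internal_search", "scrape"],  # open-ended -> search, then scrape
    "compare":  ["internal_search", "db_lookup"],
}

KEYWORDS = {
    "lookup":  ("how many", "what is the", "status of"),
    "compare": ("compare", "versus", "difference"),
}

def classify(intent: str) -> str:
    text = intent.lower()
    for category, cues in KEYWORDS.items():
        if any(cue in text for cue in cues):
            return category
    return "research"  # default bucket for anything unmatched

def plan(intent: str) -> list[str]:
    """Decompose an intent into an ordered list of tool-call steps."""
    category = classify(intent)
    steps = [f"step {i + 1}: call {tool}" for i, tool in enumerate(TAXONOMY[category])]
    return [f"classified as '{category}'"] + steps
```

In practice the classification would itself be an LLM call, but keeping the taxonomy-to-tools mapping explicit makes routing decisions easy to audit.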
• Agentic AI (evidence retrieval & verification): Implemented retrieval/scrape agents for structured and unstructured sources with source-level citation tracking, plus a verification agent that runs CoT-based cross-checks. This improved factual grounding vs. a single-agent baseline by 18% while keeping response times within the application SLA through caching and bounded tool-use.
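The caching and bounded tool-use pattern mentioned above can be sketched as follows; the budget value and the `fetch_evidence` stand-in are hypothetical, but the shape (memoized tool calls plus a hard per-request cap) is the mechanism that keeps verification inside the SLA:

```python
# Sketch of bounded, cached tool use for a verification agent.
from functools import lru_cache

MAX_TOOL_CALLS = 3  # hypothetical per-request budget that bounds added latency

@lru_cache(maxsize=1024)
def fetch_evidence(source: str, claim: str) -> str:
    # Stand-in for a retrieval/scrape tool call; lru_cache means a repeated
    # (source, claim) pair costs nothing on later requests.
    return f"[{source}] supporting text for: {claim}"

def verify(claim: str, sources: tuple[str, ...]) -> dict:
    """Cross-check a claim against sources, citing each one, under a tool budget."""
    citations = [fetch_evidence(s, claim) for s in sources[:MAX_TOOL_CALLS]]
    return {"claim": claim, "citations": citations, "tool_calls": len(citations)}
```

Returning the citations alongside the verdict is what gives each answer source-level traceability.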
• High-performance inference & deployment: Productionized with vLLM on AKS across A100/H100 nodes; GPU memory/KV cache optimizations cut p95 latency by 40% and lifted throughput by 50%, reliably serving 10K+ daily requests under peak load.
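For context on the latency figures above: p95 is a tail metric computed over per-request timings, not an average, which is why KV-cache and memory optimizations show up most clearly there. A self-contained sketch of the nearest-rank method (the sample latencies are made up):

```python
# Computing a tail-latency percentile with the nearest-rank method.
import math

def percentile(samples: list[float], p: float) -> float:
    """Smallest sample value such that at least p% of samples are <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [120, 80, 95, 400, 110, 105, 90, 85, 300, 100]
print("p50:", percentile(latencies_ms, 50), "ms  p95:", percentile(latencies_ms, 95), "ms")
```

Note how a handful of slow requests dominates p95 while barely moving the median, so a 40% p95 cut reflects real tail behavior, not just faster averages.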
• Integrated CI/CD, monitoring, and automated scaling: Established robust processes that ensure continuous model improvement and reliable production deployments across cloud-based environments.
• Architected scalable data pipelines: Implemented robust data processing solutions using SQL Server, Azure Databricks, and PySpark, cutting processing times by 30% and significantly enhancing overall system performance.
• Developed a language model tailored for English-Malay code-switched data, achieving a 20% accuracy improvement over baseline models by applying statistical and neural data-augmentation techniques.
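One common form of statistical augmentation for code-switched data is lexical substitution: synthesizing mixed-language sentences by swapping words through a bilingual lexicon at a controlled rate. A toy sketch of the idea (the tiny English-Malay lexicon and switch rule here are hypothetical stand-ins, not the published method):

```python
# Toy lexical-substitution augmentation for code-switched text.
import random

EN_MS = {"eat": "makan", "want": "nak", "already": "sudah"}  # illustrative lexicon

def augment(sentence: str, switch_prob: float, seed: int = 0) -> str:
    """Replace known English tokens with Malay equivalents at rate switch_prob."""
    rng = random.Random(seed)  # seeded for reproducible augmentation
    out = []
    for tok in sentence.split():
        if tok.lower() in EN_MS and rng.random() < switch_prob:
            out.append(EN_MS[tok.lower()])
        else:
            out.append(tok)
    return " ".join(out)
```

Varying `switch_prob` lets the training set cover a spectrum from monolingual to heavily code-switched sentences.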
• Integrated linguistically informed features, including part-of-speech tagging and grammatical-coherence constraints, improving robustness across diverse code-switching patterns and advancing the state of the art in code-switching language processing.
• Contributed to bilingual communication technologies by applying modern machine learning techniques to code-switching scenarios.
Assemble AI is my open-source initiative for applying AI to real-world problems. Through this platform, I build LLM-powered tools that demonstrate practical applications of modern AI, from personalized content generation to intelligent data processing.
SmartChef is an intelligent recipe-generation system that uses GPT and NLP techniques to create custom recipes from available ingredients, cuisine preferences, dietary restrictions, and nutritional needs.
Technologies: GPT, OpenAI API, Python, NLP, Docker, AWS
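As a rough illustration of the kind of prompt assembly such a system performs before calling the OpenAI API (the function name and field layout below are hypothetical, not SmartChef's actual code):

```python
# Hypothetical sketch: assembling a recipe-generation prompt from user constraints.

def build_recipe_prompt(ingredients, cuisine=None, dietary=(), nutrition=None) -> str:
    """Combine available constraints into a single instruction for the model."""
    parts = [f"Create a recipe using: {', '.join(ingredients)}."]
    if cuisine:
        parts.append(f"Cuisine: {cuisine}.")
    if dietary:
        parts.append(f"Dietary restrictions: {', '.join(dietary)}.")
    if nutrition:
        parts.append(f"Nutritional target: {nutrition}.")
    return " ".join(parts)
```

Keeping each constraint optional means the same builder handles a bare ingredient list and a fully specified request.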
HawkHire is an end-to-end AI-powered hiring copilot that assists recruiters and interview panels with resume normalization, explainable job description matching, and evidence-backed interview analysis. It combines multi-agent orchestration with structured reasoning to deliver auditable, high-confidence hiring decisions.
Technologies: GPT, OpenAI API, Python, Multi-Agent Systems, RAG, NLP, Data Processing
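One way explainable matching can work, sketched here with a simple Jaccard overlap over extracted skills (illustrative only; HawkHire's actual matching logic is richer and LLM-assisted):

```python
# Illustrative explainable resume-to-JD match: score plus the evidence behind it.

def match_score(resume_skills: set[str], jd_skills: set[str]) -> tuple[float, dict]:
    """Jaccard similarity over skill sets, with matched/missing skills exposed."""
    overlap = resume_skills & jd_skills
    union = resume_skills | jd_skills
    score = len(overlap) / len(union) if union else 0.0
    # Returning the evidence alongside the score keeps the decision auditable.
    return score, {"matched": sorted(overlap), "missing": sorted(jd_skills - resume_skills)}
```

Surfacing the matched and missing skills, not just a number, is what makes the recommendation auditable for a hiring panel.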
Interested in collaborating on LLM tools or learning more about these projects?
Python, Java, C, C++, C#, JavaScript
PyTorch, TensorFlow, Transformers, Hugging Face, vLLM, Keras
AWS, Azure, Docker, Kubernetes, Git, GitHub
Flask, Spring Boot, Databricks, Maven, LaTeX
SQL, RDS
Large Language Models, NLP, Computer Vision, MLOps