Koushik Vasa

5+ years building production-grade agentic systems end-to-end — RAG pipelines, multi-agent orchestration, and inference optimization across healthcare and B2B SaaS.

OPEN TO WORK   •   📍 Hauppauge, NY   •   ✓ OPT Work Authorized

Building AI that works in production

I build AI systems that don’t just demo well — they ship, scale, and hold up in production.

With 5+ years across Generative AI, RAG pipelines, and multi-agent systems, I’m currently the sole AI engineer at Bridge AI and previously worked at Capgemini and CitiusTech, spanning B2B SaaS and healthcare.

I’m comfortable across the full stack, from embeddings and FastAPI to React and prompt orchestration, and I care deeply about latency, reliability, and systems that deliver measurable impact.

Claude API · LangGraph · DSPy · RAG · FastAPI · Azure OpenAI · pgvector · Docker

Education

George Mason University
Master of Science in Computer Science (Machine Learning)
Jan 2024 – Dec 2025
GPA: 3.87
SRM University AP, India
Bachelor of Technology in Computer Science and Engineering, Machine Learning
Jun 2019 – May 2023
GPA: 3.6

Work Experience

AI/ML Engineer
Bridge AI · USA
Aug 2025 – Present
  • Built Bridgette, an agentic AI assistant for a B2B marketplace, replacing a 10+ step workflow (search → filter → invite → negotiate) with a single conversational interface.
  • Designed a multi-step tool-calling agent using Claude (Anthropic API) and DSPy for structured prompt optimization and execution control.
  • Designed agent decision framework enabling autonomous execution across retrieval, structured queries, and response generation.
  • Architected production dual-agent system (companies vs. experts) supporting workflows such as expert matching, engagement creation, and earnings analysis via natural language.
  • Developed 8 production-grade tool schemas with strict runtime validation (Zod), ensuring reliable execution across agent decision and tool-calling layers.
  • Improved end-to-end agent task success rate from ~62% to ~84% by identifying and fixing gaps across retrieval, reasoning, and execution layers.
  • Reduced hallucinated or incomplete responses by ~25% through validation layers and fallback mechanisms, evaluated across 500+ test queries.
  • Reduced average response time from ~1.2s to sub-400ms via prompt optimization and query-level caching.
  • Sole AI engineer — owned end-to-end system design covering agent architecture, tool orchestration, backend APIs, and frontend streaming integration.
AI Engineer
Capgemini · India
Apr 2021 – Dec 2023
  • Built document intelligence platform using hybrid RAG (Azure OpenAI + BM25), replacing manual document review for ~120 internal users across multiple teams.
  • Improved retrieval accuracy for enterprise document queries by ~32% by combining dense embeddings with BM25 re-ranking across 500K+ unstructured documents.
  • Designed LangGraph-based pipelines with clear separation of retrieval, reasoning, and response generation, improving response reliability by ~35%.
  • Developed FastAPI inference services handling 200+ concurrent requests with consistent sub-second response times.
  • Introduced confidence-based validation and context filtering that reduced hallucinated or irrelevant responses by ~27% based on internal evaluation benchmarks.
  • Cut inference latency by ~40% through prompt compression, caching, and retrieval pre-filtering.
  • Containerized services using Docker and implemented CI/CD pipelines, reducing deployment time from hours to ~15 minutes.
Machine Learning Engineer
CitiusTech Healthcare Technology Pvt. Ltd. · India
Aug 2019 – Mar 2021
  • Built Python/SQL pipelines processing 1M+ patient and claims records, reducing data preparation time by ~30% for downstream ML workflows.
  • Developed 30-day hospital readmission prediction models using XGBoost and Random Forest, achieving an AUC of ~0.82 and enabling early identification of high-risk patients.
  • Applied PCA and domain-driven feature engineering on clinical and claims data, improving model F1-score by ~12% and reducing feature dimensionality by ~40%.
  • Resolved recurring pipeline failures including data inconsistencies and job timeouts, reducing failure rates by ~35% while consistently meeting SLA timelines.

Featured Projects

Clinical AI
Sonus

AI-powered clinical voice intake for hospitals. Conducts structured pre-consultation interviews with patients, capturing symptoms and clinical context, then generates formatted SOAP notes ready for physicians.

Next.js 16 · TypeScript · Tailwind CSS · Vapi · Claude Sonnet · Tavily
Generative AI
ClearCare

AI Medicare Cost Navigator with a LangGraph 8-node multi-agent pipeline mapping patient symptoms to procedures and calculating real out-of-pocket costs across 2.8M+ CMS NPI providers.

Next.js 14 · FastAPI · LangGraph · GPT-4o · ElevenLabs · Supabase
AI Platform
MediConnect

Full-stack AI platform matching patients to 2.8M+ providers in under 500ms using custom stream processing over a 704MB dataset, with a 3D anatomy explorer and geolocation-based specialist ranking.

React · FastAPI · Gemini 2.5 Flash · SQLite · Supabase · Three.js
RAG System
CitationSleuth

Dual-layer LLM validation combining semantic retrieval and Neo4j graph traversal to detect hallucinations across 1,000+ HaluEval benchmark samples, with a real-time Streamlit interface enforcing a similarity threshold of 0.7 or higher.

Python · Neo4j · Hugging Face · Streamlit · Semantic Similarity
Machine Learning
Predicting Hospital Readmissions

ML classification pipeline predicting 30-day hospital readmissions for diabetic patients trained on 100,000+ records, applying PCA for dimensionality reduction with measurable evaluation metrics.

Python · Scikit-learn · XGBoost · Random Forest · PCA
Machine Learning
Car Temperature Analysis

Comparative ML pipeline evaluating KNN, Decision Tree, and Random Forest to forecast in-car temperature for a hardware-free system, using thermal imaging and data augmentation; Random Forest achieved an R² of 0.9997.

Python · KNN · Random Forest · Decision Tree · Thermal Imaging
Web Development
APSWC Government Portal

10+ dynamic UI pages for the Andhra Pradesh State Warehousing Corporation government portal achieving 25% faster page load speeds, full mobile responsiveness, and improved backend data access.

Angular · Bootstrap · HTML5 · CSS3 · SQL
Web Application
Travel Advisor

Full-stack travel planning app integrating Google Maps, OpenWeather, and TripAdvisor APIs using React.js, handling 500+ monthly requests with reliable concurrent data fetching and client-side state management.

React.js · Node.js · Google Maps API · OpenWeather API
UI/UX Design
Urban Pool

Complete UI/UX design and prototype for a modern carpooling application, focusing on intuitive user flows, scalable component architecture, and a clean mobile-first visual interface.

Figma

Technical Skills

Generative AI & LLMs
Agentic AI · RAG (Hybrid/Re-Ranking) · Embeddings · Semantic Search · Prompt Engineering · Tool Calling · Agent Orchestration
LLM Frameworks & Platforms
LangChain · LangGraph · DSPy · Azure OpenAI · Anthropic Claude · Hugging Face · OpenAI API
ML & Data Science
PyTorch · TensorFlow · Scikit-learn · NLP · Feature Engineering · XGBoost · Random Forest · BERT · PCA
Cloud & Infrastructure
Azure ML · Azure AI Search · Azure Databricks · AWS S3 · AWS Lambda · Docker · Kubernetes · CI/CD · GitHub Actions · Vercel
Backend & Data
FastAPI · Flask · Supabase · PostgreSQL · pgvector · RBAC · OAuth · MongoDB · SQL · Node.js
Frontend
React · Next.js · Vue.js · Angular · TypeScript · HTML5 · CSS3 · Tailwind CSS · Framer Motion
Languages
Python · TypeScript · SQL · Bash · JavaScript
Design & Other
Figma · UX Design · Agile · Leadership · Data Structures

Let’s Connect

Let’s build something remarkable

Open to full-time roles in Generative AI Engineering, LLM Systems, and Agentic AI.
Based in Hauppauge, NY · OPT Work Authorized