Senior AI Engineer/Tech Lead · Amsterdam, NL

Agentic systems, shipped to production.

Senior AI Engineer / Tech Lead / Agentic Systems & LLM Applications

9+ years building production Python systems — now shipping agentic AI products end-to-end. Model-agnostic across Claude, Gemini, GPT & Bedrock, designing agent architectures with the Claude Agent SDK, Vertex AI, LangChain & MCP: context engineering, tool use, orchestration, evals, guardrails and cost control.

Yash Gupta
[ 01 ] — PROFILE

From prototype
to production.

// model-agnostic
// picks per task on cost, latency & capability

I take AI features from a notebook to millions of users — across 7 European markets, with the architecture and reliability owned end-to-end.

My focus is the hard 80% that comes after a working prompt: agent orchestration, context engineering, evaluation against gold sets, guardrails and prompt-injection defenses, granular tracing, token/cost accounting, hard cost ceilings, and human-in-the-loop review. Building the model is a solved problem — making it dependable in a low-latency production pipeline is where the value is.

I'm model-agnostic by design: Anthropic Claude, Google Gemini on Vertex AI, OpenAI GPT, and AWS Bedrock-hosted Llama / Mistral, chosen per task on cost, latency and capability. I also lead teams, which covers HLD/LLD ownership, engineering standards, mentoring, and stakeholder alignment across cross-functional groups.
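A minimal sketch of what picking a model per task on cost, latency and capability can look like. It is illustrative only: the `ModelOption` fields, the `pick_model` helper, the placeholder model names and every number in the catalog are assumptions, not real prices or benchmarks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                 # placeholder label, not a real model ID
    usd_per_1k_tokens: float  # made-up price for illustration
    p95_latency_ms: int       # made-up latency for illustration
    capability: int           # 1 = basic, 3 = strongest reasoning (hypothetical scale)


def pick_model(options, min_capability, max_latency_ms):
    """Return the cheapest option that meets the capability and latency needs."""
    eligible = [
        m for m in options
        if m.capability >= min_capability and m.p95_latency_ms <= max_latency_ms
    ]
    if not eligible:
        raise LookupError("no model satisfies the task constraints")
    return min(eligible, key=lambda m: m.usd_per_1k_tokens)


# Hypothetical catalog; real entries would come from measured cost and latency.
CATALOG = [
    ModelOption("small", 0.001, 400, 1),
    ModelOption("medium", 0.005, 900, 2),
    ModelOption("large", 0.020, 2500, 3),
]
```

Filtering on hard constraints first and only then minimizing cost keeps the choice explainable: a task either fits a model or it does not, and price breaks the tie.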

[ 02 ] — SELECTED WORK

Things I've shipped.

// agents, plugins & products that run in the real world

001 Autonomous Agent

Backoffice Agent

Claude Agent SDK · AWS Bedrock · FastAPI · Next.js

An autonomous Jira-ticket-to-PR agent. It plans, implements in an isolated git worktree, verifies against per-repo tests / mypy / linters, pushes a branch, opens the PR, and comments back on the ticket, with inference running on AWS Bedrock.

  • 5-stage workflow with human gates at Plan / Diff / Push — control without micromanagement.
  • Hard per-run cost ceilings enforced in code: $1–25 budget, 100-turn cap.
  • Shrinks a 2–4 hour ticket loop to ~10 minutes of review.
Claude Agent SDK · AWS Bedrock · FastAPI · Next.js · Python · git worktrees · cost guardrails
Enterprise · Infosys — internal repo
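The per-run cost ceiling and turn cap mentioned for the Backoffice Agent could be enforced along these lines. This is a sketch under stated assumptions, not the internal implementation: `RunBudget`, `BudgetExceeded`, `start_turn` and `record_cost` are hypothetical names, and the caller is assumed to compute each turn's cost from its own token accounting.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run has reached its cost or turn ceiling."""


@dataclass
class RunBudget:
    max_usd: float        # hard spend ceiling for one run, e.g. a value in the $1–25 range
    max_turns: int = 100  # turn cap quoted in the project description
    spent_usd: float = 0.0
    turns: int = 0

    def start_turn(self) -> None:
        """Call before every model turn; refuses to start once a ceiling is hit."""
        if self.turns >= self.max_turns:
            raise BudgetExceeded(f"turn cap reached ({self.max_turns})")
        if self.spent_usd >= self.max_usd:
            raise BudgetExceeded(
                f"cost ceiling reached (${self.spent_usd:.2f} of ${self.max_usd:.2f})"
            )
        self.turns += 1

    def record_cost(self, cost_usd: float) -> None:
        """Book the cost of the turn that just finished."""
        self.spent_usd += cost_usd
```

Checking the ceiling before a turn starts, rather than after it returns, means the agent loop stops in plain code even if the model itself would happily keep going.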
002 Claude Plugin

Backoffice Brain

Claude plugin · knowledge tree · MCP · Python

A Claude plugin that packages a ~254-service enterprise back-office codebase into 12 invokable skills — plan-ticket, find-service, implement, verify, prepare-pr… — backed by a ~280-file knowledge tree with per-service leaves.

  • Skills cite source by file:line and refuse to invent facts when context is missing.
  • Knowledge-graph retrieval so engineers drive ticket work end-to-end without re-reading the codebase each session.
  • Turns tribal knowledge of a sprawling platform into a queryable, trustworthy assistant.
Claude plugin · MCP · knowledge graph · RAG · Python · grounded citations
Enterprise · Infosys — internal repo
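The "cite by file:line or refuse" behavior described for Backoffice Brain can be approximated with a post-check on the model's draft. The sketch below is an assumption about one possible shape, not the plugin's code: the `[path:line]` citation format, the `grounded_answer` helper, the `ABSTAIN` message and the example path are all hypothetical.

```python
import re

# Hypothetical citation format: [relative/path.py:123]
CITATION = re.compile(r"\[([\w./-]+):(\d+)\]")

ABSTAIN = "Not enough context to answer."


def grounded_answer(draft: str, loaded_files: dict[str, int]) -> str:
    """Return the draft only if every citation points into loaded context.

    loaded_files maps each file path in context to its line count.
    A draft with no citations, or with a citation outside the loaded
    files, is replaced by an abstention.
    """
    citations = CITATION.findall(draft)
    if not citations:
        return ABSTAIN
    for path, line in citations:
        if path not in loaded_files:
            return ABSTAIN
        if not 1 <= int(line) <= loaded_files[path]:
            return ABSTAIN
    return draft
```

For example, with `{"billing/service.py": 120}` loaded, a draft citing `[billing/service.py:42]` passes through unchanged, while one citing line 500 or citing nothing is replaced by the abstention message.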
003 GenAI · Google Cloud

Collateral Engine

Vertex AI Gemini · Cloud Run · FastAPI · Firestore · Pub/Sub

A production GenAI service live on Google Cloud that turns two companies’ PDFs into factually-grounded, layout-ready B2B marketing collateral — grounded generation with citations, deployed on Cloud Run with CI/CD.

  • Every factual claim cites a source-tagged fact from the parsed briefs; the writer abstains rather than invents — verified by a claim-level faithfulness check.
  • Word limits & layout enforced by a deterministic validate-and-repair loop in code, not by trusting the model; prompt-injection hard gate refuses poisoned PDFs, fail-closed.
  • Cloud Run · Pub/Sub + DLQ · Firestore · GCS with CMEK · Terraform · Cloud Build CI/CD — 51 offline tests, measured cost ~$0.01 per article.
Vertex AI · Gemini · Cloud Run · Firestore · Pub/Sub · Terraform · Cloud Build · FastAPI
View on GitHub
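A deterministic validate-and-repair loop for word limits, as described for the Collateral Engine, might look roughly like this. It is a simplified sketch and not the repository's code: `validate`, `repair` and `fit_to_layout` are hypothetical names, and the repair strategy (drop trailing sentences, then hard-truncate) is an assumption chosen for illustration.

```python
import re


def validate(text: str, max_words: int) -> list[str]:
    """Return a list of layout violations; an empty list means the text fits."""
    count = len(text.split())
    if count > max_words:
        return [f"{count} words exceeds limit of {max_words}"]
    return []


def repair(text: str, max_words: int) -> str:
    """Drop trailing sentences until the text fits; hard-truncate as a last resort."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    while len(sentences) > 1 and len(" ".join(sentences).split()) > max_words:
        sentences.pop()
    words = " ".join(sentences).split()
    return " ".join(words[:max_words])


def fit_to_layout(text: str, max_words: int, max_rounds: int = 3) -> str:
    """Validate, repair and re-validate in code instead of trusting the model."""
    for _ in range(max_rounds):
        if not validate(text, max_words):
            return text
        text = repair(text, max_words)
    if validate(text, max_words):
        raise ValueError("text still violates the layout after repair")
    return text
```

Because both steps are plain code, the same input always yields the same output, and a failure surfaces as an exception rather than as an overlong block in the finished layout.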
004 Machine Learning · Case Study

Sensor Diagnostics

scikit-learn · semi-supervised · Python

1,600 machine breakdowns, 20 sensors, only 40 expert labels. Proved the raw data has no cluster structure, found the one sensor carrying the failure signal, and propagated labels with honest, leak-free validation.

  • EDA first: silhouette ≈ 0.04 at every k, HDBSCAN flags 100% noise — naïve clustering provably fails; ANOVA isolates the signal to a single sensor (F ≈ 25).
  • Label Spreading on the selected subspace under nested cross-validation — reports the leak-free 0.50, and shows the leaky 0.60 as the trap it is.
  • Calibrated confidence + an “unknown” bucket so novel failure modes route to experts instead of being forced into 3 classes; production design on Vertex AI Pipelines.
scikit-learn · Label Spreading · nested CV · permutation tests · calibration · Vertex AI (design)
Case study · deck & technical report on request
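The "unknown" bucket in the Sensor Diagnostics case study amounts to thresholding calibrated class probabilities. A minimal sketch, assuming the probabilities are already calibrated: the `route` helper, the 0.7 threshold and the class names in the example are hypothetical and do not come from the report.

```python
def route(probabilities: dict[str, float], threshold: float = 0.7) -> str:
    """Return the predicted class, or "unknown" when confidence is too low.

    probabilities maps each known failure class to its calibrated probability.
    Low-confidence cases are sent to expert review instead of being forced
    into one of the known classes.
    """
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return label if confidence >= threshold else "unknown"
```

With hypothetical classes, `route({"bearing": 0.9, "motor": 0.05, "belt": 0.05})` returns `"bearing"`, while a flat distribution such as `{"bearing": 0.4, "motor": 0.35, "belt": 0.25}` returns `"unknown"`.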
005 Product · PWA

Helios°

React · TypeScript · Vite · Framer Motion · PWA

Precision energy intelligence for residential solar. A mobile-first PWA that turns raw inverter telemetry into insights a homeowner can act on — when to run the dishwasher, when to pre-charge the battery, which string is shading.

  • Live solar → home ↔ battery ↔ grid energy-flow visualization with real wattages.
  • 7-day production forecast from Open-Meteo shortwave-radiation modeling — no API key.
  • SunSpec Modbus telemetry, battery-strategy modes, white-labeling, installable & offline.
React 18 · TypeScript · Tailwind · Framer Motion · Recharts · SunSpec Modbus · PWA
View on GitHub
[ 03 ] — CAPABILITIES

The toolkit.

// depth in agentic AI, backed by 9 years of backend rigor

// 01

Models — model-agnostic

Claude Opus/Sonnet/Haiku · Gemini Pro/Flash · GPT-4 / 4o · Bedrock · Llama · Mistral
// 02

Agentic AI & LLMs

Claude Agent SDK · LangChain · MCP · tool use · orchestration · RAG · prompt engineering · evals · guardrails
// 03

LLMOps & Observability

tracing · token/cost accounting · cost ceilings · HITL review · regression tests · latency tuning
// 04

Languages & Backend

Python (expert) · Perl · SQL · Django / DRF · FastAPI · Flask · Next.js · microservices · Pytest
// 05

Cloud — AWS & GCP

Cloud Run · Vertex AI · Firestore · Pub/Sub · Cloud Build · Bedrock · EC2 / S3 / Lambda · Docker · Kubernetes
// 06

Data & Infrastructure

Apache Airflow · Kafka · PostgreSQL · BigQuery · Grafana · Prometheus · Jenkins · Terraform
// 07

Leadership

tech lead · HLD / LLD · engineering standards · mentoring · stakeholder alignment
// 08

Forward-deployed delivery

embedded with the client · prototype → production · demos & RFPs · pilot pitching · enterprise integration
[ 04 ] — EXPERIENCE

The track record.

// Infosys Limited
// 2016 → present

NOV 2021 — PRESENT
Amsterdam, NL

Technology Lead / Senior AI Engineer

Infosys Limited — Amsterdam

  • Lead design & delivery of AI-powered product features across 7 European markets, serving millions of users; own architecture and reliability end-to-end.
  • Architected agentic systems that turn Jira tickets into reviewed Bitbucket PRs, with human gates and hard cost ceilings enforced in code.
  • Built knowledge-graph retrieval over a large enterprise codebase so AI assistants cite source by file:line and refuse to answer when context is missing.
  • Operate Apache Airflow pipelines handling 300–500+ DAG runs/day across batch and real-time AI workloads; productionized low-latency inference APIs.
  • Shipped video-intelligence features (e.g. skip intro/outro) across millions of playback sessions; designed scalable microservices on Docker & Kubernetes.

Python · Claude Agent SDK · LangChain · MCP · AWS Bedrock · FastAPI · Next.js · Airflow · Kafka · Docker · Kubernetes

JAN 2018 — OCT 2021
New Delhi, IN

Senior Systems Engineer

Infosys Limited — New Delhi

  • Built & maintained backend systems and RESTful APIs for data-intensive applications in Python (Django) and Perl (Catalyst).
  • Developed a traffic-simulation framework for resilience and load testing under high network traffic.
  • Modernized and modularized legacy codebases, reducing operating cost and improving long-term maintainability.

Python · Perl · Django · Catalyst · REST · MySQL · Docker · Jenkins · Git

JUN 2016 — JAN 2018
Bengaluru, IN

Systems Engineer

Infosys Limited — Bengaluru

  • Built automation scripts & data pipelines in Python and Shell, reducing manual workload and accelerating analytics workflows.
  • Evaluated technical feasibility of new system designs; proposed performance optimizations adopted into production.

Python · Perl · Shell · Linux · MySQL

[ 05 ] — EDUCATION & RECOGNITION

Foundations.

Degree

B.Tech, Computer Science & Engineering

SRM Institute of Science and Technology, India

2012 — 2016

Recognition

HackerRank — 5★

5-star ratings in Problem Solving and Python

Competitive programming

Let's talk

Have an agent
worth building?

yashgpt2894@gmail.com
LinkedIn ↗ GitHub ↗ Résumé ↓