What I actually do

I take ambiguous business problems and build production-grade AI solutions that span data ingestion, models, APIs, and user-facing apps. I focus on reliability, observability, and clear product value.

System Design & Architecture

  • Design end-to-end RAG platforms with retrievers, rerankers, and efficient caching.
  • Build microservice APIs (FastAPI / Gin) with strong contracts and observability.
  • Run production orchestration with Docker, Kubernetes, CI/CD, and autoscaling patterns.

Modeling & Evaluation

  • Fine-tuning transformers, prompt engineering, embeddings and retrieval tuning.
  • Designing evaluation pipelines: precision@k, recall, MRR and human-in-the-loop checks.
  • Optimization for latency and cost: quantization, caching, batched inference.
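
To make the retrieval metrics named above concrete, here is a minimal sketch of precision@k, recall@k, and MRR in plain Python. The function names and the document IDs are hypothetical and chosen only for illustration; this is not taken from any specific project or library.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant.

    Divides by k even when fewer than k results were returned,
    so short result lists are penalized.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked results, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical query: the retriever returned four documents, two are relevant.
    ranked = ["d3", "d1", "d7", "d2"]
    relevant = {"d1", "d2"}
    print(precision_at_k(ranked, relevant, k=2))
    print(recall_at_k(ranked, relevant, k=4))
    print(mean_reciprocal_rank([(ranked, relevant), (["a"], {"a"})]))
```

In practice these numbers would be computed over a labeled query set and tracked alongside human-in-the-loop review, since automated metrics alone can miss answer-quality problems.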

Data & Pipelines

  • Scalable data ingestion (web pages, PDFs, SharePoint), preprocessing, and update jobs.
  • Vectorization, embedding pipelines and vector DB lifecycle (FAISS, Milvus, Qdrant).
  • Monitoring, alerting and data quality checks for production ML.

Delivery & Impact

  • End-to-end ownership: prototype → production → monitoring → iteration.
  • Mentoring engineers, performing architecture reviews, and raising engineering standards.
  • Building products with measurable outcomes (reduced call-hold time, operational savings).

Technical skills

Languages, frameworks and tooling I use daily.

Python, Go, SQL, JavaScript / TypeScript, FastAPI, React, Docker, Kubernetes, PyTorch, Transformers, RAG, FAISS / Milvus, Azure AI / AWS

Contact

Email me at contact@arungautham.dev
