Senior Data & AI Engineer

Data Architecture at Scale. AI Engineering in Practice.

17 years of enterprise delivery across Azure, Kafka, Spark, and LangChain — now building the next generation of AI-native data systems.

About Me

I'm a Senior Data & AI Engineer with 17 years of enterprise delivery experience. My career spans data architecture, real-time streaming, and AI engineering, for organisations across financial services, retail, and technology.

Today I focus on the intersection of enterprise data platforms and modern AI: building MCP servers, LangChain pipelines, and Claude Code workflows that make data teams dramatically more productive. The best AI systems are grounded in solid data engineering fundamentals.

Based in Arizona, US. Available for senior consulting and AI engineering leadership roles globally.

2008

Started the Data Engineering journey

2012

Designed first enterprise Data Warehouse

2015

First big data platform — Hadoop & Spark at scale

2017

First Kafka cluster and real-time pipeline in production

2026

Building AI-native data systems with LLMs & MCP

Skills

Data Engineering

  • Azure Data Factory
  • Azure Synapse
  • Databricks
  • dbt
  • Kafka
  • Spark
  • Talend
  • Informatica
  • Airflow

AI & LLMs

  • LangChain
  • MCP
  • Claude Code
  • Azure OpenAI
  • Custom GPTs
  • RAG Pipelines

Cloud & DevOps

  • Microsoft Azure
  • Vercel
  • Docker
  • GitHub Actions
  • Terraform

BI & Analytics

  • Power BI
  • Tableau
  • MicroStrategy
  • Azure Synapse
  • DAX
  • SQL

Flagship Projects

Selected enterprise deliveries with measurable business impact.

Finance RAG

Production-grade retrieval-augmented generation over SEC filings. Ask plain-English questions about 10-K and 10-Q disclosures and get answers grounded in cited source documents.

  • ~2.1s median latency
  • ~$0.02 per query
  • ≥0.85 RAGAS faithfulness
  • 12 SEC filings indexed
  • RAG
  • LLM
  • FAISS
  • OpenAI
  • FastAPI
  • Python
  • SEC EDGAR
  • Cohere
  • Langfuse
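
To give a feel for what "answers grounded in cited source documents" means, here is a minimal retrieval sketch in plain Python. It is not the project's code: the passages, the "ACME" filing names, and the helpers `retrieve` and `build_prompt` are hypothetical, and a bag-of-words cosine score stands in for the FAISS vector search listed above. No model is called; the sketch stops at the prompt an LLM would receive.

```python
import math
import re
from collections import Counter

# Hypothetical passages standing in for chunks of indexed filings.
PASSAGES = [
    {"source": "ACME 10-K 2023, Item 1A",
     "text": "Rising interest rates may increase our borrowing costs."},
    {"source": "ACME 10-Q Q2 2024, Item 2",
     "text": "Revenue grew due to higher subscription sales."},
    {"source": "ACME 10-K 2023, Item 7",
     "text": "Operating expenses increased because of hiring."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return up to k passages that share vocabulary with the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p["text"]))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(question, hits):
    """Number each retrieved passage so the answer can cite it as [n]."""
    context = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}" for i, p in enumerate(hits, 1)
    )
    return (
        "Answer using only the numbered sources and cite them.\n"
        f"{context}\nQuestion: {question}"
    )


question = "How do interest rates affect borrowing costs?"
hits = retrieve(question, PASSAGES)
prompt = build_prompt(question, hits)
```

Numbering the retrieved passages in the prompt is one common way to let the model's answer point back to specific filings; a production system would typically swap the toy scorer for embeddings and an index.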

Real-Time Enterprise Data Platform

End-to-end streaming platform on Apache Kafka and Azure Event Hubs, processing 50M+ events per day across distributed microservices with sub-100ms p99 end-to-end latency.

  • 50M+ events processed daily
  • Sub-100ms p99 latency
  • 12 source systems integrated
  • Apache Kafka
  • Azure Event Hubs
  • Apache Spark
  • Delta Lake
  • dbt
  • Azure
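
For readers less familiar with the p99 figure above: it means 99% of events complete within the stated time. A minimal sketch of how such a number can be computed from latency samples follows, using the nearest-rank method. The function name `percentile` and the sample values are illustrative assumptions, not measurements from the platform.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Made-up latencies in milliseconds, for illustration only.
latencies_ms = [12, 15, 11, 95, 14, 13, 250, 16, 12, 14]
p99 = percentile(latencies_ms, 99)
meets_target = p99 < 100  # hypothetical 100 ms target
```

With only ten samples the p99 is simply the slowest event, which is why tail-latency targets are usually evaluated over large windows of traffic.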

GitHub

Pinned repositories and open source work.

Get in Touch

Interested in working together? Send me a message and I'll get back to you within one business day.