Senior Data & AI Engineer
Data Architecture at Scale. AI Engineering in Practice.
17 years of enterprise delivery across Azure, Kafka, Spark, and LangChain — now building the next generation of AI-native data systems.
About Me
I'm a Senior Data & AI Engineer with 17 years of enterprise delivery experience. My career spans data architecture, real-time streaming, and AI engineering, with work for organisations across financial services, retail, and technology.
Today I focus on the intersection of enterprise data platforms and modern AI: building MCP servers, LangChain pipelines, and Claude Code workflows that make data teams dramatically more productive. The best AI systems are grounded in solid data engineering fundamentals.
Based in Arizona, US. Available for senior consulting and AI engineering leadership roles globally.
2008
Started my data engineering journey
2012
Designed first enterprise Data Warehouse
2015
First big data platform — Hadoop & Spark at scale
2017
First Kafka cluster and real-time pipeline in production
2026
Building AI-native data systems with LLMs & MCP
Skills
Data Engineering
- Azure Data Factory
- Azure Synapse
- Databricks
- dbt
- Kafka
- Spark
- Talend
- Informatica
- Airflow
AI & LLMs
- LangChain
- MCP
- Claude Code
- Azure OpenAI
- Custom GPTs
- RAG Pipelines
Cloud & DevOps
- Microsoft Azure
- Vercel
- Docker
- GitHub Actions
- Terraform
BI & Analytics
- Power BI
- Tableau
- MicroStrategy
- Azure Synapse
- DAX
- SQL
Flagship Projects
Selected enterprise deliveries with measurable business impact.
Finance RAG
Production-grade retrieval-augmented generation over SEC filings. Ask plain-English questions about 10-K and 10-Q disclosures and get answers grounded in cited source documents.
- ~2.1s median latency
- ~$0.02 per query
- ≥0.85 RAGAS faithfulness
- 12 SEC filings indexed
- RAG
- LLM
- FAISS
- OpenAI
- FastAPI
- Python
- SEC EDGAR
- Cohere
- Langfuse
Real-Time Enterprise Data Platform
End-to-end streaming platform on Apache Kafka and Azure Event Hubs, processing 50M+ events per day across distributed microservices with sub-100ms p99 end-to-end latency.
- 50M+ events processed daily
- Sub-100ms p99 latency
- 12 source systems integrated
- Apache Kafka
- Azure Event Hubs
- Apache Spark
- Delta Lake
- dbt
- Azure
GitHub
Pinned repositories and open source work.
Get in Touch
Interested in working together? Send me a message and I'll get back to you within one business day.