Senior Data & AI Engineer
Data Architecture at Scale. AI Engineering in Practice.
17 years of enterprise delivery across Azure, Kafka, Spark, and LangChain — now building the next generation of AI-native data systems.
About Me
I'm a Senior Data & AI Engineer with 17 years of enterprise delivery experience. My career spans data architecture, real-time streaming, and AI engineering — working with organisations across financial services, retail, and technology.
Today I focus on the intersection of enterprise data platforms and modern AI: building MCP servers, LangChain pipelines, and Claude Code workflows that make data teams dramatically more productive. The best AI systems are grounded in solid data engineering fundamentals.
Based in Arizona, US. Available for senior consulting and AI engineering leadership roles globally.
2008
Started the Data Engineering journey
2012
Designed first enterprise Data Warehouse
2015
First big data platform — Hadoop & Spark at scale
2017
First Kafka cluster and real-time pipeline in production
2026
Building AI-native data systems with LLMs & MCP
Skills
Data Engineering
- Azure Data Factory
- Azure Synapse
- Databricks
- dbt
- Kafka
- Spark
- Talend
- Informatica
- Airflow
AI & LLMs
- LangChain
- MCP
- Claude Code
- Azure OpenAI
- Custom GPTs
- RAG Pipelines
Cloud & DevOps
- Microsoft Azure
- Vercel
- Docker
- GitHub Actions
- Terraform
BI & Analytics
- Power BI
- Tableau
- MicroStrategy
- Azure Synapse
- DAX
- SQL
Flagship Projects
Selected enterprise deliveries with measurable business impact.
Finance RAG — Ask My 10-Ks
A production-grade retrieval-augmented generation system for querying SEC filings. Ask natural-language questions against 10-K and 10-Q disclosures and receive cited, structured answers in which every claim is traceable to a specific filing page. No hallucinated figures. No guesswork.
- ~2.1s median latency
- ~$0.02 per query
- ≥0.85 RAGAS faithfulness
- 1,260 chunks across 12 SEC filings
- 0–1 blended confidence score per answer
- RAGAS eval gate on every pull request
- RAG
- LLM
- FAISS
- OpenAI
- FastAPI
- Python
- SEC EDGAR
- Cohere
- Langfuse
- LangSmith
- LangChain
- RAGAS
- JWT
- Pydantic
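As an illustration of the pull-request eval gate mentioned above, here is a minimal sketch. It is not the project's actual code and does not call the RAGAS library. It assumes the per-sample faithfulness scores have already been computed elsewhere. The names `gate_passes` and `exit_code` are hypothetical, and averaging the scores is an assumption about how the 0.85 threshold is applied.

```python
from statistics import mean

# Threshold taken from the ">=0.85 RAGAS faithfulness" metric listed above.
FAITHFULNESS_THRESHOLD = 0.85


def gate_passes(scores, threshold=FAITHFULNESS_THRESHOLD):
    """Return True when the mean faithfulness score meets the threshold.

    `scores` is a list of per-sample faithfulness values in [0, 1],
    assumed to be produced by a separate evaluation step.
    """
    if not scores:
        # Fail closed: a run with no evaluated samples should not pass.
        return False
    return mean(scores) >= threshold


def exit_code(scores):
    """Map the gate result to a process exit code a CI job can use."""
    return 0 if gate_passes(scores) else 1
```

A CI step could pass the evaluated scores to `exit_code` and hand the result to `sys.exit`, so a run that falls below the threshold blocks the merge.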
Real-Time Enterprise Data Platform
End-to-end streaming platform on Apache Kafka and Azure Event Hubs, processing 50M+ events per day across distributed microservices with sub-100ms p99 end-to-end latency.
- 50M+ events processed daily
- Sub-100ms p99 latency
- 12 source systems integrated
- Apache Kafka
- Azure Event Hubs
- Apache Spark
- Delta Lake
- dbt
- Azure
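For readers unfamiliar with the p99 figure quoted above, the sketch below shows one common way to compute a percentile from latency samples, the nearest-rank method. It is illustrative only: the function name `percentile_nearest_rank` and the sample values are hypothetical, and the platform's own monitoring may compute p99 differently.

```python
def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method.

    `pct` is an integer from 1 to 100. The result is always one of the
    observed samples, never an interpolated value.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not isinstance(pct, int) or not 1 <= pct <= 100:
        raise ValueError("pct must be an integer between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_target(latencies_ms, target_ms=100):
    """Check whether p99 latency is strictly below the target (sub-100ms)."""
    return percentile_nearest_rank(latencies_ms, 99) < target_ms
```

With 100 samples, the p99 value is the 99th smallest observation, so a single slow outlier does not by itself break the target.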
GitHub
Pinned repositories and open-source work.
Get in Touch
Interested in working together? Send me a message and I'll get back to you within one business day.