
Senior Data Engineer
Anthony Sottile
Building enterprise data platforms, AI-powered analytics, and interactive experiences that turn complex data into decisive action.
About Me

I'm a data engineer who loves building things end-to-end — from designing lakehouse architectures that wrangle millions of rows, to crafting the React dashboards executives actually enjoy using. My sweet spot is turning messy, siloed data into governed platforms that drive real business decisions.
Before enterprise data, I built a baseball analytics SaaS and led analytics for USC Baseball. I still bring that same competitive, scrappy energy to every pipeline and dashboard I ship.
Data Sources Integrated
Executive Dashboards
Opportunities Surfaced
Pipeline Performance Gain
University of Southern California
B.S. Business Administration / Finance · Minor in Applied Analytics · GPA: 3.53
Dean's List · Cum Laude · USC Men's Basketball · USC Baseball
Technical Skills
Languages
Data Engineering
Cloud & DevOps
AI & ML
Visualization
Experience
TIDI Products – The Jordan Company
- Led enterprise BI strategy, architecting Microsoft Fabric as the company's analytics platform with a medallion data architecture integrating 15+ sources into a governed foundation supporting 30+ executive dashboards.
- Architected the company's first custom BI portal (Node.js, React, GraphQL) and MCP server with OpenAI integration, enabling executives to access dashboards, KPIs, and natural-language lakehouse queries through a unified interface.
- Built the company's first enterprise lakehouse in Microsoft Fabric from scratch, establishing standardized ingestion and transformation patterns across 15+ legacy systems.
- Developed PySpark/Spark SQL ELT pipelines that reduced legacy data refresh from 6+ hours to under 45 minutes.
KPMG LLP
- Led development of a digital command center for a $20B utility company; integrated data from SAP business warehouses into Power BI to deliver C-suite dashboards.
- Built Core Schedules, an automated M&A due diligence product using Power BI REST APIs and Databricks, cutting report generation from hours to minutes.
CloudHack.AI
- Founded a baseball analytics SaaS for two D1 college teams; delivered AI-powered scouting reports via Azure, Databricks, and Power BI that contributed to a 34-23-1 record.
- Architected end-to-end data infrastructure: automated PySpark ETL from FTP/NCAA sources to Azure Blob Storage and integrated the OpenAI API for personalized coaching insights.
USC Baseball
- Led student analytics team delivering Power BI scouting reports from Trackman data for 55 games across 4 months.
- Developed ad-hoc insights for coaches while supporting on-field operations including batting practice and pitcher sessions.
Data Pipeline Architecture
Interactive medallion lakehouse architecture built at TIDI Products
Baseball Analytics Dashboard
AI-powered scouting reports built for D1 college teams at CloudHack.AI
Pitch Velocity Distribution
Batting Average Trend
Player Skill Profile
Strike Zone Heatmap
Star Schema Explorer
Conformed dimensional models for financial and commercial analytics
Side Projects
Personal projects built on nights and weekends
ClawBot Deploy
One-click OpenClaw deployment platform. Deploy your personal AI assistant to Azure in minutes with a premium SaaS dashboard.
CloudHack Bets
AI-powered betting analytics platform with real-time arbitrage detection, Kelly Criterion sizing, and an LLM mathematician chat.
CloudHack Lead App
AI-driven outreach CRM with autonomous agents that handle prospecting, follow-ups, and response classification via a Kanban pipeline.
Ask My AI Assistant
Powered by OpenAI — ask anything about my experience, skills, or projects
Let's Connect
I'm always open to discussing data engineering, analytics, or new opportunities.