Senior Data Engineer

Anthony Sottile

Building enterprise data platforms, AI-powered analytics, and interactive experiences that turn complex data into decisive action.

About Me

Anthony Sottile

I'm a data engineer who loves building things end-to-end — from designing lakehouse architectures that wrangle millions of rows, to crafting the React dashboards executives actually enjoy using. My sweet spot is turning messy, siloed data into governed platforms that drive real business decisions.

Before enterprise data, I built a baseball analytics SaaS and led analytics for USC Baseball. I still bring that same competitive, scrappy energy to every pipeline and dashboard I ship.

0+

Data Sources Integrated

0+

Executive Dashboards

$0M+

Opportunities Surfaced

0×

Pipeline Performance Gain

University of Southern California

B.S. Business Administration / Finance · Minor in Applied Analytics · GPA: 3.53

Dean's List · Cum Laude · USC Men's Basketball · USC Baseball

Technical Skills

Click a skill to see where it was used

Languages

Data Engineering

Cloud & DevOps

AI & ML

Visualization

PythonSQLTypeScriptPySparkMicrosoft FabricDatabricksDelta LakeReactNode.jsDockerKubernetesPower BIOpenAILangChainGraphQLAzure DevOpsPostgreSQLKafkaPythonSQLTypeScriptPySparkMicrosoft FabricDatabricksDelta LakeReactNode.jsDockerKubernetesPower BIOpenAILangChainGraphQLAzure DevOpsPostgreSQLKafka

Experience

TIDI Products – The Jordan Company

Chicago, IL / Remote
Senior Data EngineerAugust 2025 – Present
  • Led enterprise BI strategy and vendor due diligence, selecting and architecting Microsoft Fabric as the company's analytics platform; designed medallion data architecture integrating 15+ sources (IBM Db2 ERP, Salesforce, EDI) into a governed foundation supporting 30+ executive dashboards.
  • Partner with Finance, Sales, and Supply Chain leadership to transform complex business challenges—including multi-tier distributor/end-user relationships, GPO contract structures, and manufacturing cost allocation—into scalable data products that drive enterprise decision-making.
Data Engineer (DP 600) / Data ScientistApril 2024 – July 2025
  • Led greenfield build of the company's first enterprise data platform from scratch, architecting storage and compute layers, establishing foundational data products and standardized ETL pipelines across 15+ legacy systems (IBM Db2, AS/400, Salesforce).
  • Developed and optimized PySpark/Spark SQL ELT pipelines leveraging distributed computing and Delta Lake (Parquet) storage, reducing legacy data refresh from 6+ hours to under 45 minutes and enabling daily executive reporting previously impossible with batch constraints.

KPMG LLP

Chicago, IL
Deal Advisory & Strategy Analytics AssociateJune 2021 – August 2021 / July 2022 – April 2024
  • Led development of a digital command center for a $20B utility company; integrated SAP business warehouses into PowerBI delivering C-suite dashboards for capital spending, workforce productivity, and project delivery KPIs.
  • Built Core Schedules, an automated M&A due diligence product using PowerBI REST APIs and Databricks—cutting insurance/payments deal report generation from hours to minutes.

CloudHack.AI

Chicago, IL
Founder & Solutions ArchitectJuly 2022 – April 2024
  • Founded CloudHack, a sports analytics SaaS serving 2 D1 college teams; architected data pipelines ingesting player performance and statistical data at scale, delivering predictive insights via Azure, Databricks, and PowerBI that contributed to a 34-23-1 record and our client's first 30-win season since 2015.
  • Architected end-to-end data infrastructure on Databricks and PostgreSQL: built dbt transformation models and PySpark ETL from FTP/NCAA sources to Azure Blob Storage, integrated OpenAI API for personalized coaching insights, and designed canonical datasets powering interactive dashboards.

USC Athletics (Baseball & Basketball)

Los Angeles, CA
Director of Analytics / Student ManagerJanuary 2019 – June 2022
  • Led student analytics team processing high-frequency Trackman player tracking data (velocity, spin rate, positional metrics) for 55 games; built automated data pipelines and self-serve dashboards enabling coaches to analyze player performance in real-time.

Data Pipeline Architecture

Interactive medallion lakehouse architecture built at TIDI Products

Baseball Analytics Dashboard

AI-powered scouting reports built for D1 college teams at CloudHack.AI

Pitch Velocity Distribution

Batting Average Trend

Player Skill Profile

Strike Zone Heatmap

Strike Zone
12
28
15
22
45
35
18
38
20
Low High

3D Pitch Spin Lab

Rendered with React Three Fiber + Three.js. The baseball uses a procedural seam texture. Spin axis and gyro values orient the 3D axis/ring, while spin rate drives rotation speed for each pitch.

CloudHack-style side-by-side spin comparison with interactive 3D movement.

Default View: angled perspective for easiest spin inspection.

Data source: local dummy dataset in spinData.ts. Each pitch includes spinRate (RPM), velocity (mph), spinEfficiency (%), IVB/HB (inches), spinAxis (degrees), and gyroAngle (degrees). Camera toggles switch to default, pitcher, or hitter perspective.
Pitch A
Pitch B
Pitch A | Sinker2236.7 RPM

Velocity

98.1 mph

Spin Eff

86%

IVB

11.8 in

Pitch B | Slider1958.7 RPM

Velocity

85.5 mph

Spin Eff

83.6%

IVB

1.6 in

Quick Comparison

Each card compares Pitch A vs Pitch B for one metric. Delta is calculated as B - A. Velocity, IVB, and HB use the selected pitch movement profile; spin rate reflects raw RPM from the same selected pitch records.

Spin Rate

A2236.7 RPM

B1958.7 RPM

-278.0 RPM

Velocity

A98.1 mph

B85.5 mph

-12.6 mph

IVB

A11.8 in

B1.6 in

-10.2 in

HB

A2.9 in

B-0.6 in

-3.5 in

Star Schema Explorer

Conformed dimensional models for financial, commercial, manufacturing, and baseball analytics

Ask My AI Assistant

Powered by OpenAI — ask anything about my experience, skills, or projects

ai-assistant — gpt-4o-mini

Try one of these to get started:

Let's Connect

I'm always open to discussing data engineering, analytics, or new opportunities.

Download Resume