Atlas — AI-Driven Diagnostics Data Mining & Triaging
Atlas replaced cumbersome event-by-event diagnostics analysis with trend-driven clarity at scale. I ideated, built, and deployed the platform across Ford, where it powers internal AI and prognostics initiatives and gives developers, testers, architects, managers, and planners the foundation to see, understand, and act on their defect landscape.
The Problem
Vehicle diagnostics generate enormous volumes of data: every event produces rich logs from multiple onboard computers, and Ford's platforms produce millions of events. The existing approach forced engineers to triage one event at a time, a process that was slow, manual, and blind to patterns. Everyone from developers to planners needed answers, but the tooling didn't scale and the insights were locked behind cumbersome workflows.
The Solution
Atlas is now used across the organization by developers, architects, managers, testers, and planners — democratizing diagnostics intelligence that was previously locked behind manual, expert-driven workflows. Engineers see a clear picture of which issues need resolution, which are already addressed, and when those fixes will reach vehicles in the field. Beyond immediate triage, Atlas serves as a foundation for Ford's internal AI and prognostics initiatives — its structured, trend-aware data pipeline feeds downstream models that predict emerging defect patterns and inform proactive quality decisions before issues reach the field at scale.
The Process
Discovery
Problem & Data Landscape
Identified that engineers across roles — developers, testers, architects, managers, planners — were all stuck in the same cumbersome loop: manually inspecting diagnostics events one at a time across thousands of data points from rich vehicle logs spanning multiple onboard computers.
Data source inventory · Triage workflow analysis · Stakeholder interviews
Architecture
Modular Pipeline Design
Designed a modular architecture with massively parallel data-fetching mechanisms and classical ML algorithms — deliberately avoiding blanket LLM usage where processing millions of tokens from vehicle logs would be prohibitively expensive. The focus was on trend visualization and pattern matching, not event-by-event inspection.
Pipeline architecture · ML model specifications · Cost analysis
Ship
Organization-Wide Deployment
Deployed Atlas internally with broad adoption across the organization — developers, architects, managers, testers, and planners all use it to visualize trends, understand the defect landscape, and plan burndown with clarity on what's resolved and when fixes reach the field.
Production deployment · Triage dashboard · Defect burndown reports
Architecture
Data Sources
Vehicle Logs & Defect DBs
Parallel Fetcher
Async Concurrent Retrieval
ML Triage Engine
Classification & Matching
Trend Dashboard
Visualization, Fixes & Field Timeline
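The four stages above can be read as a chain of interchangeable steps. The following is a minimal sketch of that modular idea, not Atlas's actual code: the stage functions, record fields, and the keyword check standing in for the triage engine are all hypothetical placeholders.

```python
from typing import Callable

# A stage takes a list of records and returns a new list of records.
Stage = Callable[[list[dict]], list[dict]]

def run_pipeline(records: list[dict], stages: list[Stage]) -> list[dict]:
    """Pass records through each stage in order; stages can be swapped independently."""
    for stage in stages:
        records = stage(records)
    return records

# Hypothetical placeholder stages (illustrative only).
def fetch(_: list[dict]) -> list[dict]:
    # Stand-in for the parallel fetcher: returns two made-up events.
    return [
        {"event_id": 1, "log": "can bus timeout"},
        {"event_id": 2, "log": "unknown fault"},
    ]

def triage(records: list[dict]) -> list[dict]:
    # Stand-in for the ML triage engine: a trivial keyword check.
    return [{**r, "matched": "timeout" in r["log"]} for r in records]

def summarize(records: list[dict]) -> list[dict]:
    # Stand-in for the dashboard: collapse events into aggregate counts.
    matched = sum(r["matched"] for r in records)
    return [{"matched": matched, "unmatched": len(records) - matched}]
```

Calling `run_pipeline([], [fetch, triage, summarize])` runs the stages in order; replacing any one stage leaves the others untouched, which is the point of the modular layout.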
How It Works
Fetch
Massively parallelized mechanisms retrieve thousands of diagnostics data points concurrently from multiple sources — rich vehicle logs spanning multiple onboard computers per event, across millions of events.
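One way to picture the concurrent retrieval described above is with Python's `asyncio`. This is a sketch under stated assumptions: `fetch_event` is a hypothetical stand-in for a real network call, and the source names, payload shape, and `max_in_flight` limit are illustrative rather than taken from Atlas.

```python
import asyncio

# Hypothetical stand-in for a real network call (for example a REST request);
# the simulated delay and payload shape are illustrative assumptions.
async def fetch_event(source: str, event_id: int) -> dict:
    await asyncio.sleep(0.01)  # simulate I/O latency
    return {"source": source, "event_id": event_id, "payload": f"log-{event_id}"}

async def fetch_all(sources: list[str], event_ids: list[int],
                    max_in_flight: int = 100) -> list[dict]:
    # A semaphore caps the number of requests in flight at any moment.
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(source: str, event_id: int) -> dict:
        async with limit:
            return await fetch_event(source, event_id)

    tasks = [bounded(s, e) for s in sources for e in event_ids]
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*tasks)

def run_fetch(sources: list[str], event_ids: list[int],
              max_in_flight: int = 100) -> list[dict]:
    """Synchronous entry point that drives the async fetch to completion."""
    return asyncio.run(fetch_all(sources, event_ids, max_in_flight))
```

For example, `run_fetch(["vehicle_logs", "defect_db"], list(range(50)), max_in_flight=10)` issues 100 simulated requests with at most 10 running at once. The cap matters in practice because unbounded concurrency can overwhelm the systems being queried.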
Analyze
Classical ML algorithms match incoming data against known defects and addressed fixes. No blanket LLM usage — the approach is cost-conscious and purpose-built for the scale and structure of vehicle diagnostics data.
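The source does not say which classical algorithms Atlas uses, so the sketch below swaps in one common choice as an assumption: TF-IDF vectors compared by cosine similarity, using scikit-learn. The defect catalog, IDs, log text, and threshold are all made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical known-defect catalog; IDs, signatures, and statuses are invented.
KNOWN_DEFECTS = [
    {"id": "DEF-1", "signature": "can bus timeout gateway module no response", "status": "fixed"},
    {"id": "DEF-2", "signature": "battery voltage low during crank engine start", "status": "open"},
    {"id": "DEF-3", "signature": "infotainment display reboot watchdog reset", "status": "open"},
]

def match_events(event_texts, known_defects, threshold=0.3):
    """Return the best-matching known defect per event, or None if below threshold."""
    vectorizer = TfidfVectorizer()
    # Learn the vocabulary from the known-defect signatures only.
    defect_matrix = vectorizer.fit_transform([d["signature"] for d in known_defects])
    # Project incoming events into the same vector space.
    event_matrix = vectorizer.transform(event_texts)
    scores = cosine_similarity(event_matrix, defect_matrix)

    results = []
    for row in scores:
        best = int(row.argmax())
        if row[best] >= threshold:
            defect = known_defects[best]
            results.append({"defect_id": defect["id"],
                            "status": defect["status"],
                            "score": float(row[best])})
        else:
            results.append(None)  # no known defect is similar enough
    return results
```

Calling `match_events(["gateway module can bus timeout observed", "tire pressure sensor unplugged"], KNOWN_DEFECTS)` matches the first event to `DEF-1` and leaves the second unmatched. A vectorize-and-compare approach like this costs a few sparse matrix operations per batch, which fits the cost argument made above against sending every log through an LLM.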
Visualize
The dashboard surfaces trends rather than individual events. It gives a clear picture of what needs resolution, what's already fixed, and when those fixes reach vehicles in the field, which enables planning and burndown across vehicle programs.
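Shifting from events to trends is largely an aggregation step. Here is a small pandas sketch of that idea, assuming triaged events arrive as rows with a date, a matched defect ID, and a status; the column names and sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical triaged events; dates, defect IDs, and statuses are illustrative only.
events = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-08", "2024-01-09", "2024-01-10"]
    ),
    "defect_id": ["DEF-1", "DEF-1", "DEF-1", "DEF-2", "DEF-2"],
    "status": ["open", "open", "fixed", "open", "open"],
})

def weekly_trend(df: pd.DataFrame) -> pd.DataFrame:
    """Count events per defect per week (weeks end on Sunday), one column per defect."""
    return (
        df.groupby([pd.Grouper(key="date", freq="W"), "defect_id"])
          .size()
          .unstack(fill_value=0)
    )

def open_vs_fixed(df: pd.DataFrame) -> dict:
    """Summarize how many events map to open versus fixed defects."""
    return df.groupby("status").size().to_dict()
```

`weekly_trend(events)` yields one row per week and one column per defect, a shape that plots directly as a trend line per defect, while `open_vs_fixed(events)` gives the kind of headline counts a burndown view would start from.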
Techniques
- ✓ Classical Machine Learning
- ✓ Parallel Data Fetching
- ✓ Automated Triage
- ✓ Trend Visualization
- ✓ Defect Pattern Matching
- ✓ Field Fix Correlation
Technologies
- ✓ Python
- ✓ Scikit-learn
- ✓ Pandas
- ✓ Async I/O
- ✓ REST APIs