Why We Built Our Model This Way
Empirical Evidence for Component-Based Risk Assessment
Published: February 2026 • Study Period: 2024-2025
The Question
What actually predicts which motor carriers will crash? Rather than relying on industry assumptions, we tested each potential predictor empirically using temporal validation on 242,969 graded carriers.
We measured Year 1 factors (2024) and checked whether they predicted Year 2 outcomes (2025), controlling for fleet size using Empirical Bayes shrinkage within fleet-size bands. This approach isolates true predictive signal from fleet-size confounding, and it produced some surprising findings that challenge conventional wisdom.
Our Model Formula:
Each weight is tuned for both predictive power (AUC) and size-fairness. Crash history is the dominant predictor, followed by behavioral violations, equipment condition, and severity indicators. All rates are Empirical Bayes-adjusted and normalized per 100,000 miles.
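Read together with the component weights reported in this study (56% crash, 18% behavioral, 14% equipment, 12% severity) and the description of the peer index as a weighted composite of peer-relative components, the formula can be sketched as follows. The symbols and the ratio-to-peer-mean construction are assumptions made for illustration, not the published definition.

```latex
\mathrm{PeerIndex}_i =
    0.56\,\frac{r_i^{\mathrm{crash}}}{\bar{r}_{b(i)}^{\mathrm{crash}}}
  + 0.18\,\frac{r_i^{\mathrm{behav}}}{\bar{r}_{b(i)}^{\mathrm{behav}}}
  + 0.14\,\frac{r_i^{\mathrm{equip}}}{\bar{r}_{b(i)}^{\mathrm{equip}}}
  + 0.12\,\frac{r_i^{\mathrm{sev}}}{\bar{r}_{b(i)}^{\mathrm{sev}}}
```

Here r_i^k is carrier i's EB-adjusted rate per 100,000 miles for component k, and the barred term is the average of that rate within the carrier's fleet-size band b(i). Under this form, a carrier that sits exactly at its peer average on every component scores 0.56 + 0.18 + 0.14 + 0.12 = 1.0.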
Do Grades Predict Future Crashes?
Year 2 (2025) Crash Rates by Safety Grade
242,969 graded carriers • EB-adjusted rates per 100k miles
Carriers graded Critical in Year 1 had a Year 2 EB-adjusted crash rate 3.35× that of Excellent carriers, after controlling for fleet size. Measured instead as the share of carriers with any Year 2 crash, the gap is 1.93×: 18.7% of Critical carriers crashed, compared to 9.7% of Excellent carriers.
Left number: EB-adjusted crash rate per 100k miles. Right number: % of carriers with any Year 2 crash.
What We Tested and What We Found
Our Methodology
For each potential predictor, we:
1. Measured Year 1 (2024): Recorded the predictor value for each carrier
2. Measured Year 2 (2025): Recorded their actual crash rate (per 100k miles)
3. Quintile Analysis: Divided carriers into 5 groups by predictor value
4. Calculated Relative Risk: Worst Quintile ÷ Best Quintile crash rates
Critical: All rates normalized per 100k miles and Empirical Bayes adjusted to account for fleet size. This ensures we're measuring true predictive power, not just correlation with fleet size.
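The four steps above can be sketched in a few lines of Python. This is a simplified illustration, not the study's pipeline: the function name and input layout are hypothetical, each quintile's rate is taken as an unweighted mean of carrier rates, and the input rates are assumed to be already EB-adjusted per 100k miles.

```python
from statistics import mean

def quintile_relative_risk(carriers):
    """Worst-quintile / best-quintile Year 2 crash rate.

    carriers: list of (year1_predictor, year2_crash_rate_per_100k) tuples.
    A lower predictor value is assumed to mean "better".
    Returns (relative_risk, per_quintile_mean_rates).
    """
    if len(carriers) < 5:
        raise ValueError("need at least 5 carriers to form quintiles")
    # Rank carriers by their Year 1 predictor value.
    ranked = sorted(carriers, key=lambda c: c[0])
    n = len(ranked)
    # Cut the ranked list into 5 nearly equal groups.
    bounds = [round(i * n / 5) for i in range(6)]
    quintiles = [ranked[bounds[i]:bounds[i + 1]] for i in range(5)]
    # Mean Year 2 crash rate within each quintile.
    rates = [mean(rate for _, rate in q) for q in quintiles]
    if rates[0] == 0:
        raise ValueError("best quintile has zero crash rate; RR is undefined")
    return rates[-1] / rates[0], rates

# Toy data: predictor values 0..9 with crash rates rising alongside them.
toy = [(i, 0.1 * (i // 2 + 1)) for i in range(10)]
rr, rates = quintile_relative_risk(toy)
print(round(rr, 2))
```

A relative risk near 1.0 would mean the predictor does not separate carriers; values below 1.0 indicate an inverse relationship.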
1. Crash Rate (EB-Adjusted)
Dominant Predictor → 56% Weight
Past crashes are the single strongest predictor of future crashes. Carriers with elevated crash rates in Year 1 consistently had elevated rates in Year 2. This signal is robust across all fleet sizes and carrier ages.
Why 56%? Crash rates are the most direct measure of what we're trying to predict. We use Empirical Bayes shrinkage (Gamma-Poisson model) to stabilize rates for small carriers — a single crash doesn't doom a carrier, because we blend observed rates toward the fleet-size peer group average. This means the signal is reliable for both 3-truck operations and 3,000-truck fleets.
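The shrinkage described above can be illustrated with the standard Gamma-Poisson posterior mean, (alpha + crashes) / (beta + exposure). This is a minimal sketch: the function name is hypothetical and the prior parameters are made-up values for a single size band, since the fitted priors are not given here.

```python
def eb_adjusted_rate(crashes, miles, prior_alpha, prior_beta):
    """Posterior mean crash rate per 100k miles under a Gamma-Poisson model.

    prior_alpha / prior_beta is the peer-group mean rate per 100k miles.
    prior_beta acts as the prior's weight, expressed in units of 100k miles.
    """
    if crashes < 0 or miles < 0:
        raise ValueError("crashes and miles must be non-negative")
    exposure = miles / 100_000  # exposure in units of 100k miles
    return (prior_alpha + crashes) / (prior_beta + exposure)

# Hypothetical prior for one size band: mean 2 / 10 = 0.2 crashes per 100k miles.
ALPHA, BETA = 2.0, 10.0

# Small carrier: 1 crash in 150,000 miles (observed rate 0.667).
small = eb_adjusted_rate(1, 150_000, ALPHA, BETA)
# Large carrier: 200 crashes in 150,000,000 miles (observed rate 0.133).
large = eb_adjusted_rate(200, 150_000_000, ALPHA, BETA)

print(round(small, 3))  # 0.261: pulled strongly toward the 0.2 peer mean
print(round(large, 3))  # 0.134: almost unchanged from the observed rate
```

With little exposure the prior dominates, so one crash moves a small carrier only part of the way from the peer mean; with heavy exposure the observed rate dominates.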
2. Behavioral Violations
Strong Discriminator → 18% Weight
Driver decision violations — speeding, reckless driving, HOS violations, drug and alcohol offenses — are the strongest leading indicator among violation types. These capture the culture and discipline of a carrier's operation before crashes actually happen.
Risk-Weighted Violation Types:
Cumulative Behavioral Dose-Response (EB-Adjusted):
% of carriers with any Year 2 crash, by number of distinct behavioral violation types in Year 1. EB RR is size-band normalized.
Why 18%? Behavioral violations are the best leading indicator — they reveal risk before crashes happen. A carrier whose drivers regularly speed or drive fatigued is fundamentally higher risk. Each violation type is weighted by its empirically-measured relative risk contribution from the Violation Types study.
3. Equipment Violations
Maintenance Signal → 14% Weight
Equipment violations — brakes, tires, lighting, cargo securement — reflect a carrier's maintenance standards and operational discipline. Unlike behavioral violations (driver choices), equipment condition reveals systemic quality.
Why 14%? Equipment violations are a moderate but consistent predictor. A carrier that regularly fails brake inspections has systemic maintenance issues. Equipment violations are less predictive than behavioral violations (behavioral violations are about 23% more predictive among mature carriers) but remain an important independent signal of operational quality.
4. Severity Indicators
Extreme Risk Flags → 12% Weight
The severity component captures the most dangerous violations — those with FMCSA severity weight ≥ 7. These critical violations are EB-adjusted per 100k miles within fleet-size bands, just like the other components. Carriers with severe violations naturally produce elevated peer indices through the weighted composite.
Critical Behavioral Flags:
These violations are flagged on carrier records and feed into the severity component's EB-adjusted rate. The FRED pipeline handles them through the 4-component peer-relative model rather than hard score caps:
- Counted in the severe component: carriers with these violations have significantly elevated peer indices
- Substance violations contribute to both the behavioral and severe components, so the double signal amplifies their impact
- Carriers with 10+ speeding violations show a 98.4% crash probability, which naturally produces an elevated peer index via EB-adjusted rates
Why 12%? Severity captures tail risk that other components might miss. A carrier with critical violations (severity weight ≥ 7) represents a qualitatively different risk profile. The EB-adjusted severe rate ensures this signal is reliable across fleet sizes while avoiding disproportionate penalization of small carriers for isolated incidents.
Supporting Evidence: The Carrier Age Effect
Why Empirical Bayes Matters
Carrier age shows a moderate risk gradient (about 1.37× from the highest-rate age group to the lowest), which is captured through EB shrinkage rather than a separate weight.
Crash Rate by Years in Operation
Carrier age remains relevant, but the effect is more modest in this cohort: the highest rates are among the newest carriers, and the lowest rates are in the 10-19 year range after normalizing for fleet size and exposure.
Rather than giving age its own weight in the formula, our Empirical Bayes shrinkage naturally handles this effect. New carriers with limited data get pulled toward their peer-group average (which includes the age-related risk), while mature carriers with extensive records keep their observed rates. This is more principled than adding experience as an arbitrary weighted factor.
FMCSA's BASIC Score Problem: BASIC treats all carriers equally regardless of age. A 20-year carrier with 2 crashes gets the same treatment as a 1-year carrier with 2 crashes. When we tested BASIC scores as a predictor, they showed inverse correlation (RR=0.17×) — carriers with better BASIC scores had higher crash rates, likely because established carriers accumulate more inspection history. We don't use BASIC in our model.
Supporting Evidence: The Fleet Size Effect
Small Carriers Have Higher Crash Rates Across All Ages
The crash-rate-by-age chart shows a consistent pattern: smaller fleets have higher crash rates regardless of experience. This is why we normalize per 100k miles and apply Empirical Bayes adjustment by fleet-size peer group: doing so controls for the effect rather than penalizing small carriers arbitrarily.
Crash Rates by Fleet Size (1-2 Year Carriers)
Small carriers in their first two years have crash rates 4.3× higher than enterprise carriers of the same age.
How Our Model Handles This
Empirical Bayes by size band: Each fleet-size peer group (small, medium, large, enterprise) has its own prior. A small carrier's rate is shrunk toward the small-carrier average, not the overall fleet average.
Peer-relative grading: Our safety grades compare carriers to others of the same size. A Peer Index of 1.0 means "exactly average for your size band." This prevents small carriers from being automatically graded worse simply for being small.
Size-band expected loss: Expected crash events are computed using the size-band baseline crash rate scaled by the carrier's peer index, so a given score translates to the same risk meaning regardless of fleet size.
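The size-band expected-loss calculation can be sketched as baseline rate × peer index × exposure. The baseline rates below are hypothetical placeholders, not figures from the study, and the function name is illustrative; only the band names and the structure of the calculation come from the description above.

```python
# Hypothetical size-band baseline crash rates per 100k miles (not study values).
BASELINE_RATE_PER_100K = {
    "small": 0.40,
    "medium": 0.30,
    "large": 0.25,
    "enterprise": 0.20,
}

def expected_crashes(size_band, peer_index, annual_miles):
    """Expected crash events for one year of exposure.

    The size-band baseline rate is scaled by the carrier's peer index,
    then multiplied by exposure in units of 100k miles.
    """
    if size_band not in BASELINE_RATE_PER_100K:
        raise KeyError(f"unknown size band: {size_band}")
    baseline = BASELINE_RATE_PER_100K[size_band]
    return baseline * peer_index * annual_miles / 100_000

# An exactly average large carrier (peer index 1.0) driving 400,000 miles.
print(round(expected_crashes("large", 1.0, 400_000), 2))  # 1.0
# The same carrier at 1.5x its peer average.
print(round(expected_crashes("large", 1.5, 400_000), 2))  # 1.5
```

Because the baseline is specific to the band, a peer index of 1.5 always means "50% more expected crashes than a typical carrier of this size", whatever the band's absolute rate.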
What Predicts Risk for Mature Carriers?
The Predictive Hierarchy Shifts After 10 Years
Once carriers reach 10+ years, operational performance metrics dominate. Crash history becomes the primary differentiator, while behavioral violations remain the best leading indicator of emerging risk.
Behavioral vs Equipment Violations (10+ Year Carriers)
When normalized per 100,000 miles to control for fleet size, behavioral violations emerge as the stronger predictor:
- Behavioral violations (speeding, reckless driving, HOS, drugs/alcohol): worst quintile 0.381 crashes/100k mi
- Equipment violations (brakes, tires, lights, cargo securement): worst quintile 0.312 crashes/100k mi
Key insight: For mature carriers, behavioral violations are 23% more predictive than equipment violations. This validates our model's separation of behavioral (18%) and equipment (14%) as distinct components rather than lumping all violations together.
Implications for Insurance Underwriting
For new carriers (<10 years): The EB shrinkage toward peer-group priors provides natural conservatism. Limited data means the score reflects the peer average more than individual history.
For mature carriers (10+ years): Crash history carries the most weight because these carriers have enough data for reliable rate estimation. Behavioral violations provide the best early warning of deteriorating safety culture before crashes materialize.
Grade Distribution
Grades are assigned by peer index — a weighted composite of how each safety component (crash, behavioral, equipment, severity) compares to same-size peers. A peer index of 1.0 means the carrier performs exactly at its peer average; below 1.0 is better, above is worse. This peer-relative approach ensures small and large fleets are compared fairly.
The study population contains 242,969 graded carriers. The distribution peaks at Excellent (46.1%), with a right-skewed tail reflecting that most carriers have zero or very few crashes. Each grade card shows the Year 2 crash rate, defined as the percentage of carriers in that grade who experienced at least one crash in 2025. By this measure, Critical carriers were 1.93× as likely to crash as Excellent carriers (18.7% vs. 9.7%).
Key Takeaways
1. Crash History Is the Dominant Predictor (56%)
Past crashes predict future crashes more reliably than any other signal. Empirical Bayes shrinkage makes this robust for all fleet sizes — small carriers aren't penalized by statistical noise, and large carriers keep their reliable observed rates.
2. Behavioral Violations Are the Best Leading Indicator (18%)
Driver decision violations (speeding, reckless driving, substances, HOS violations) are the strongest leading indicator of future crashes. They capture risk before it materializes. For mature carriers, behavioral violations are 23% more predictive than equipment violations.
3. Equipment Condition Reflects Systemic Quality (14%)
Brake failures, tire issues, and lighting problems signal maintenance standards and operational discipline. Less predictive than behavioral violations, but an independent and consistent signal.
4. Severity Captures Tail Risk (12%)
The most dangerous violations — reckless driving, substance offenses, extreme speeding — carry high severity weights that feed into the EB-adjusted severe component rate. Carriers with these violations naturally produce elevated peer indices through the 4-component model, catching qualitatively different risk that crash history alone might miss.
5. Fleet Size Is Controlled, Not Penalized
Small carriers (1-5 trucks) have crash rates 2-4× higher than enterprise carriers, but our peer-relative grading and size-band EB priors ensure carriers are compared to peers of similar size. A small carrier with clean records can still earn an Excellent grade.