Scene-aware and Social-aware Motion Prediction for Autonomous Driving
Motion prediction models convert between position, velocity, and acceleration using ballistic integration. The problem: deriving acceleration from two different formulas gives two different answers. A linear regression approach closes that gap by a factor of 100 million.
The Problem
Self-driving cars must predict how surrounding vehicles will move. Neural networks handle this prediction, but they rely on discrete integration to convert between position, velocity, and acceleration at each timestep.
Standard ballistic integration uses two formulas: one updates position from velocity and acceleration, the other updates velocity from acceleration. Both formulas should agree on the acceleration value. They don't. When you solve each formula for acceleration independently, you get different results. This mathematical inconsistency propagates through every prediction the network makes.
Before this inconsistency can be addressed, the training data itself needs work. Raw drone-captured trajectory datasets contain thousands of vehicles, most traveling in straight lines with no meaningful interactions. The models need filtered examples of social driving behavior: merging, yielding, lane changing, overtaking.
The Approach
Three drone-recorded traffic datasets from RWTH Aachen provided raw vehicle trajectories: inD (intersections), exiD (highway exits and entries), and rounD (roundabouts). These datasets capture real German traffic from an overhead perspective, but most recorded vehicles simply drive straight without interacting.
The filtering module processes these trajectories in three stages. First, preprocessing normalizes vehicle positions to a common reference frame. Then, micro-behavior detection identifies atomic driving actions: speed adjustments, hard braking events, same-lane proximity, and distance stability patterns. Finally, macro-behavior synthesis combines these signals to classify high-level interactions: entering, exiting, lane changing, overtaking, yielding, merging, and speed adjustment. Each classification uses threshold-based conditions on velocity changes and relative distances between vehicle pairs.
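To make the micro-to-macro synthesis concrete, here is a minimal sketch of threshold-based detection for one macro behavior. The sampling interval, threshold values, and the specific yielding rule are illustrative assumptions for demonstration, not the module's actual parameters.

```python
import numpy as np

# Assumed constants for illustration only (not the module's real values).
DT = 0.5             # assumed sampling interval, s
HARD_BRAKE = -2.5    # assumed deceleration threshold, m/s^2
PROXIMITY = 15.0     # assumed same-lane gap threshold, m

def micro_behaviors(v_ego, gap_to_lead):
    """Flag atomic per-timestep events from ego speed and lead-vehicle gap."""
    accel = np.diff(v_ego) / DT
    hard_brake = accel < HARD_BRAKE             # hard braking event
    close_follow = gap_to_lead[1:] < PROXIMITY  # same-lane proximity
    return hard_brake, close_follow

def is_yielding(v_ego, gap_to_lead):
    """Macro synthesis: yielding = hard braking while close behind a lead."""
    hard_brake, close_follow = micro_behaviors(v_ego, gap_to_lead)
    return bool(np.any(hard_brake & close_follow))

# Toy vehicle pair: the ego brakes sharply as the gap to the lead closes.
v_ego = np.array([15.0, 15.0, 14.0, 12.0, 11.0, 10.5])
gap = np.array([30.0, 25.0, 20.0, 14.0, 12.0, 11.0])
print(is_yielding(v_ego, gap))   # True
```

The same pattern extends to the other macro behaviors: each is a boolean combination of micro-level flags over a vehicle pair's aligned time series, which is what makes the pipeline interpretable.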
Standard ballistic integration computes the next position as s(k+1) = s(k) + dt·v(k) + (dt²/2)·a(k), and the next velocity as v(k+1) = v(k) + dt·a(k). Solving each formula for a(k) independently yields two different values whenever acceleration varies within a timestep, as it always does in real traffic. This is the inconsistency at the core of the problem.
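The disagreement is easy to reproduce. The sketch below uses a synthetic trajectory with time-varying acceleration (illustrative data, not the paper's datasets) and solves each ballistic formula for a(k):

```python
import numpy as np

# Synthetic trajectory with non-constant acceleration a(t) = sin(t),
# integrated exactly so any disagreement comes from the formulas themselves.
dt = 0.1
t = np.arange(0.0, 5.0, dt)
a = np.sin(t)            # ground-truth acceleration
v = 1.0 - np.cos(t)      # exact integral of a, with v(0) = 0
s = t - np.sin(t)        # exact integral of v, with s(0) = 0

# Solve each ballistic formula for a(k) independently.
a_from_s = 2.0 * (s[1:] - s[:-1] - dt * v[:-1]) / dt**2
a_from_v = (v[1:] - v[:-1]) / dt

# The two estimates disagree because a(t) is not constant within a step.
mse = np.mean((a_from_s - a_from_v) ** 2)
print(f"acceleration disagreement MSE: {mse:.2e}")
```

Even on noise-free data the two rearrangements diverge; on real measured trajectories the gap is far larger.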
The replacement: two linear regression models, each trained to predict acceleration. The distance model derives acceleration from position changes. The velocity model derives acceleration from velocity changes. The key insight is that both models train on the same ground-truth acceleration data, forcing them to produce matching values.
The learned coefficients are then rearranged into consistent distance and velocity prediction formulas. Because both formulas were derived from the same acceleration training, they remain mathematically aligned, unlike their ballistic integration counterparts.
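The mechanism can be sketched with plain least squares. The feature choices below are illustrative assumptions rather than the exact design used with the drone datasets, and the sketch demonstrates the shared-target training and coefficient rearrangement, not the reported numbers:

```python
import numpy as np

# Same synthetic trajectory idea: a(t) = sin(t), integrated exactly.
dt = 0.1
t = np.arange(0.0, 50.0, dt)
a = np.sin(t)            # ground-truth acceleration (the SHARED target)
v = 1.0 - np.cos(t)      # exact integral of a
s = t - np.sin(t)        # exact integral of v

x_dist = s[1:] - s[:-1] - dt * v[:-1]   # position-change feature
x_vel = v[1:] - v[:-1]                  # velocity-change feature
y = a[:-1]                              # same acceleration target for both

# Fit slope and intercept for each model against the same target.
cd = np.polyfit(x_dist, y, 1)           # distance-model coefficients
cv = np.polyfit(x_vel, y, 1)            # velocity-model coefficients

a_dist = np.polyval(cd, x_dist)
a_vel = np.polyval(cv, x_vel)
equiv_mse = np.mean((a_dist - a_vel) ** 2)
print(f"acceleration equivalence MSE: {equiv_mse:.2e}")

# Rearranging the fitted distance model, a = cd[0]*x + cd[1], gives the
# consistent one-step formula: s(k+1) = s(k) + dt*v(k) + (a(k) - cd[1]) / cd[0].
```

Because both fits regress toward the same target, their coefficients stay aligned by construction, which is the property the ballistic rearrangements lack.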
Results
Both linear models achieve near-identical acceleration prediction, with R-squared values of 0.9892. But the real test is acceleration equivalence: do the two formulas agree on the same acceleration?
| Method | MSE | MAE |
|---|---|---|
| Linear Model | 1.81 × 10⁻⁶ | 7.51 × 10⁻⁴ |
| Ballistic Integration | 2.07 × 10² | 5.02 × 10⁻¹ |
Acceleration equivalence: MSE between accelerations derived from the distance formula vs. the velocity formula. Lower is better.
The linear model's acceleration equivalence MSE is 0.00000181. Ballistic integration's is 206.98. That is an improvement by a factor of roughly 1.1 × 10⁸, reducing the mathematical inconsistency between the two formulas to near zero.
Figure: acceleration equivalence MSE by method (log scale).
Acceleration prediction (individual models)
| Model | MSE | MAE | R² |
|---|---|---|---|
| Distance Model | 3.54 × 10⁻³ | 1.19 × 10⁻² | 0.9892 |
| Velocity Model | 3.54 × 10⁻³ | 1.19 × 10⁻² | 0.9892 |
Both models achieve nearly identical performance, confirming that the shared training produces consistent acceleration estimates.
Distance and velocity prediction
| Formula | MSE | MAE | R² |
|---|---|---|---|
| Distance | 13.867 | 3.724 | 0.989 |
| Velocity | 2.264 | 1.505 | 0.934 |
Key Findings
- Training two linear models on the same acceleration data forces mathematical consistency between position-based and velocity-based predictions, eliminating the fundamental flaw in ballistic integration.
- The linear regression approach reduces acceleration equivalence error by eight orders of magnitude (MSE from 206.98 to 1.81 × 10⁻⁶), while maintaining prediction R-squared values above 0.93.
- Filtering raw trajectory data to isolate meaningful social interactions (merging, yielding, overtaking) is essential for training motion prediction models on real-world driving behavior.
- Threshold-based micro-behavior detection, combined into macro-behavior synthesis, provides an interpretable and effective pipeline for extracting social driving scenarios from drone-captured datasets.
- The approach uses only linear regression, avoiding the complexity of deep learning for the integration step itself. The simplicity of the method is part of its strength: fewer parameters, fewer failure modes, and a clear mathematical guarantee of consistency.