Scene-aware and Social-aware Motion Prediction for Autonomous Driving
Motion prediction models convert between position, velocity, and acceleration using ballistic integration. The problem: deriving acceleration from two different formulas gives two different answers. A linear regression approach closes that gap by a factor of 100 million.
The Problem
Self-driving cars must predict how surrounding vehicles will move. Neural networks handle this prediction, but they rely on discrete integration to convert between position, velocity, and acceleration at each timestep.
Standard ballistic integration uses two formulas: one updates position from velocity and acceleration, the other updates velocity from acceleration. Both formulas should agree on the acceleration value. They don't. When you solve each formula for acceleration independently, you get different results. This mathematical inconsistency propagates through every prediction the network makes.
Before this inconsistency can be addressed, the training data itself needs work. Raw drone-captured trajectory datasets contain thousands of vehicles, most traveling in straight lines with no meaningful interactions. The models need filtered examples of social driving behavior: merging, yielding, lane changing, overtaking.
The Approach
Three drone-recorded traffic datasets from RWTH Aachen provided raw vehicle trajectories: inD (intersections), exiD (highway exits and entries), and rounD (roundabouts). These datasets capture real German traffic from an overhead perspective, but most recorded vehicles simply drive straight without interacting.
The filtering module processes these trajectories in three stages. First, preprocessing normalizes vehicle positions to a common reference frame. Then, micro-behavior detection identifies atomic driving actions: speed adjustments, hard braking events, same-lane proximity, and distance stability patterns. Finally, macro-behavior synthesis combines these signals to classify high-level interactions: entering, exiting, lane changing, overtaking, yielding, merging, and speed adjustment. Each classification uses threshold-based conditions on velocity changes and relative distances between vehicle pairs.
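To make the micro-to-macro synthesis concrete, here is a minimal sketch of threshold-based detection for one macro behavior. The sampling interval, threshold values, and the specific yielding rule are illustrative assumptions for demonstration, not the module's actual parameters.

```python
import numpy as np

# Assumed constants for illustration only (not the module's real values).
DT = 0.5             # assumed sampling interval, s
HARD_BRAKE = -2.5    # assumed deceleration threshold, m/s^2
PROXIMITY = 15.0     # assumed same-lane gap threshold, m

def micro_behaviors(v_ego, gap_to_lead):
    """Flag atomic per-timestep events from ego speed and lead-vehicle gap."""
    accel = np.diff(v_ego) / DT
    hard_brake = accel < HARD_BRAKE             # hard braking event
    close_follow = gap_to_lead[1:] < PROXIMITY  # same-lane proximity
    return hard_brake, close_follow

def is_yielding(v_ego, gap_to_lead):
    """Macro synthesis: yielding = hard braking while close behind a lead."""
    hard_brake, close_follow = micro_behaviors(v_ego, gap_to_lead)
    return bool(np.any(hard_brake & close_follow))

# Toy vehicle pair: the ego brakes sharply as the gap to the lead closes.
v_ego = np.array([15.0, 15.0, 14.0, 12.0, 11.0, 10.5])
gap = np.array([30.0, 25.0, 20.0, 14.0, 12.0, 11.0])
print(is_yielding(v_ego, gap))   # True
```

The same pattern extends to the other macro behaviors: each is a boolean combination of micro-level flags over a vehicle pair's aligned time series, which is what makes the pipeline interpretable.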
Standard ballistic integration computes the next position as s(k+1) = s(k) + dt·v(k) + (dt²/2)·a(k), and the next velocity as v(k+1) = v(k) + dt·a(k). Solving each formula for a(k) independently yields two different values whenever acceleration varies within a timestep, as it always does in real traffic. This is the inconsistency at the core of the problem.
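The disagreement is easy to reproduce. The sketch below uses a synthetic trajectory with time-varying acceleration (illustrative data, not the paper's datasets) and solves each ballistic formula for a(k):

```python
import numpy as np

# Synthetic trajectory with non-constant acceleration a(t) = sin(t),
# integrated exactly so any disagreement comes from the formulas themselves.
dt = 0.1
t = np.arange(0.0, 5.0, dt)
a = np.sin(t)            # ground-truth acceleration
v = 1.0 - np.cos(t)      # exact integral of a, with v(0) = 0
s = t - np.sin(t)        # exact integral of v, with s(0) = 0

# Solve each ballistic formula for a(k) independently.
a_from_s = 2.0 * (s[1:] - s[:-1] - dt * v[:-1]) / dt**2
a_from_v = (v[1:] - v[:-1]) / dt

# The two estimates disagree because a(t) is not constant within a step.
mse = np.mean((a_from_s - a_from_v) ** 2)
print(f"acceleration disagreement MSE: {mse:.2e}")
```

Even on noise-free data the two rearrangements diverge; on real measured trajectories the gap is far larger.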
The replacement: two linear regression models, each trained to predict acceleration. The distance model derives acceleration from position changes. The velocity model derives acceleration from velocity changes. The key insight is that both models train on the same ground-truth acceleration data, forcing them to produce matching values.
The learned coefficients are then rearranged into consistent distance and velocity prediction formulas. Because both formulas were derived from the same acceleration training, they remain mathematically aligned, unlike their ballistic integration counterparts.
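The mechanism can be sketched with plain least squares. The feature choices below are illustrative assumptions rather than the exact design used with the drone datasets, and the sketch demonstrates the shared-target training and coefficient rearrangement, not the reported numbers:

```python
import numpy as np

# Same synthetic trajectory idea: a(t) = sin(t), integrated exactly.
dt = 0.1
t = np.arange(0.0, 50.0, dt)
a = np.sin(t)            # ground-truth acceleration (the SHARED target)
v = 1.0 - np.cos(t)      # exact integral of a
s = t - np.sin(t)        # exact integral of v

x_dist = s[1:] - s[:-1] - dt * v[:-1]   # position-change feature
x_vel = v[1:] - v[:-1]                  # velocity-change feature
y = a[:-1]                              # same acceleration target for both

# Fit slope and intercept for each model against the same target.
cd = np.polyfit(x_dist, y, 1)           # distance-model coefficients
cv = np.polyfit(x_vel, y, 1)            # velocity-model coefficients

a_dist = np.polyval(cd, x_dist)
a_vel = np.polyval(cv, x_vel)
equiv_mse = np.mean((a_dist - a_vel) ** 2)
print(f"acceleration equivalence MSE: {equiv_mse:.2e}")

# Rearranging the fitted distance model, a = cd[0]*x + cd[1], gives the
# consistent one-step formula: s(k+1) = s(k) + dt*v(k) + (a(k) - cd[1]) / cd[0].
```

Because both fits regress toward the same target, their coefficients stay aligned by construction, which is the property the ballistic rearrangements lack.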
Results
Both linear models achieve near-identical acceleration prediction, with R-squared values of 0.9892. But the real test is acceleration equivalence: do the two formulas agree on the same acceleration?
| Method | MSE | MAE |
|---|---|---|
| Linear Model | 1.81 × 10⁻⁶ | 7.51 × 10⁻⁴ |
| Ballistic Integration | 2.07 × 10² | 5.02 × 10⁻¹ |
Acceleration equivalence: MSE between accelerations derived from the distance formula vs. the velocity formula. Lower is better.
The linear model's acceleration equivalence MSE is 0.00000181. Ballistic integration's is 206.98. That is an improvement by a factor of roughly 1.1 × 10⁸, reducing the mathematical inconsistency between the two formulas to near zero.
Figure: acceleration equivalence MSE by method (log scale).
Acceleration prediction (individual models)
| Model | MSE | MAE | R² |
|---|---|---|---|
| Distance Model | 3.54 × 10⁻³ | 1.19 × 10⁻² | 0.9892 |
| Velocity Model | 3.54 × 10⁻³ | 1.19 × 10⁻² | 0.9892 |
Both models achieve nearly identical performance, confirming that the shared training produces consistent acceleration estimates.
Distance and velocity prediction
| Formula | MSE | MAE | R² |
|---|---|---|---|
| Distance | 13.867 | 3.724 | 0.989 |
| Velocity | 2.264 | 1.505 | 0.934 |
Key Findings
- Training two linear models on the same acceleration data forces mathematical consistency between position-based and velocity-based predictions, eliminating the fundamental flaw in ballistic integration.
- The linear regression approach reduces acceleration equivalence error by eight orders of magnitude (MSE from 206.98 to 1.81 × 10⁻⁶), while maintaining prediction R-squared values above 0.93.
- Filtering raw trajectory data to isolate meaningful social interactions (merging, yielding, overtaking) is essential for training motion prediction models on real-world driving behavior.
- Threshold-based micro-behavior detection, combined into macro-behavior synthesis, provides an interpretable and effective pipeline for extracting social driving scenarios from drone-captured datasets.
- The approach uses only linear regression, avoiding the complexity of deep learning for the integration step itself. The simplicity of the method is part of its strength: fewer parameters, fewer failure modes, and a clear mathematical guarantee of consistency.