Point Cloud Registration using Graph Neural Networks
A six-month survey of how graph neural networks are reshaping point cloud registration for autonomous vehicle localization, mapping the landscape from classical ICP to modern deep learning approaches across 50+ papers and 7 benchmark datasets.
The Question
Self-driving cars need to know exactly where they are. GPS alone is not precise enough. The standard approach: take a 3D LiDAR scan of the surroundings and align it against a pre-built HD map. This alignment is point cloud registration, a fundamental problem in computer vision: finding the rigid transformation (rotation and translation) that maps one 3D point cloud onto another.
Classical methods like Iterative Closest Point (ICP) work, but have a fundamental weakness: they get stuck in local minima and require a good initial alignment. GPS typically supplies that starting guess, and when the guess is off, as it often is in urban canyons, ICP fails. The question becomes whether graph neural networks offer a better path. By modeling point clouds as graphs and learning both vertex features and edge relationships, GNNs can capture geometric structure that purely point-based or iterative methods miss.
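To make that failure mode concrete, the core ICP loop is short. Below is a minimal numpy sketch (function names and parameters are illustrative, not from any surveyed implementation): alternate brute-force nearest-neighbour matching with the closed-form Kabsch update until the error stops improving.

```python
import numpy as np

def kabsch(src, dst):
    """Closed-form rigid transform (R, t) minimizing ||R @ src_i + t - dst_i||."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)          # cross-covariance of centered clouds
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T)) # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=50, tol=1e-8):
    """Vanilla ICP: alternate nearest-neighbour matching and Kabsch updates."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur, prev_err = src.copy(), np.inf
    for _ in range(iters):
        # brute-force nearest neighbours (a KD-tree would be used in practice)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = d2.argmin(axis=1)
        R, t = kabsch(cur, dst[nn])
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        err = d2[np.arange(len(cur)), nn].mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total
```

Because the correspondences are recomputed from whatever the current alignment happens to be, a poor initial guess locks the loop onto wrong matches, which is exactly the local-minimum failure described above.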
The Approach
Reviewed 50+ papers to build a comprehensive taxonomy of point cloud registration methods. The field splits into two broad categories: same-source registration (both point clouds from the same sensor type) and cross-source registration (different sensors or modalities). Same-source methods further divide into optimization-based approaches (ICP, graph matching, Gaussian mixture models, semi-definite programming), feature-learning methods (volumetric and point cloud descriptors), and end-to-end learning (regression and neural-network-guided optimization).
Key methods analyzed in depth: ICP and its variants, Coherent Point Drift (CPD), graph matching approaches (RGM), Gaussian mixture models (DeepGMR), Deep Global Registration (DGR), and correspondence-less methods (DeepCLR). Each represents a distinct philosophy for solving the alignment problem, from iterative point matching to probabilistic distribution alignment to learned geometric descriptors.
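The probabilistic family (CPD, DeepGMR) replaces hard matches with distribution alignment. The toy sketch below illustrates that idea rather than either published algorithm: Gaussian responsibilities act as soft correspondences, a weighted Kabsch step updates the pose, and annealing the bandwidth gradually hardens the assignment (all names, parameters, and the annealing schedule are assumptions).

```python
import numpy as np

def soft_assign(src, dst, sigma):
    """E-step: Gaussian responsibility of each dst point for each src point."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))
    return w / w.sum(axis=1, keepdims=True)   # rows sum to 1

def soft_kabsch(src, dst, P):
    """M-step: rigid transform aligning src to its soft targets P @ dst."""
    tgt = P @ dst                             # expected correspondence per point
    cs, ct = src.mean(axis=0), tgt.mean(axis=0)
    H = (src - cs).T @ (tgt - ct)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, ct - R @ cs

def gmm_align(src, dst, iters=40, sigma=0.3, anneal=0.85):
    """Alternate soft assignment and pose update while shrinking the bandwidth."""
    R_tot, t_tot, cur = np.eye(3), np.zeros(3), src.copy()
    for _ in range(iters):
        P = soft_assign(cur, dst, sigma)
        R, t = soft_kabsch(cur, dst, P)
        cur = cur @ R.T + t
        R_tot, t_tot = R @ R_tot, R @ t_tot + t
        sigma = max(sigma * anneal, 0.02)     # floor keeps weights well-defined
    return R_tot, t_tot
```

Averaging over all plausible matches instead of committing to the nearest point is what buys these methods their robustness to noise and outliers, at the computational cost noted in the results table.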
The second phase narrowed focus to how point cloud registration applies to vehicle localization in autonomous driving. The driving pipeline has distinct stages: sensing, perception, prediction, planning, and control, with map creation and localization running as parallel components. For localization specifically, the available sensor modalities include cameras, IMUs (prone to drift over time), GNSS (requires clear sky, limited in urban canyons), and LiDAR (most precise for 3D spatial data). Practical systems combine multiple sources: visual-inertial, visual-laser, and radar-inertial fusion.
Analyzed seven major autonomous driving benchmark datasets across their sensor coverage, annotation quality, and adoption. Cityscapes leads in total citations (7,840+), but KITTI remains the standard benchmark for point cloud registration and odometry tasks. Across the datasets surveyed, 79% include point cloud maps and 71% include odometry information, confirming that LiDAR-based 3D data is central to the field. The top-performing methods on the KITTI odometry leaderboard (MULLS, TVL-SLAM+, CT-ICP) all operate on Velodyne LiDAR data.
The final phase involved working with the Robot Operating System (ROS) in the context of motion planning. Learned ROS from scratch to gain practical experience with the middleware that connects perception, localization, and planning in real robotic systems. This grounded the theoretical survey in the engineering realities of deploying point cloud registration on actual autonomous platforms.
Results
Localization methods comparison
| Method | Approach | Key Limitation |
|---|---|---|
| ICP | Iterative closest point matching | Local minima, needs good initialization |
| CPD | Probabilistic point drift | High computational cost |
| Graph Matching | Vertex and edge correspondences | Optimization complexity |
| DeepGMR | Gaussian mixture alignment | Fixed number of Gaussians |
| DGR | Learned geometric descriptors | Training data dependency |
| DeepCLR | Correspondence-less end-to-end | Architecture complexity |
Methods span the spectrum from classical optimization to fully learned approaches. ICP remains widely used despite its initialization sensitivity.
Benchmark dataset citations
Dataset feature coverage
| Feature | Datasets With | Datasets Without |
|---|---|---|
| Point Cloud Maps | 79% | 21% |
| Odometry | 71% | 29% |
Across seven major autonomous driving datasets. The high prevalence of point cloud data confirms LiDAR as the primary modality for 3D localization research.
Autonomous driving localization pipeline
Key Findings
- Classical ICP methods remain widely used but are fundamentally limited by sensitivity to initial alignment, precisely the scenario where GPS-based initialization is imprecise and urban environments create signal occlusion.
- Graph neural networks capture both vertex features and edge relationships in point clouds, offering structural understanding that purely point-based methods miss. This graph-level reasoning is well-suited to the inherently sparse, irregular geometry of LiDAR scans.
- Feature-learning approaches represent the most active research direction: DGR extracts local geometric descriptors and matches them into correspondences before estimating the rigid transformation, while DeepCLR skips explicit correspondences and regresses the transformation end to end. Decoupling feature extraction from pose estimation improves modularity and generalization.
- GMM-based methods like DeepGMR provide noise and outlier robustness by aligning probability distributions rather than individual points, an important property for real-world LiDAR data with occlusions and varying point densities.
- KITTI remains the dominant benchmark for vehicle localization, but Cityscapes leads in total citations (7,840+). Across the seven major datasets surveyed, 79% include point cloud maps, confirming LiDAR as the central modality for 3D localization research.
- The best-performing methods on the KITTI odometry leaderboard (MULLS, TVL-SLAM+, CT-ICP) all use Velodyne LiDAR data, reinforcing that precise 3D localization in autonomous driving is currently a LiDAR-first problem.
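For the GNN methods discussed above, the usual first step is turning a raw scan into a graph. Below is a hedged sketch of k-NN graph construction with relative-displacement edge features, the typical input representation; the function name and the choice of edge feature are illustrative, not taken from any surveyed method.

```python
import numpy as np

def knn_graph(points, k=8):
    """Build a directed k-nearest-neighbour graph over a point cloud.

    Returns (edges, edge_feats): edges is an (N*k, 2) array of [src, dst]
    index pairs; edge_feats holds the relative displacement vector per edge,
    the kind of geometric edge attribute a GNN aggregates alongside vertex
    features.
    """
    n = len(points)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)            # exclude self-loops
    nbrs = np.argsort(d2, axis=1)[:, :k]    # k nearest neighbours per point
    src = np.repeat(np.arange(n), k)
    dst = nbrs.ravel()
    edges = np.stack([src, dst], axis=1)
    edge_feats = points[dst] - points[src]  # relative geometry per edge
    return edges, edge_feats
```

In practice a KD-tree replaces the O(N²) distance matrix, and edge features are often enriched with distances or estimated normals, but the graph structure itself is this simple: sparse, irregular, and local, matching the geometry of a LiDAR scan.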